llvm-project

Author	SHA1	Message	Date
Heejin Ahn	a8e1135baa	[WebAssembly] Add -wasm-use-legacy-eh option (#122158 ) This replaces the existing `-wasm-enable-exnref` with `-wasm-use-legacy-eh` option, in an effort to make the new standardized exnref proposal the 'default' state, so that the legacy proposal needs to be separately enabled by an option. But given that most users haven't switched to the new proposal and major web browsers haven't turned it on by default, this `-wasm-use-legacy-eh` is turned on by default, so nothing will change for now from the functionality perspective. This also removes the restriction that `-wasm-enable-exnref` be only used with `-wasm-enable-eh` because this option is enabled by default. This option does not have any effect when `-wasm-enable-eh` is not used.	2025-01-09 22:36:10 -08:00
Dan Gohman	c5ab70c508	[WebAssembly] Add `-i128:128` to the `datalayout` string. (#119204 ) Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM `datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is currently using the defaults for both, so it's inconsistent. Fix this by adding `-i128:128` to Wasm's `datalayout` string so that it aligns `i128` to 16 bytes too. This is similar to [llvm/llvm-project@dbad963](`dbad963a69`) for SPARC. This fixes rust-lang/rust#133991; see that issue for further discussion. [defaults to aligning `__int128_t` to 16 bytes]: `f8b4182f07/clang/lib/Basic/TargetInfo.cpp (L77)` [default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout	2024-12-10 09:21:58 -08:00
Dan Gohman	e665e781dc	[SelectionDAG] Use the nuw flag when expanding loads. (#119288 ) When expanding a load into two loads, use nuw for the add that computes the offset from the base of the second load, because the original load doesn't straddle the address space. It turns out there's already a dedicated helper function for doing this, `getObjectPtrOffset`. This is in target-independent code; however, in practice it only seems to affect WebAssembly code, because WebAssembly load and store instructions' constant offsets don't perform wrapping, so constant folding often depends on the nuw flag being present. This was noticed in the development of #119204.	2024-12-10 06:28:09 -08:00
Dan Gohman	35cce408ee	[WebAssembly] Support the new "Lime1" CPU (#112035 ) This adds WebAssembly support for the new [Lime1 CPU]. First, this defines some new target features. These are subsets of existing features that reflect implementation concerns: - "call-indirect-overlong" - implied by "reference-types"; just the overlong encoding for the `call_indirect` immediate, and not the actual reference types. - "bulk-memory-opt" - implied by "bulk-memory": just `memory.copy` and `memory.fill`, and not the other instructions in the bulk-memory proposal. Next, this defines a new target CPU, "lime1", which enables mutable-globals, bulk-memory-opt, multivalue, sign-ext, nontrapping-fptoint, extended-const, and call-indirect-overlong. Unlike the default "generic" CPU, "lime1" is meant to be frozen, and followed up by "lime2" and so on when new features are desired. [Lime1 CPU]: https://github.com/WebAssembly/tool-conventions/blob/main/Lime.md#lime1 --------- Co-authored-by: Heejin Ahn <aheejin@gmail.com>	2024-12-03 16:35:23 -08:00
Dan Gohman	c3536b263f	[WebAssembly] Define call-indirect-overlong and bulk-memory-opt features (#117087 ) This defines some new target features. These are subsets of existing features that reflect implementation concerns: - "call-indirect-overlong" - implied by "reference-types"; just the overlong encoding for the `call_indirect` immediate, and not the actual reference types. - "bulk-memory-opt" - implied by "bulk-memory": just `memory.copy` and `memory.fill`, and not the other instructions in the bulk-memory proposal. This is split out from https://github.com/llvm/llvm-project/pull/112035. --------- Co-authored-by: Heejin Ahn <aheejin@gmail.com>	2024-12-02 17:08:07 -08:00
Sam Clegg	ea58410d0f	[WebAssembly] Implement %llvm.thread.pointer intrinsic (#117817 ) We can simply use the `__tls_base` global for this which is guaranteed to be non-zero and unique per thread. Fixes: #117433	2024-11-26 17:19:14 -08:00
Heejin Ahn	492812f613	[WebAssembly] Fix rethrow's index calculation (#114693 ) So far we have assumed that we only rethrow the exception caught in the innermost EH pad. This is true in code we directly generate, but after inlining this may not be the case. For example, consider this code: ```ll ehcleanup: %0 = cleanuppad ... call @destructor cleanupret from %0 unwind label %catch.dispatch ``` If `destructor` gets inlined into this function, the code can be like ```ll ehcleanup: %0 = cleanuppad ... invoke @throwing_func to label %unreachable unwind label %catch.dispatch.i catch.dispatch.i: catchswitch ... [ label %catch.start.i ] catch.start.i: %1 = catchpad ... invoke @some_function to label %invoke.cont.i unwind label %terminate.i invoke.cont.i: catchret from %1 to label %destructor.exit destructor.exit: cleanupret from %0 unwind label %catch.dispatch ``` We lower a `cleanupret` into `rethrow`, which assumes it rethrows the exception caught by the nearest dominating EH pad. But after the inlining, the nearest dominating EH pad is not `ehcleanup` but `catch.start.i`. The problem exists in the same manner in the new (exnref) EH, because it assumes the exception comes from the nearest EH pad and saves an exnref from that EH pad and rethrows it (using `throw_ref`). This problem can be fixed easily if `cleanupret` has the basic block where its matching `cleanuppad` is. The bitcode instruction `cleanupret` kind of has that info (it has a token from the `cleanuppad`), but that info is lost when we enter ISel, because `TargetSelectionDAG.td`'s `cleanupret` node does not have any arguments: `5091a359d9/llvm/include/llvm/Target/TargetSelectionDAG.td (L700)` Note that `catchret` already has two basic block arguments, even though neither of them means `catchpad`'s BB. This PR adds the `cleanuppad`'s BB as an argument to `cleanupret` node in ISel and uses it in the Wasm backend. 
Because this node is also used in X86 backend we need to note its argument there too but nothing more needs to change there as long as X86 doesn't need it. --- - Details about changes in the Wasm backend: After this PR, our pseudo `RETHROW` instruction takes a BB, which means the EH pad whose exception it needs to rethrow. There are currently two ways to generate a `RETHROW`: one is from `llvm.wasm.rethrow` intrinsic and the other is from `CLEANUPRET` we discussed above. In case of `llvm.wasm.rethrow`, we add a '0' as a placeholder argument when it is lowered to a `RETHROW`, and change it to a BB in LateEHPrepare. As written in the comments, this PR doesn't change how this BB is computed. The BB argument will be converted to an immediate argument as with other control flow instructions in CFGStackify. In case of `CLEANUPRET`, it already has a BB argument pointing to an EH pad, so it is just converted to a `RETHROW` with the same BB argument in LateEHPrepare. This will also be lowered to an immediate in CFGStackify with other control flow instructions. --- Fixes #114600.	2024-11-05 21:45:13 -08:00
Heejin Ahn	380fd09d98	[WebAssembly] Fix unwind mismatches in new EH (#114361 ) This fixes unwind mismatches for the new EH spec. The main flow is similar to that of the legacy EH's unwind mismatch fixing. The new EH shares the `fixCallUnwindMismatches` and `fixCatchUnwindMismatches` functions, which gather the range of instructions whose unwind destination we need to fix, with the legacy EH. But unlike the legacy EH that uses `try`-`delegate`s to fix them, the new EH wraps those instructions with nested `try_table`-`end_try_table`s that jump to a "trampoline" BB, where we rethrow (using a `throw_ref`) the exception to the correct `try_table`. For a simple example of a call unwind mismatch, suppose `call foo` should unwind to the outer `try_table` but is wrapped in another `try_table` (not shown here): ```wast try_table ... call foo ;; Unwind mismatch. Should unwind to the outer try_table ... end_try_table ``` Then we wrap the call with a new nested `try_table`-`end_try_table`, add a `block` / `end_block` right inside the target `try_table`, and make the nested `try_table` jump to it using a `catch_all_ref` clause, and rethrow the exception using a `throw_ref`: ```wast try_table block $l0 exnref ... try_table (catch_all_ref $l0) call foo end_try_table ... end_block ;; Trampoline BB throw_ref end_try_table ``` --- This fixes two existing bugs. These are not easy to test independently without the unwind mismatch fixing. The first one is how we calculate `ScopeTops`. Turns out, we should do it in the same way as in the legacy EH even though there is no `end_try` at the end of `catch` block anymore. `nested_try` in `cfg-stackify-eh.ll` tests this case. The second bug is in `rewriteDepthImmediates`. `try_table`'s immediates should be computed without the `try_table` itself, meaning ```wast block try_table (catch ... 0) end_try_table end_block ``` Here 0 should target not `end_try_table` but `end_block`. 
This bug didn't crash the program because `placeTryTableMarker` generated only the simple form of `try_table` that has a single catch clause and an `end_block` follows right after the `end_try_table` in the same BB, so jumping to an `end_try_table` is the same as jumping to the `end_block`. But now we generate `catch` clauses with depths greater than 0 when fixing unwind mismatches, which uncovered this bug. --- One case that needs special treatment is when `end_loop` precedes an `end_try_table` within a BB and this BB is a (true) unwind destination when fixing unwind mismatches. In this case we need to split this `end_loop` into a predecessor BB. This case is tested in `unwind_mismatches_with_loop` in `cfg-stackify-eh.ll`. --- `cfg-stackify-eh.ll` contains mostly the same set of tests as the existing `cfg-stackify-eh-legacy.ll`, with updated FileCheck expectations. As in `cfg-stackify-eh-legacy.ll`, the FileCheck lines mostly only contain control flow instructions and calls for readability. - `nested_try` and `unwind_mismatches_with_loop` are added to test newly found bugs in the new EH. - Some tests in `cfg-stackify-eh-legacy.ll` about the legacy-EH-specific aspects have not been added to `cfg-stackify-eh.ll`. (`remove_unnecessary_instrs`, `remove_unnecessary_br`, `fix_function_end_return_type_with_try_catch`, and `branch_remapping_after_fixing_unwind_mismatches_0/1`)	2024-11-05 09:40:41 -08:00
Heejin Ahn	d1d3e4b273	[WebAssembly] Add/Reorder legacy EH tests (#114363 ) These tests are added to match the standard EH tests in #114361: - `nested_try` - `unwind_mismatches_with_loop` These tests are useful to test certain aspects of the new EH but I think they add more coverage to the legacy tests as well. And `unstackify_when_fixing_unwind_mismatch` and `unwind_mismatches_5` have not changed; they have been just moved. This also fixes some comments.	2024-11-04 16:07:52 -08:00
Dan Gohman	1bc2cd98c5	[WebAssembly] Enable nontrapping-fptoint and bulk-memory by default. (#112049 ) We were prepared to enable these features [back in February], but they got pulled for what appear to be unrelated reasons. So let's have another try at enabling them! Another motivation here is that it'd be convenient for the [Lime1 proposal] if "lime1" is close to a subset of "generic" (missing only extended-const). [back in February]: https://github.com/WebAssembly/tool-conventions/issues/158#issuecomment-1931119512 [Lime1 proposal]: https://github.com/llvm/llvm-project/pull/112035	2024-10-25 13:52:51 -07:00
Dan Gohman	118445841d	[WebAssembly] Protect memory.fill and memory.copy from zero-length ranges. (#112617 ) WebAssembly's `memory.fill` and `memory.copy` instructions trap if the pointers are out of bounds, even if the length is zero. This is different from LLVM, which expects that it can call `memcpy` on arbitrary invalid pointers if the length is zero. To avoid spurious traps, branch around `memory.fill` and `memory.copy` when the length is zero. --------- Co-authored-by: Heejin Ahn <aheejin@gmail.com>	2024-10-24 14:13:58 -07:00
Alex Crichton	c2293b33dd	[WebAssembly] Implement the wide-arithmetic proposal (#111598 ) This commit implements the [wide-arithmetic] proposal which has recently reached phase 2 in the WebAssembly proposals process. The goal here is to implement support in LLVM for emitting these instructions which are gated behind a new feature flag by default. A new `wide-arithmetic` feature flag is introduced which gates these four new instructions from being emitted. Emission of each instruction itself is relatively simple given LLVM's preexisting lowering rules and infrastructure. The main gotcha is that due to the multi-result nature of all of these instructions it needed the lowerings to be implemented in C++ rather than in TableGen. [wide-arithmetic]: https://github.com/WebAssembly/wide-arithmetic	2024-10-23 11:39:58 -07:00
Heejin Ahn	5c92f2331c	[WebAssembly] Fix MIR printing of reference types (#113028 ) When printing a memory operand in MIR, this line `d37bc32a65/llvm/lib/CodeGen/MachineOperand.cpp (L1247)` calls this `d37bc32a65/llvm/include/llvm/Support/Alignment.h (L238)` which assumes `Rhs` (the size in this case) is positive. But Wasm reference types' size is set to 0: `d37bc32a65/llvm/include/llvm/CodeGen/ValueTypes.td (L326-L328)` `getSize() > 0` condition was added with the Wasm reference types support in `46667a1003`, and it looks like it was removed in #84751. This revives the condition so that Wasm reference types will not crash the MIR printer.	2024-10-22 13:48:00 -07:00
Alex Rønne Petersen	ad4a582fd9	[llvm] Consistently respect `naked` fn attribute in `TargetFrameLowering::hasFP()` (#106014 ) Some targets (e.g. PPC and Hexagon) already did this. I think it's best to do this consistently so that frontend authors don't run into inconsistent results when they emit `naked` functions. For example, in Zig, we had to change our emit code to also set `frame-pointer=none` to get reliable results across targets. Note: I don't have commit access.	2024-10-18 09:35:42 +04:00
Tex Riddell	2bebeea2a1	[WebAssembly] Add atan2 to RuntimeLibcallSignatureTable (#112613 ) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 - `WebAssemblyRuntimeLibcallSignatures.cpp`: Add `RTLIB::ATAN2*` to RuntimeLibcallSignatureTable - Add atan2 calls to `CodeGen/WebAssembly/libcalls-trig.ll` and update test checks Part of: Implement the atan2 HLSL Function #70096.	2024-10-17 10:39:36 -07:00
Yuta Saito	d4efc3e097	[Coverage][WebAssembly] Add initial support for WebAssembly/WASI (#111332 ) Currently, WebAssembly/WASI target does not provide direct support for code coverage. This patch set fixes several issues to unlock the feature. The main changes are: 1. Port `compiler-rt/lib/profile` to WebAssembly/WASI. 2. Adjust profile metadata sections for Wasm object file format. - [CodeGen] Emit `__llvm_covmap` and `__llvm_covfun` as custom sections instead of data segments. - [lld] Align the interval space of custom sections at link time. - [llvm-cov] Copy misaligned custom section data if the start address is not aligned. - [llvm-cov] Read `__llvm_prf_names` from data segments 3. [clang] Link with profile runtime libraries if requested See each commit message for more details and rationale. This is part of the effort to add code coverage support in Wasm target of Swift toolchain.	2024-10-15 02:41:43 +09:00
Heejin Ahn	115cb402d8	[WebAssembly] Don't fold non-nuw add/sub in FastISel (#111278 ) We should not fold one of add/sub operands into a load/store's offset when `nuw` (no unsigned wrap) is not present, because the address calculation, which adds the offset with the operand, does not wrap. This is handled correctly in the normal ISel: `6de5305b3d/llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp (L328-L332)` but not in FastISel. This positivity check in FastISel is not sufficient to avoid this case fully: `6de5305b3d/llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp (L348-L352)` because 1. Even if RHS is within signed int range, depending on the value of the LHS, the resulting value can exceed uint32 max. 2. When one of the operands is a label, `Address` can contain a `GlobalValue` and a `Reg` at the same time, so the `GlobalValue` becomes incorrectly an offset: `6de5305b3d/llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp (L53-L69)` `6de5305b3d/llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp (L409-L417)` Both cases are in the newly added test. We should handle `SUB` too because `SUB` is the same as `ADD` when RHS's sign changes. I checked why our current normal ISel only handles `ADD`, and the reason it's OK for the normal ISel to handle only `ADD` seems to be that DAGCombiner replaces `SUB` with `ADD` here: `6de5305b3d/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (L3904-L3907)` Fixes #111018.	2024-10-09 14:31:16 -07:00
Nikita Popov	4b3ba64ba7	[SCEVExpander] Clear flags when reusing GEP (#109293 ) As pointed out in the review of #102133, SCEVExpander currently incorrectly reuses GEP instructions that have poison-generating flags set. Fix this by clearing the flags on the reused instruction.	2024-10-01 14:22:54 +02:00
Craig Topper	92a8b81bdf	[LegalizeVectorOps] Enable ExpandFABS/COPYSIGN to use integer ops for fixed vectors in some cases. (#109232 ) Copy the same FSUB check from ExpandFNEG to avoid breaking AArch64 and ARM.	2024-09-30 11:44:49 -07:00
Simon Pilgrim	f8f0a266e0	[clang][wasm] Replace the target integer sub saturate intrinsics with the equivalent generic `__builtin_elementwise_sub_sat` intrinsics (#109405 ) Remove the Intrinsic::wasm_sub_sat_signed/wasm_sub_sat_unsigned entries and just use sub_sat_s/sub_sat_u directly	2024-09-22 10:12:41 +01:00
Heejin Ahn	08bba6503b	[WebAssembly] Support binary generation for new EH (#109027 ) This adds support for binary generation for the new EH proposal. So far the only case that we emitted variable immediate operands in binary has been `br_table`'s destinations. (Other `variable_ops` uses in TableGen files are register operands, such as the operands of `call`, so they don't get emitted in binary as a part of the same instruction.) With this PR, variable immediate operands can include `try_table`'s operands: - The number of catch clauses - catch clauses sub-opcodes - `catch`: 0x00 - `catch_ref`: 0x01 - `catch_all`: 0x02 - `catch_all_ref`: 0x03 - catch clauses' destinations With `try_table`, we now have variable expr operands for `try_table`'s catch clauses' tags. We treat their fixups in the same way we do for tags in other instructions such as in `throw`. Diff without whitespace will be easier to view.	2024-09-17 14:58:19 -07:00
Brendan Dahl	c076638c70	[WebAssembly] Support BUILD_VECTOR with F16x8. (#108117 ) Convert BUILD_VECTORS with FP16x8 to I16x8 since there's no FP16 scalar value to initialize v128.const.	2024-09-11 10:00:10 -07:00
Brendan Dahl	415288a2a7	[WebAssembly] Add load and store patterns for V8F16. (#108119 )	2024-09-11 09:53:53 -07:00
Heejin Ahn	6bbf7f06d8	[WebAssembly] Add assembly support for final EH proposal (#107917 ) This adds the basic assembly generation support for the final EH proposal, which was newly adopted in Sep 2023 and advanced into Phase 4 in Jul 2024: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md This adds support for the generation of new `try_table` and `throw_ref` instructions in .s assembly format. This does NOT yet include - Block annotation comment generation for .s format - .o object file generation - .s assembly parsing - Type checking (AsmTypeCheck) - Disassembler - Fixing unwind mismatches in CFGStackify These will be added as follow-up PRs. --- The format for `TRY_TABLE`, both for `MachineInstr` and `MCInst`, is as follows: ``` TRY_TABLE type number_of_catches catch_clauses* ``` where `catch_clause` is ``` catch_opcode tag+ destination ``` `catch_opcode` should be one of 0/1/2/3, which denotes `CATCH`/`CATCH_REF`/`CATCH_ALL`/`CATCH_ALL_REF` respectively. (See `BinaryFormat/Wasm.h`) `tag` exists when the catch is one of `CATCH` or `CATCH_REF`. The MIR format is printed as just the list of raw operands. The (stack-based) assembly instruction supports pretty-printing, including printing `catch` clauses by name, in InstPrinter. In addition to the new instructions `TRY_TABLE` and `THROW_REF`, this adds four pseudo instructions: `CATCH`, `CATCH_REF`, `CATCH_ALL`, and `CATCH_ALL_REF`. These are pseudo instructions to simulate block return values of `catch`, `catch_ref`, `catch_all`, `catch_all_ref` clauses in `try_table` respectively, given that we don't support block return values except for one case (`fixEndsAtEndOfFunction` in CFGStackify). These will be omitted when we lower the instructions to `MCInst` at the end. LateEHPrepare now will have one more stage to convert `CATCH`/`CATCH_ALL`s to `CATCH_REF`/`CATCH_ALL_REF`s when there is a `RETHROW` to rethrow its exception. 
The pass also converts `RETHROW`s into `THROW_REF`. Note that we still use `RETHROW` as an interim pseudo instruction until we convert them to `THROW_REF` in LateEHPrepare. CFGStackify has a new `placeTryTableMarker` function, which places `try_table`/`end_try_table` markers with a necessary `catch` clause and also `block`/`end_block` markers for the destination of the `catch` clause. In MCInstLower, now we need to support one more case for the multivalue block signature (`catch_ref`'s destination's `(i32, exnref)` return type). InstPrinter has a new routine to print the `catch_list` type, which is used to print `try_table` instructions. The new test, `exception.ll`'s source is the same as `exception-legacy.ll`, with the FileCheck expectations changed. One difference is the commands in this file have `-wasm-enable-exnref` to test the new format, and don't have `-wasm-disable-explicit-locals -wasm-keep-registers`, because the new custom InstPrinter routine to print `catch_list` only works for the stack-based instructions (`_S`), and we can't use `-wasm-keep-registers` for them. As in `exception-legacy.ll`, the FileCheck lines for the new tests do not contain the whole program; they mostly contain only the control flow instructions for readability.	2024-09-10 21:32:24 -07:00
anjenner	4af249fe6e	Add usub_cond and usub_sat operations to atomicrmw (#105568 ) These both perform conditional subtraction, returning the minuend and zero respectively, if the difference is negative.	2024-09-06 16:19:20 +01:00
Heejin Ahn	aecbc92410	[WebAssembly] Rename CATCH/CATCH_ALL to *_LEGACY (#107187 ) This renames MIR instruction `CATCH` and `CATCH_ALL` to `CATCH_LEGACY` and `CATCH_ALL_LEGACY` respectively. Follow-up PRs for the new EH (exnref) implementation will use `CATCH`, `CATCH_REF`, `CATCH_ALL`, and `CATCH_ALL_REF` as pseudo-instructions that return extracted values or `exnref` or both, because we don't currently support block return values in LLVM. So to give the old (real) `CATCH`es and the new (pseudo) `CATCH`es different names, this attaches a `_LEGACY` suffix to the old names. This also rearranges `WebAssemblyInstrControl.td` so that the old legacy instructions are listed all together at the end.	2024-09-04 16:14:13 -07:00
Heejin Ahn	b2223b4d7e	[WebAssembly] Rename legacy EH mir tests (#107189 ) We added `-legacy` suffix to the legacy EH `ll` tests in #107166 but forgot to do the same for `mir` tests.	2024-09-04 09:52:42 -07:00
Heejin Ahn	8b28e2ebb3	[WebAssembly] Rename legacy EH tests (#107166 ) Give each test in `cfg-stackify-eh-legacy.ll` a name rather than something like `test5`, because I plan to copy many of these tests into a new file that tests for the new EH (exnref) and some of the tests here are not applicable to the new EH so the numbering will be different, which can make things confusing. Also this removes `test_` prefixes in the test function names in `exception-legacy.ll`, because, well, we all know they are tests.	2024-09-03 21:14:36 -07:00
Brendan Dahl	5703d8572f	[WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (#106465 ) Getting this to work required a few additional changes: - Add builtins for any instructions that can't be done with plain C currently. - Add support for the saturating version of fp_to_<s,i>_I16x8. Other vector sizes supported this already. - Support bitcast of f16x8 to v128. Needed to return a __f16x8 as v128_t.	2024-08-30 08:42:37 -07:00
Brendan Dahl	7d373cef49	[WebAssembly] Change half-precision feature name to fp16. (#105434 ) This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.	2024-08-22 09:44:33 -07:00
Froster	234cb4c6e3	[SelectionDAG] Scalarize binary ops of splats before legal types (#100749 ) Fixes #65072. This allows binary ops of splats to be scalarized if the operation isn't legal on the element type but is legal on the type it will be legalized to. I assume that if an Op is legal in both scalar and vector form, choosing the scalar version should always be better no matter what the type is. There are some cases that my approach can't scalarize, for example: ``` llvm ; test/CodeGen/RISCV/rvv/select-int.ll define <vscale x 4 x i64> @select_nxv4i64(i1 zeroext %c, <vscale x 4 x i64> %a, <vscale x 4 x i64> %b) { %v = select i1 %c, <vscale x 4 x i64> %a, <vscale x 4 x i64> %b ret <vscale x 4 x i64> %v } ``` https://godbolt.org/z/xzqrKrxvK `xor (splat i1, splat i1)` is generated in a late step after LegalizeType, from select. I didn't figure out how to make `xor i1, i1` legal at this time. --------- Co-authored-by: Luke Lau <luke@igalia.com>	2024-08-15 00:07:00 +08:00
Hari Limaye	94473f4db6	[IRBuilder] Generate nuw GEPs for struct member accesses (#99538 ) Generate nuw GEPs for struct member accesses, as inbounds + non-negative implies nuw. Regression tests are updated using update scripts where possible, and by find + replace where not.	2024-08-09 13:25:04 +01:00
Nikita Popov	0564d0665b	[SDAG] Transfer gep nusw/nuw to SDAG The resulting add is nuw if either the gep was nuw or it was nusw+nneg. Previously only inbounds+nneg was handled. Test via wasm load offsets, which seems to most directly expose these SDAG flags.	2024-08-07 09:26:10 +02:00
Sam Parker	76c4529515	[WebAssembly] Fix assertion in LowerBUILD_VECTOR (#101961 ) The assertion was failing in the case where we were trying to lower to loadxx_zero, but lane zero was undef.	2024-08-05 14:38:12 -07:00
Sam Parker	08decd20a9	[WebAssembly] load_zero to initialise build_vector (#100610 ) Instead of splatting a single lane, to initialise a build_vector, lower to scalar_to_vector which can be selected to load_zero. Also add load_zero and load_lane patterns for f32x4 and f64x2.	2024-08-02 10:11:21 +01:00
Heejin Ahn	0af7542135	Reapply "[WebAssembly] Fix phi handling for Wasm SjLj (#99730 )" This reapplies #99730. #99730 contained a nondeterministic iteration which failed the reverse-iteration bot (https://lab.llvm.org/buildbot/#/builders/110/builds/474) and was reverted in `f3f0d9928f`. The fix is to make the order of iteration of new predecessors deterministic by using `SmallSetVector`. ```diff --- a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp +++ b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp @@ -1689,7 +1689,7 @@ void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForWasmSjLj( } } - SmallDenseMap<BasicBlock *, SmallPtrSet<BasicBlock *, 4>, 4> + SmallDenseMap<BasicBlock *, SmallSetVector<BasicBlock *, 4>, 4> UnwindDestToNewPreds; for (auto *CI : LongjmpableCalls) { // Even if the callee function has attribute 'nounwind', which is true for ```	2024-07-25 00:00:59 +00:00
Brendan Dahl	0dbd72d6ab	[WebAssembly] Implement f16x8.replace_lane instruction. (#99388 ) Use a builtin and intrinsic until half types are better supported for instruction selection.	2024-07-24 11:55:36 -07:00
Sam Parker	a3de21cac1	[WebAssembly] Ofast pmin/pmax pattern matchers (#100107 ) With fast-math, the ordered setcc nodes are converted to setcc nodes which do not care about NaNs, so add patterns that use setlt, setle, setgt and setge.	2024-07-24 09:23:49 +01:00
Heejin Ahn	f3f0d9928f	Revert "[WebAssembly] Fix phi handling for Wasm SjLj (#99730 )" This reverts commit 2bf71b8bc851b49745b795f228037db159005570. This broke the buildbot at https://lab.llvm.org/buildbot/#/builders/110/builds/474.	2024-07-24 00:14:58 +00:00
Heejin Ahn	2bf71b8bc8	[WebAssembly] Fix phi handling for Wasm SjLj (#99730 ) In Wasm SjLj, longjmpable `call`s in functions that call `setjmp` are converted into `invoke`s. Those `invoke`s are meant to unwind to `catch.dispatch.longjmp` to figure out which `setjmp` those `longjmp` buffers belong to: `fada922732/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L250-L260)` But in case a longjmpable call is within another `catchpad` or `cleanuppad` scope, to maintain the nested scope structure, we should make them unwind to the scope's next unwind destination and not directly to `catch.dispatch.longjmp`: `fada922732/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L1698-L1727)` In this case the longjmps will eventually unwind to `catch.dispatch.longjmp` and be handled there. In this case, it is possible that the unwind destination (which is an existing `catchpad` or `cleanuppad`) may already have `phi`s. And because the unwind destinations get new predecessors because of the newly created `invoke`s, those `phi`s need to have new entries for those new predecessors. This adds new preds as new incoming blocks to those `phi`s, and we use a separate `SSAUpdater` to calculate the correct incoming values to those blocks. I have assumed `SSAUpdaterBulk` used in `rebuildSSA` would take care of these things, but apparently it doesn't. It takes available defs and adds `phi`s in the defs' dominance frontiers, i.e., where each def's dominance ends, and rewrites other uses based on the newly added `phi`s. But it doesn't add entries to existing `phi`s, and the case in this bug may not even involve dominance frontiers; this bug is simply about existing `phi`s that have gained new preds and need new entries for them. It is kind of surprising that this bug was only reported recently, given that this pass has not been changed much in years. Fixes #97496 and fixes https://github.com/emscripten-core/emscripten/issues/22170.	2024-07-23 16:06:00 -07:00
Heejin Ahn	735852f5ab	[WebAssembly] Enable simd128 when relaxed-simd is set in AsmPrinter (#99803 ) Even though in `Subtarget` we defined `SIMDLevel` as a number so `hasRelaxedSIMD` automatically means `hasSIMD128`, `0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h (L36-L40)` `0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h (L107)` specifying only `relaxed-simd` feature on a program that needs `simd128` instructions to compile fails, because of this query in `AsmPrinter`: `d0d05aec3b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (L644-L645)` This `verifyInstructionPredicates` function (and other functions called by this function) is generated by https://github.com/llvm/llvm-project/blob/main/llvm/utils/TableGen/InstrInfoEmitter.cpp, and looks like this (you can check it in the `lib/Target/WebAssembly/WebAssemblyGenInstrInfo.inc` in your build directory): ```cpp void verifyInstructionPredicates( unsigned Opcode, const FeatureBitset &Features) { FeatureBitset AvailableFeatures = computeAvailableFeatures(Features); FeatureBitset RequiredFeatures = computeRequiredFeatures(Opcode); FeatureBitset MissingFeatures = (AvailableFeatures & RequiredFeatures) ^ RequiredFeatures; ... } ``` And `computeAvailableFeatures` is just a set query, like this: ```cpp inline FeatureBitset computeAvailableFeatures(const FeatureBitset &FB) { FeatureBitset Features; if (FB[WebAssembly::FeatureAtomics]) Features.set(Feature_HasAtomicsBit); if (FB[WebAssembly::FeatureBulkMemory]) Features.set(Feature_HasBulkMemoryBit); if (FB[WebAssembly::FeatureExceptionHandling]) Features.set(Feature_HasExceptionHandlingBit); ... ``` So this is how currently `HasSIMD128` is defined: `0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td (L79-L81)` The things being checked in this `computeAvailableFeatures`, and in turn in `AsmPrinter`, are `AssemblerPredicate`s. 
These only check which bits are set in the features set and are different from `Predicate`s, which can call `Subtarget` functions like `Subtarget->hasSIMD128()`. But apparently we can use `all_of` and `any_of` directives in `AssemblerPredicate`, so we can make `simd128`'s `AssemblerPredicate` hold whenever `relaxed-simd` is set, by making its condition an 'or' of the two. Fixes #98502.	2024-07-23 11:50:56 -07:00
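The check described in the commit above can be modelled with a two-bit feature set. This is only a sketch: the names `Feature`, `Bits`, `availableStrict`, and `availableAnyOf` are hypothetical and are not taken from the generated `WebAssemblyGenInstrInfo.inc`. Only the `(Available & Required) ^ Required` formula comes from the commit message. It illustrates why an opcode that requires `simd128` is reported as missing a feature when only `relaxed-simd` is set, and how an 'or' condition avoids that.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical two-feature model; the real backend has many more bits.
enum Feature { SIMD128 = 0, RelaxedSIMD = 1, NumFeatures = 2 };
using Bits = std::bitset<NumFeatures>;

// Same formula as in the commit message: required bits that are not available.
Bits missingFeatures(const Bits &Available, const Bits &Required) {
  return (Available & Required) ^ Required;
}

// Before the fix (assumed shape): each predicate mirrors exactly one bit.
Bits availableStrict(const Bits &FB) {
  Bits A;
  if (FB[SIMD128]) A.set(SIMD128);
  if (FB[RelaxedSIMD]) A.set(RelaxedSIMD);
  return A;
}

// After the fix (assumed shape): simd128 holds if either bit is set.
Bits availableAnyOf(const Bits &FB) {
  Bits A;
  if (FB[SIMD128] || FB[RelaxedSIMD]) A.set(SIMD128);
  if (FB[RelaxedSIMD]) A.set(RelaxedSIMD);
  return A;
}

// Only relaxed-simd is enabled, but the opcode requires simd128.
void checkRelaxedSimdImpliesSimd128() {
  Bits FB, Required;
  FB.set(RelaxedSIMD);
  Required.set(SIMD128);
  assert(missingFeatures(availableStrict(FB), Required).any());
  assert(missingFeatures(availableAnyOf(FB), Required).none());
}
```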
Farzon Lotfi	def3944df8	[WebAssembly] Add Support for Arc and Hyperbolic trig llvm intrinsics (#98755 ) ## Change: - WebAssemblyRuntimeLibcallSignatures.cpp: Expose the RTLIB's for use by WASM - Add trig specific test cases ## History This change is part of an implementation of https://github.com/llvm/llvm-project/issues/87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`. https://github.com/llvm/llvm-project/issues/70079 https://github.com/llvm/llvm-project/issues/70080 https://github.com/llvm/llvm-project/issues/70081 https://github.com/llvm/llvm-project/issues/70083 https://github.com/llvm/llvm-project/issues/70084 https://github.com/llvm/llvm-project/issues/95966 ## Why WebAssembly? In past attempts to support constrained intrinsics, the changes that made the trig builtins emit intrinsics/constrained intrinsics broke the WASM build. This is an attempt to preempt any such build break. - https://github.com/llvm/llvm-project/pull/95082 - https://github.com/llvm/llvm-project/pull/94559#issuecomment-2159923215	2024-07-19 10:18:58 -04:00
Sam Parker	d28ed29d6b	[TTI][WebAssembly] Pairwise reduction expansion (#93948 ) WebAssembly doesn't support horizontal operations nor does it have a way of expressing fast-math or reassoc flags, so runtimes are currently unable to use pairwise operations when generating code from the existing shuffle patterns. This patch allows the backend to select which (arbitrary) shuffle pattern is used per reduction intrinsic. The default behaviour is the same as the existing one, which splits the vector into a top and bottom half. The other pattern introduced is for a pairwise shuffle. WebAssembly enables pairwise reductions for int/fp add/sub.	2024-07-17 09:21:52 +01:00
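The two shuffle patterns named in the commit above can be pictured on a plain array. This is a scalar sketch, not the backend's code: the function names and the exact lane layout of the pairwise variant are assumptions made for illustration. Both orders give the same result for integer addition; they differ only in which lanes are combined at each step.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Split-halves pattern: add the top half onto the bottom half, then repeat
// on the surviving half until one lane is left.
template <std::size_t N>
int reduceSplitHalves(std::array<int, N> V) {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  for (std::size_t Width = N / 2; Width >= 1; Width /= 2)
    for (std::size_t I = 0; I < Width; ++I)
      V[I] += V[I + Width];
  return V[0];
}

// Pairwise pattern (assumed layout): add adjacent lanes first, then lanes
// two apart, then four apart, and so on.
template <std::size_t N>
int reducePairwise(std::array<int, N> V) {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  for (std::size_t Stride = 1; Stride < N; Stride *= 2)
    for (std::size_t I = 0; I + Stride < N; I += 2 * Stride)
      V[I] += V[I + Stride];
  return V[0];
}

// Integer addition is associative, so both orders agree.
void checkReductionsAgree() {
  std::array<int, 8> V{1, 2, 3, 4, 5, 6, 7, 8};
  assert(reduceSplitHalves(V) == reducePairwise(V));
}
```

For floating-point lanes the two orders can round differently, which is consistent with the commit's remark that the choice matters when fast-math or reassoc flags cannot be expressed.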
Volodymyr Vasylkun	e094abde42	[SelectionDAG] Expand [US]CMP using arithmetic on boolean values instead of selects (#98774 ) The previous expansion of [US]CMP was done using two selects and two compares. It produced decent code, but on many platforms it is better to implement [US]CMP nodes by performing the following operation: ``` [us]cmp(x, y) = (x [us]> y) - (x [us]< y) ``` This patch adds this new expansion, as well as a hook in TargetLowering to allow some targets to still use the select-based approach. AArch64 and SystemZ are currently the only targets to prefer the former approach, but other targets may also start to use it if it provides for better codegen.	2024-07-16 20:56:18 +01:00
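The identity in the commit above, `[us]cmp(x, y) = (x [us]> y) - (x [us]< y)`, can be checked in scalar C++. This is a minimal sketch with hypothetical function names, not the SelectionDAG expansion itself. It compares a select-style form against the arithmetic-on-booleans form.

```cpp
#include <cassert>
#include <cstdint>

// Select-based form: two compares feeding two selects.
constexpr int scmpSelect(int32_t X, int32_t Y) {
  return X < Y ? -1 : (X > Y ? 1 : 0);
}

// Arithmetic form: each compare yields 0 or 1, and their difference is
// -1, 0, or 1.
constexpr int scmpArith(int32_t X, int32_t Y) {
  return static_cast<int>(X > Y) - static_cast<int>(X < Y);
}

// Unsigned variant: same formula, with unsigned comparisons.
constexpr int ucmpArith(uint32_t X, uint32_t Y) {
  return static_cast<int>(X > Y) - static_cast<int>(X < Y);
}

// The two signed forms agree on a few representative inputs.
void checkAgreement() {
  const int32_t Samples[] = {INT32_MIN, -1, 0, 1, INT32_MAX};
  for (int32_t X : Samples)
    for (int32_t Y : Samples)
      assert(scmpArith(X, Y) == scmpSelect(X, Y));
}
```

The same bit pattern can order differently under the two variants: `0xFFFFFFFF` is `-1` when signed and the largest value when unsigned.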
Heejin Ahn	fb6e024f49	[WebAssembly] Update generic and bleeding-edge CPUs (#96584 ) This updates the list of features in 'generic' and 'bleeding-edge' CPUs in the backend to match `4e0a0eae58/clang/lib/Basic/Targets/WebAssembly.cpp (L150-L178)` This updates existing CodeGen tests in a way that, if a test has separate RUN lines for a reference-types test and a non-reference-types test, I added -mattr=-reference-types to the no-reftype test's RUN command line. I didn't delete existing -mattr=+reference-types lines in reftype tests because having it helps readability. Also, when tests are not really about reference-types but have to be updated because they happen to contain call_indirect lines (call_indirect now takes __indirect_function_table as an argument), I just added the table argument to the expected output. `target-features-cpus.ll` has been updated to reflect the newly added features.	2024-07-01 19:12:01 -07:00
Heejin Ahn	a54704de0d	[WebAssembly] Split and tidy up target features test (#96735 ) This splits `target-features.ll` into two tests: `target-features-attrs.ll` and `target-features-cpus.ll`. Now `target-features-attrs.ll` contains tests with bitcode function attributes and `-mattr=` options. The current `target-features.ll` file's FileCheck lines are confusing, mainly because it is unclear how `CHECK` and `ATTRS` lines are meant to be different. Turns out, before `67ec8744d7`, `-mattr=` options used to override any existing bitcode function attributes, but after the commit that's not the case anymore. So the original test had a line that tested `i32.atomic.rmw.cmpxchg` was not generated when `-mattr=+simd128` was given (because the existing `+atomics` in the function attributes is overridden). That commit deleted that line and changed some `ATTRS` lines into `CHECK`, which was confusing. This PR simplifies that part and does not test the absence of any instructions, and the effect of `-mattr=` option is only tested with the target features section. And `target-features-cpus.ll` only tests the sets of features enabled by `-mcpu=` lines. It is better to have this as a separate file because once you have bitcode function attributes they end up in the target features section too, making the testing of only the `-mcpu=` options difficult.	2024-06-26 13:28:55 -07:00
Heejin Ahn	1822e3183d	[WebAssembly] Rename target-features.ll (#96716 ) I'm planning on a PR that splits `target-features.ll` into two different files and fixes some other stuff in them: - `target-features-attrs.ll` that tests target features by bitcode function attributes and `-mattr=` options - `target-features-cpus.ll` that tests target features by `-mcpu=` options But `target-features-attrs.ll` will share a bulk of the lines with the current `target-features.ll`. And if I remove `target-features.ll` and create the two new files in a single PR, git doesn't recognize either of them as a copy (I hoped at least `target-features-attrs.ll` would be recognized as a copy because it shares many lines with the current file). So to make the diff smaller and easier to review, I'm renaming the file first. I'll follow up with the PR that does the actual splitting.	2024-06-25 23:14:04 -07:00
Brendan Dahl	928b780840	[WebAssembly] Implement trunc_sat and convert instructions for f16x8. (#95180 ) These instructions can be generated using regular LL intrinsics. Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-06-25 10:39:05 -07:00
Heejin Ahn	3c8f3b91d8	[WebAssembly] Treat 'rethrow' as terminator in custom isel (#95967 ) `rethrow` instruction is a terminator, but when its DAG is built in `SelectionDAGBuilder` in a custom routine, it was NOT treated as such. ```ll rethrow: ; preds = %catch.start invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ] to label %unreachable unwind label %ehcleanup ehcleanup: ; preds = %rethrow, %catch.dispatch %tmp = phi i32 [ 10, %catch.dispatch ], [ 20, %rethrow ] ... ``` In this bitcode, because of the `phi`, a `CONST_I32` will be created in the `rethrow` BB. Without this patch, the DAG for the `rethrow` BB looks like this: ``` t0: ch,glue = EntryToken t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20> t5: ch = llvm.wasm.rethrow t0, TargetConstant:i32<12161> t6: ch = TokenFactor t3, t5 t8: ch = br t6, BasicBlock:ch<unreachable 0x562532e43c50> ``` Note that `CopyToReg` and `llvm.wasm.rethrow` don't depend on each other, so either can come first in the selected code, which can result in code like ```mir bb.3.rethrow: RETHROW 0, implicit-def dead $arguments %9:i32 = CONST_I32 20, implicit-def dead $arguments BR %bb.6, implicit-def dead $arguments ``` After this patch, `llvm.wasm.rethrow` is treated as a terminator, and the DAG will look like ``` t0: ch,glue = EntryToken t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20> t5: ch = llvm.wasm.rethrow t3, TargetConstant:i32<12161> t7: ch = br t5, BasicBlock:ch<unreachable 0x5555e3d32c70> ``` Note that now `rethrow` takes a token from `CopyToReg`, so `rethrow` has to come after `CopyToReg`. And the resulting code will be ```mir bb.3.rethrow: %9:i32 = CONST_I32 20, implicit-def dead $arguments RETHROW 0, implicit-def dead $arguments BR %bb.6, implicit-def dead $arguments ``` I'm not very familiar with the internals of `getRoot` vs. `getControlRoot`, but other terminator instructions seem to use the latter, and using it for `rethrow` too worked.	2024-06-18 21:56:41 -07:00
Farzon Lotfi	6355fb45a5	[CodeGen] Support vectors across all backends (#95518 ) Add a default f16 type promotion	2024-06-14 17:18:20 -04:00

1190 Commits