llvm-project

Author	SHA1	Message	Date
Brendan Dahl	5703d8572f	[WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (#106465 ) Getting this to work required a few additional changes: - Add builtins for any instructions that can't be done with plain C currently. - Add support for the saturating version of fp_to_<s,i>_I16x8. Other vector sizes supported this already. - Support bitcast of f16x8 to v128. Needed to return a __f16x8 as v128_t.	2024-08-30 08:42:37 -07:00
Brendan Dahl	7d373cef49	[WebAssembly] Change half-precision feature name to fp16. (#105434 ) This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.	2024-08-22 09:44:33 -07:00
Froster	234cb4c6e3	[SelectionDAG] Scalarize binary ops of splats before legal types (#100749 ) Fixes #65072. This allows binary ops of splats to be scalarized if the operation isn't legal on the element type, but is legal on the type it will be legalized to. I assume that if an op is legal in both scalar and vector form, choosing the scalar version should always be better no matter what the type is. There are some cases that my approach can't scalarize, for example: ``` llvm ; test/CodeGen/RISCV/rvv/select-int.ll define <vscale x 4 x i64> @select_nxv4i64(i1 zeroext %c, <vscale x 4 x i64> %a, <vscale x 4 x i64> %b) { %v = select i1 %c, <vscale x 4 x i64> %a, <vscale x 4 x i64> %b ret <vscale x 4 x i64> %v } ``` https://godbolt.org/z/xzqrKrxvK `xor (splat i1, splat i1)` is generated in a late step after LegalizeType, from select. I didn't figure out how to make `xor i1, i1` legal at this time. --------- Co-authored-by: Luke Lau <luke@igalia.com>	2024-08-15 00:07:00 +08:00
Hari Limaye	94473f4db6	[IRBuilder] Generate nuw GEPs for struct member accesses (#99538 ) Generate nuw GEPs for struct member accesses, as inbounds + non-negative implies nuw. Regression tests are updated using update scripts where possible, and by find + replace where not.	2024-08-09 13:25:04 +01:00
Nikita Popov	0564d0665b	[SDAG] Transfer gep nusw/nuw to SDAG The resulting add is nuw if either the gep was nuw or it was nusw+nneg. Previously only inbounds+nneg was handled. Test via wasm load offsets, which seems to most directly expose these SDAG flags.	2024-08-07 09:26:10 +02:00
Sam Parker	76c4529515	[WebAssembly] Fix assertion in LowerBUILD_VECTOR (#101961 ) The assertion was failing in the case where we were trying to lower to loadxx_zero, but lane zero was undef.	2024-08-05 14:38:12 -07:00
Sam Parker	08decd20a9	[WebAssembly] load_zero to initialise build_vector (#100610 ) Instead of splatting a single lane, to initialise a build_vector, lower to scalar_to_vector which can be selected to load_zero. Also add load_zero and load_lane patterns for f32x4 and f64x2.	2024-08-02 10:11:21 +01:00
Heejin Ahn	0af7542135	Reapply "[WebAssembly] Fix phi handling for Wasm SjLj (#99730 )" This reapplies #99730. #99730 contained a nondeterministic iteration which failed the reverse-iteration bot (https://lab.llvm.org/buildbot/#/builders/110/builds/474) and was reverted in `f3f0d9928f`. The fix is to make the order of iteration of new predecessors deterministic by using `SmallSetVector`. ```diff --- a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp +++ b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp @@ -1689,7 +1689,7 @@ void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForWasmSjLj( } } - SmallDenseMap<BasicBlock *, SmallPtrSet<BasicBlock *, 4>, 4> + SmallDenseMap<BasicBlock *, SmallSetVector<BasicBlock *, 4>, 4> UnwindDestToNewPreds; for (auto *CI : LongjmpableCalls) { // Even if the callee function has attribute 'nounwind', which is true for ```	2024-07-25 00:00:59 +00:00
Brendan Dahl	0dbd72d6ab	[WebAssembly] Implement f16x8.replace_lane instruction. (#99388 ) Use a builtin and intrinsic until half types are better supported for instruction selection.	2024-07-24 11:55:36 -07:00
Sam Parker	a3de21cac1	[WebAssembly] Ofast pmin/pmax pattern matchers (#100107 ) With fast-math, the ordered setcc nodes are converted to setcc nodes which do not care about NaNs, so add patterns that use setlt, setle, setgt and setge.	2024-07-24 09:23:49 +01:00
Heejin Ahn	f3f0d9928f	Revert "[WebAssembly] Fix phi handling for Wasm SjLj (#99730 )" This reverts commit 2bf71b8bc851b49745b795f228037db159005570. This broke the buildbot at https://lab.llvm.org/buildbot/#/builders/110/builds/474.	2024-07-24 00:14:58 +00:00
Heejin Ahn	2bf71b8bc8	[WebAssembly] Fix phi handling for Wasm SjLj (#99730 ) In Wasm SjLj, longjmpable `call`s in functions that call `setjmp` are converted into `invoke`s. Those `invoke`s are meant to unwind to `catch.dispatch.longjmp` to figure out which `setjmp` those `longjmp` buffers belong to: `fada922732/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L250-L260)` But in case a longjmpable call is within another `catchpad` or `cleanuppad` scope, to maintain the nested scope structure, we should make them unwind to the scope's next unwind destination and not directly to `catch.dispatch.longjmp`: `fada922732/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L1698-L1727)` In this case the longjmps will eventually unwind to `catch.dispatch.longjmp` and be handled there. In this case, it is possible that the unwind destination (which is an existing `catchpad` or `cleanuppad`) may already have `phi`s. And because the unwind destinations get new predecessors because of the newly created `invoke`s, those `phi`s need to have new entries for those new predecessors. This adds new preds as new incoming blocks to those `phi`s, and we use a separate `SSAUpdater` to calculate the correct incoming values to those blocks. I have assumed `SSAUpdaterBulk` used in `rebuildSSA` would take care of these things, but apparently it doesn't. It takes available defs and adds `phi`s in the defs' dominance frontiers, i.e., where each def's dominance ends, and rewrites other uses based on the newly added `phi`s. But it doesn't add entries to existing `phi`s, and the case in this bug may not even involve dominance frontiers; this bug is simply that existing `phi`s that have gained new preds need new entries for them. It is kind of surprising that this bug was only reported recently, given that this pass has not been changed much in years. Fixes #97496 and fixes https://github.com/emscripten-core/emscripten/issues/22170.	2024-07-23 16:06:00 -07:00
Heejin Ahn	735852f5ab	[WebAssembly] Enable simd128 when relaxed-simd is set in AsmPrinter (#99803 ) Even though in `Subtarget` we defined `SIMDLevel` as a number so `hasRelaxedSIMD` automatically means `hasSIMD128`, `0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h (L36-L40)` `0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h (L107)` specifying only `relaxed-simd` feature on a program that needs `simd128` instructions to compile fails, because of this query in `AsmPrinter`: `d0d05aec3b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (L644-L645)` This `verifyInstructionPredicates` function (and other functions called by this function) is generated by https://github.com/llvm/llvm-project/blob/main/llvm/utils/TableGen/InstrInfoEmitter.cpp, and looks like this (you can check it in the `lib/Target/WebAssembly/WebAssemblyGenInstrInfo.inc` in your build directory): ```cpp void verifyInstructionPredicates( unsigned Opcode, const FeatureBitset &Features) { FeatureBitset AvailableFeatures = computeAvailableFeatures(Features); FeatureBitset RequiredFeatures = computeRequiredFeatures(Opcode); FeatureBitset MissingFeatures = (AvailableFeatures & RequiredFeatures) ^ RequiredFeatures; ... } ``` And `computeAvailableFeatures` is just a set query, like this: ```cpp inline FeatureBitset computeAvailableFeatures(const FeatureBitset &FB) { FeatureBitset Features; if (FB[WebAssembly::FeatureAtomics]) Features.set(Feature_HasAtomicsBit); if (FB[WebAssembly::FeatureBulkMemory]) Features.set(Feature_HasBulkMemoryBit); if (FB[WebAssembly::FeatureExceptionHandling]) Features.set(Feature_HasExceptionHandlingBit); ... ``` So this is how currently `HasSIMD128` is defined: `0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td (L79-L81)` The things being checked in this `computeAvailableFeatures`, and in turn in `AsmPrinter`, are `AssemblerPredicate`s. 
These only check which bits are set in the features set and are different from `Predicate`s, which can call `Subtarget` functions like `Subtarget->hasSIMD128()`. But apparently we can use `all_of` and `any_of` directives in `AssemblerPredicate`, and we can make `simd128`'s `AssemblerPredicate` be set if `relaxed-simd` is set by writing the condition as an 'or' of the two. Fixes #98502.	2024-07-23 11:50:56 -07:00
Farzon Lotfi	def3944df8	[WebAssembly] Add Support for Arc and Hyperbolic trig llvm intrinsics (#98755 ) ## Change: - WebAssemblyRuntimeLibcallSignatures.cpp: Expose the RTLIB's for use by WASM - Add trig specific test cases ## History This change is part of an implementation of https://github.com/llvm/llvm-project/issues/87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`. https://github.com/llvm/llvm-project/issues/70079 https://github.com/llvm/llvm-project/issues/70080 https://github.com/llvm/llvm-project/issues/70081 https://github.com/llvm/llvm-project/issues/70083 https://github.com/llvm/llvm-project/issues/70084 https://github.com/llvm/llvm-project/issues/95966 ## Why WebAssembly? In past attempts to support constrained intrinsics, changing the trig builtins to emit intrinsics/constrained intrinsics broke the WASM build. This is an attempt to preempt any such build break. - https://github.com/llvm/llvm-project/pull/95082 - https://github.com/llvm/llvm-project/pull/94559#issuecomment-2159923215	2024-07-19 10:18:58 -04:00
Sam Parker	d28ed29d6b	[TTI][WebAssembly] Pairwise reduction expansion (#93948 ) WebAssembly doesn't support horizontal operations nor does it have a way of expressing fast-math or reassoc flags, so runtimes are currently unable to use pairwise operations when generating code from the existing shuffle patterns. This patch allows the backend to select which, arbitrary, shuffle pattern to be used per reduction intrinsic. The default behaviour is the same as the existing, which is by splitting the vector into a top and bottom half. The other pattern introduced is for a pairwise shuffle. WebAssembly enables pairwise reductions for int/fp add/sub.	2024-07-17 09:21:52 +01:00
Volodymyr Vasylkun	e094abde42	[SelectionDAG] Expand [US]CMP using arithmetic on boolean values instead of selects (#98774 ) The previous expansion of [US]CMP was done using two selects and two compares. It produced decent code, but on many platforms it is better to implement [US]CMP nodes by performing the following operation: ``` [us]cmp(x, y) = (x [us]> y) - (x [us]< y) ``` This patch adds this new expansion, as well as a hook in TargetLowering to allow some targets to still use the select-based approach. AArch64 and SystemZ are currently the only targets to prefer the former approach, but other targets may also start to use it if it provides for better codegen.	2024-07-16 20:56:18 +01:00
Heejin Ahn	fb6e024f49	[WebAssembly] Update generic and bleeding-edge CPUs (#96584 ) This updates the list of features in 'generic' and 'bleeding-edge' CPUs in the backend to match `4e0a0eae58/clang/lib/Basic/Targets/WebAssembly.cpp (L150-L178)` This updates existing CodeGen tests in a way that, if a test has separate RUN lines for a reference-types test and a non-reference-types test, I added -mattr=-reference-types to the no-reftype test's RUN command line. I didn't delete existing -mattr=+reference-types lines in reftype tests because having it helps readability. Also, when a test is not really about reference-types but has to be updated because it happens to contain call_indirect lines, since call_indirect will now take __indirect_function_table as an argument, I just added the table argument to the expected output. `target-features-cpus.ll` has been updated reflecting the newly added features.	2024-07-01 19:12:01 -07:00
Heejin Ahn	a54704de0d	[WebAssembly] Split and tidy up target features test (#96735 ) This splits `target-features.ll` into two tests: `target-features-attrs.ll` and `target-features-cpus.ll`. Now `target-features-attrs.ll` contains tests with bitcode function attributes and `-mattr=` options. The current `target-features.ll` file's FileCheck lines are confusing, mainly because it is unclear how `CHECK` and `ATTRS` lines are meant to be different. Turns out, before `67ec8744d7`, `-mattr=` options used to override any existing bitcode function attributes, but after the commit that's not the case anymore. So the original test had a line that tested `i32.atomic.rmw.cmpxchg` was not generated when `-mattr=+simd128` was given (because the existing `+atomics` in the function attributes is overridden). That commit deleted that line and changed some `ATTRS` lines into `CHECK`, which was confusing. This PR simplifies that part and does not test the absence of any instructions, and the effect of `-mattr=` option is only tested with the target features section. And `target-features-cpus.ll` only tests the sets of features enabled by `-mcpu=` lines. It is better to have this as a separate file because once you have bitcode function attributes they end up in the target features section too, making the testing of only the `-mcpu=` options difficult.	2024-06-26 13:28:55 -07:00
Heejin Ahn	1822e3183d	[WebAssembly] Rename target-features.ll (#96716 ) I'm planning on a PR that splits `target-features.ll` into two different files and fix some other stuff on them: - `target-features-attrs.ll` that tests target features by bitcode function attributes and `-mattr=` options - `target-features-cpus.ll` that tests target features by `-mcpu=` options But `target-features-attrs.ll` will share a bulk of the lines with the current `target-features.ll`. And if I remove `target-features.ll` and create the two new files in a single PR, git doesn't recognize either of them as a copy (I hoped at least `target-features-attrs.ll` would be recognized as a copy because it shares many lines with the current file) So to make the diff smaller and easier to review, I'm renaming the file first. I'll follow up with the PR that does the actual splitting.	2024-06-25 23:14:04 -07:00
Brendan Dahl	928b780840	[WebAssembly] Implement trunc_sat and convert instructions for f16x8. (#95180 ) These instructions can be generated using regular LL intrinsics. Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-06-25 10:39:05 -07:00
Heejin Ahn	3c8f3b91d8	[WebAssembly] Treat 'rethrow' as terminator in custom isel (#95967 ) `rethrow` instruction is a terminator, but when its DAG is built in `SelectionDAGBuilder` in a custom routine, it was NOT treated as such. ```ll rethrow: ; preds = %catch.start invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ] to label %unreachable unwind label %ehcleanup ehcleanup: ; preds = %rethrow, %catch.dispatch %tmp = phi i32 [ 10, %catch.dispatch ], [ 20, %rethrow ] ... ``` In this bitcode, because of the `phi`, a `CONST_I32` will be created in the `rethrow` BB. Without this patch, the DAG for the `rethrow` BB looks like this: ``` t0: ch,glue = EntryToken t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20> t5: ch = llvm.wasm.rethrow t0, TargetConstant:i32<12161> t6: ch = TokenFactor t3, t5 t8: ch = br t6, BasicBlock:ch<unreachable 0x562532e43c50> ``` Note that `CopyToReg` and `llvm.wasm.rethrow` don't have dependence so either can come first in the selected code, which can result in the code like ```mir bb.3.rethrow: RETHROW 0, implicit-def dead $arguments %9:i32 = CONST_I32 20, implicit-def dead $arguments BR %bb.6, implicit-def dead $arguments ``` After this patch, `llvm.wasm.rethrow` is treated as a terminator, and the DAG will look like ``` t0: ch,glue = EntryToken t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20> t5: ch = llvm.wasm.rethrow t3, TargetConstant:i32<12161> t7: ch = br t5, BasicBlock:ch<unreachable 0x5555e3d32c70> ``` Note that now `rethrow` takes a token from `CopyToReg`, so `rethrow` has to come after `CopyToReg`. And the resulting code will be ```mir bb.3.rethrow: %9:i32 = CONST_I32 20, implicit-def dead $arguments RETHROW 0, implicit-def dead $arguments BR %bb.6, implicit-def dead $arguments ``` I'm not very familiar with the internals of `getRoot` vs. `getControlRoot`, but other terminator instructions seem to use the latter, and using it for `rethrow` too worked.	2024-06-18 21:56:41 -07:00
Farzon Lotfi	6355fb45a5	[CodeGen] Support vectors across all backends (#95518 ) Add a default f16 type promotion	2024-06-14 17:18:20 -04:00
Farzon Lotfi	38ccee0034	[WASM] Fix for wasi libc build break add tan to RuntimeLibcallSignatureTable (#95082 ) The wasm backend fetches the tan runtime lib call in `llvm/include/llvm/IR/RuntimeLibcalls.def` via `StaticLibcallNameMap()`, but ignores the runtime function because a function signature mapping is not specified in RuntimeLibcallSignatureTable(). The fix is to specify the function signatures for float32-128. This is a fix for a build break reported on PR https://github.com/llvm/llvm-project/pull/94559#issuecomment-2159923215.	2024-06-11 10:43:51 -04:00
Matt Arsenault	84b026690d	DAG: Pass flags to FoldConstantArithmetic (#93663 ) There is simply way too much going on inside getNode. The complicated constant folding of vector handling works by looking for build_vector operands, and then tries to getNode the scalar element and then checks if constants were the result. As a side effect, this produces unused scalar operation nodes (previously, without flags). If the vector operation were later scalarized, it would find the flagless constant folding temporary and lose the flag. I don't think this is a reasonable way for constant folding to operate, but for now fix this by ensuring flags on the original operation are preserved in the temporary. This yields a clear code improvement for AMDGPU when f16 isn't legal. The Wasm cases switch from using a libcall to compare and select. We are evidently missing the fcmp+select to fminimum/fmaximum handling, but this would be further improved when that's handled. AArch64 also avoids the libcall, but looks worse and has a different call for some reason.	2024-06-06 16:44:07 +02:00
Jon Chesterfield	8516f54e6a	[AMDGPU] Implement variadic functions by IR lowering (#93362 ) This is a mostly-target-independent variadic function optimisation and lowering pass. It is only enabled for AMDGPU in this initial commit. The purpose is to make C style variadic functions a zero cost abstraction. They are lowered to equivalent IR which is then amenable to other optimisations. This is inherently slightly target specific but much less so than one might expect - the C varargs interface heavily constrains the ABI design divergence. The pass is primarily tested from webassembly. This is because wasm has a straightforward variadic lowering strategy which coincides exactly with what this pass transforms code into and a struct passing convention with few cases to check. Adding further target conventions is straightforward and elided from this patch primarily to simplify the review. Implemented in other branches are Linux X86, AMD64, AArch64 and NVPTX. Testing for targets that have existing lowering for va_arg from clang is most efficiently done by checking that clang | opt completely elides the variadic syntax from test cases. The lowering produces a struct for each call site which can be inspected to check the various alignments and indirections are correct. AMDGPU presently has no variadic support other than some ad hoc printf handling. Combined with the pass being inactive on all other targets, landing this represents a strict increase in capability with zero risk. Testing and refining will continue post commit. In addition to the compiler tests included here, a self contained x64 clang/musl toolchain was constructed using the "lowering" instead of the systemv ABI and used to build various C programs like lua and libxml2.	2024-06-06 10:44:53 +01:00
Brendan Dahl	dfd1a2f081	[WebAssembly] Implement all f16x8 unary instructions. (#94063 ) All of these instructions can be generated using regular LL intrinsics. Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-06-04 13:06:16 -04:00
Nikita Popov	deab451e7a	[IR] Remove support for icmp and fcmp constant expressions (#93038 ) Remove support for the icmp and fcmp constant expressions. This is part of: https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179 As usual, many of the updated tests will no longer test what they were originally intended to -- this is hard to preserve when constant expressions get removed, and in many cases just impossible as the existence of a specific kind of constant expression was the cause of the issue in the first place.	2024-06-04 08:31:03 +02:00
Brendan Dahl	8aa8019975	[WebAssembly] Implement all f16x8 relation instructions. (#93751 ) All of these instructions can be generated using regular LL instructions. Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-05-30 09:02:17 -07:00
Brendan Dahl	60bce6eab4	[WebAssembly] Implement all f16x8 binary instructions. (#93360 ) This reuses most of the code that was created for f32x4 and f64x2 binary instructions and tries to follow how they were implemented. add/sub/mul/div - use regular LL instructions min/max - use the minimum/maximum intrinsic, and also have builtins pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-05-28 16:33:20 -07:00
Heejin Ahn	722a5fce58	[WebAssembly] Add -wasm-enable-exnref option (#93597 ) This adds `-wasm-enable-exnref`, which will enable the new EH instructions using `exnref` (adopted in Oct 2023 CG meeting): https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md This option should be used with `-wasm-enable-eh`.	2024-05-28 16:27:04 -07:00
Heejin Ahn	c179d50fd3	[WebAssembly] Add exnref type (#93586 ) This adds (back) the exnref type restored in the new EH proposal adopted in Oct 2023 CG meeting: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md	2024-05-28 16:10:11 -07:00
Heejin Ahn	c108c1e945	[WebAssembly] Rename old EH tests to *-legacy (#93585 ) I think test files for the legacy and the new EH (exnref) are better kept separate, and I'd like to use the current test file names for the new EH, rather than keeping the current files and naming the new ones as `-new` or something.	2024-05-28 13:26:36 -07:00
Heejin Ahn	08de0b3cf5	[WebAssembly] Add tests for EH/SjLj option errors (#93583 ) This adds tests for EH/SjLj option errors and swaps the error checking order for unimportant cosmetic reasons (I think checking EH/SjLj conflicts is more important than the model checking)	2024-05-28 11:36:48 -07:00
Brendan Dahl	4ebe9bba59	[WebAssembly] Implement prototype f16x8.extract_lane instruction. (#93272 ) Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f16x8.extract_lane as opcode 0x124, but this is incorrect and will be changed to 0x121 soon.	2024-05-24 08:31:07 -07:00
Brendan Dahl	09c5525610	[WebAssembly] Implement prototype f16x8.splat instruction. (#93228 ) Adds a builtin and intrinsic for the f16x8.splat instruction. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect and will be changed to 0x120 soon.	2024-05-23 20:05:22 -07:00
Sam Clegg	39d32b238d	[WebAssembly] Use 64-bit table when targeting wasm64 (#92042 ) See https://github.com/WebAssembly/memory64/issues/51	2024-05-23 18:25:58 -07:00
Alex Voicu	10edb4991c	[Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (#88182 ) At the moment, Clang is rather liberal in assuming that 0 (and by extension unqualified) is always a safe default. This does not work for targets that actually use a different value for the default / generic AS (for example, the SPIRV that is obtained from HIPSPV or SYCL). This patch is a first, fairly safe step towards trying to clear things up by querying a module's default AS from the target, rather than assuming it's 0, alongside fixing a few places where things break / we encode the 0 == DefaultAS assumption. A bunch of existing tests are extended to check for non-zero default AS usage.	2024-05-19 14:59:03 +01:00
Brendan Dahl	8a3277acbc	[WebAssembly] Implement prototype f32.store_f16 instruction. (#91545 ) Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 in memory. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon.	2024-05-09 15:38:13 -07:00
Brendan Dahl	1a2a1fbd7c	[WebAssembly] Implement prototype f32.load_f16 instruction. (#90906 ) Adds a builtin and intrinsic for the f32.load_f16 instruction. The instruction loads an f16 value from memory and puts it in an f32. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is incorrect and will be changed to 0xFC30 soon.	2024-05-07 11:33:10 -07:00
Congcong Cai	1a46229636	Revert "Revert "[WebAssembly] remove instruction after builtin trap" (#90354 )" (#90366 ) `llvm.trap` is converted to `unreachable`, which is a terminator. An instruction after a terminator causes validation to fail. This PR introduces a pass to clean up instructions after a terminator. Fixes: https://github.com/llvm/llvm-project/issues/68770 Reapply: #90207	2024-04-28 10:13:02 +08:00
Mehdi Amini	38a2051c52	Revert "[WebAssembly] remove instruction after builtin trap" (#90354 ) Reverts llvm/llvm-project#90207 LLD Bots are broken.	2024-04-27 21:14:46 +02:00
Congcong Cai	ff03f23be8	[WebAssembly] remove instruction after builtin trap (#90207 ) `llvm.trap` is converted to `unreachable`, which is a terminator. An instruction after a terminator causes validation to fail. This PR introduces a pass to clean up instructions after a terminator. Fixes: #68770.	2024-04-27 22:11:47 +08:00
Heejin Ahn	a22ffe54a3	[WebAssembly] Make RefTypeMem2Local recognize target-features (#88916 ) Currently we check `Subtarget->hasReferenceTypes()` to decide whether to run `RefTypeMem2Local` pass: `6133878227/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp (L491-L495)` This works fine when `-mattr=+reference-types` is given in the command line (of `llc` or of `wasm-ld` in case of LTO). This also works fine if the backend is called by Clang, because Clang's feature set will be passed to the backend when creating a `TargetMachine`: `ac791888bb/clang/lib/CodeGen/BackendUtil.cpp (L549-L550)` `ac791888bb/clang/lib/CodeGen/BackendUtil.cpp (L561-L562)` But if the backend compilation is called by `llc`, a `TargetMachine` is created here: `bf1ad1d267/llvm/tools/llc/llc.cpp (L554-L555)` And if the backend is called by `wasm-ld`'s LTO, a `TargetMachine` is created here: `ac791888bb/llvm/lib/LTO/LTOBackend.cpp (L513)` At this point, in both places, the created `TargetMachine` only has access to target features given by the command line with `-mattr=` and doesn't have access to bitcode functions' `target-features` attribute. We later gather the target features used by functions and store that info in the `TargetMachine` in `CoalesceFeaturesAndStripAtomics`, `ac791888bb/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp (L202-L206)` but this runs in the pass pipeline driven by the pass manager, so this has not run by the time we check `Subtarget->hasReferenceTypes()` in `WebAssemblyPassConfig::addISelPrepare`. So currently `RefTypeMem2Local` would not run on those functions with `"target-features"="+reference-types"` attributes if the backend is called by `llc` or `wasm-ld`. So this makes `RefTypeMem2Local` pass run unconditionally, and checks `target-features` function attribute to decide whether to run the pass on each function. 
This allows the pass to run with `wasm-ld` + LTO and `llc`, even if `-mattr=+reference-types` is not explicitly given in the command line again, as long as `+reference-types` is in the function's `target-features` attribute. This also covers the case we give the target features by the command line like `llc -mattr=+reference-types` and not in the bitcode function's attribute, because attributes given in the command line will be stored in the function's attributes anyway: `bd28889732/llvm/lib/CodeGen/CommandFlags.cpp (L673-L674)` `bd28889732/llvm/lib/CodeGen/CommandFlags.cpp (L732-L733)` With this PR, - `lto0.test_externref_emjs` - `thinlto0.test_externref_emjs`, - `lto0.test_externref_emjs_dynlink`, - `thinlto0.test_externref_emjs_dynlnk` pass. These currently fail but don't get checked in the CI. I think they used to pass but started to fail after #83196, because we used to run mem2reg even with `-O0` before that. (`ltoN` (N > 0) tests are not affected because they run mem2reg anyway so they don't need `RefTypeMem2Local`)	2024-04-23 17:57:49 +09:00
Heejin Ahn	c921ac724f	[WebAssembly] Enable multivalue return when multivalue ABI is used (#88492 ) Multivalue feature of WebAssembly has been standardized for several years now. I think it makes sense to be able to enable it in the feature section by default for our clang/llvm-produced binaries so that the multivalue feature can be used as necessary when necessary within our toolchain and also when running other optimizers (e.g. wasm-opt) after the LLVM code generation. But some WebAssembly toolchains, such as Emscripten, do not provide both multivalue-returning and not-multivalue-returning versions of libraries. Also allowing the uses of multivalue in the features section does not necessarily mean we generate them whenever we can to the fullest, which is a different code generation / optimization option. So this makes the lowering of multivalue returns conditional on the use of 'experimental-mv' target ABI. This ABI is turned off by default and turned on by passing `-Xclang -target-abi -Xclang experimental-mv` to `clang`, or `-target-abi experimental-mv` to `clang -cc1` or `llc`. But the purpose of this PR is not tying the multivalue lowering to this specific 'experimental-mv'. 'experimental-mv' is just one multivalue ABI we currently have, and it is still experimental, meaning it is not very well optimized or tuned for performance. (e.g. it does not have the limitation of the max number of multivalue-lowered values, which can be detrimental to performance.) We may change the name of this ABI, or improve it, or add a new multivalue ABI in the future. Also I heard that WASI is planning to add their multivalue ABI soon. So the plan is, whenever any one of multivalue ABIs is enabled, we enable the lowering of multivalue returns in the backend. We currently have only 'experimental-mv' in the repo so we only check for that in this PR. 
Related past discussions: #82714 https://github.com/WebAssembly/tool-conventions/pull/223#issuecomment-2008298652	2024-04-23 17:48:59 +09:00
Matthias Braun	acb7ddc5cf	[WebAssembly] Remove threadlocal.address when disabling TLS (#88209 ) Remove `llvm.threadlocal.address` intrinsic usage when disabling TLS. This fixes errors revealed by the stricter IR verification introduced in PR #87841.	2024-04-10 16:24:02 -07:00
Piotr Sobczak	5b59ae423a	[DAG] Preserve NUW when reassociating (#87621 ) Similarly to the generic case below, preserve the NUW flag when reassociating adds with constants.	2024-04-04 16:47:25 +02:00
Heejin Ahn	6b7ecc7979	Revert "[WebAssembly] Remove threwValue comparison after __wasm_setjmp_test (#86633 )" This reverts commit 52431fdb1ab8d29be078edd55250e06381e4b6b0. The PR assumed `__threwValue` couldn't be 0, but it could be when the thrown thing is not a longjmp but an exception, so that `if` check was actually necessary.	2024-03-28 04:41:29 +00:00
Heejin Ahn	52431fdb1a	[WebAssembly] Remove threwValue comparison after __wasm_setjmp_test (#86633 ) Currently the code thinks a `longjmp` occurred if both `__THREW__` and `__threwValue` are nonzero. But `__threwValue` can be 0, and the `longjmp` library function should change it to 1 in case it is 0: https://en.cppreference.com/w/c/program/longjmp Emscripten libraries were not consistent about that, but after https://github.com/emscripten-core/emscripten/pull/21493 and https://github.com/emscripten-core/emscripten/pull/21502, we correctly pass 1 in case the input is 0. So there will be no case where `__threwValue` is 0. And regardless of what `longjmp` library function does, treating `longjmp`'s 0 input to its second argument as "not longjmping" doesn't seem right. I'm not sure where that `__threwValue` checking came from, but probably I was porting fastcomp's implementation at the time and moved this part just verbatim: `9bdc7bb4fc/lib/Target/JSBackend/CallHandlers.h (L274-L278)` Just for the context, how this was discovered: https://github.com/emscripten-core/emscripten/pull/21502#pullrequestreview-1942160300	2024-03-27 11:11:16 -07:00
YAMAMOTO Takashi	6420f37926	[WebAssembly] Implement an alternative translation for -wasm-enable-sjlj (#84137 ) Instead of maintaining per-function-invocation malloc()'ed tables to track which functions each label belongs to, store the equivalent info in jump buffers (jmp_buf) themselves. Also, use less emscripten-looking ABI symbols: ``` saveSetjmp -> __wasm_setjmp testSetjmp -> __wasm_setjmp_test getTempRet0 -> (removed) __wasm_longjmp -> (no change) ``` While I want to use this for WASI, it should work for emscripten as well. An example runtime and a few tests: https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp wasi-libc version of the runtime: https://github.com/WebAssembly/wasi-libc/pull/483 emscripten version of the runtime: https://github.com/emscripten-core/emscripten/pull/21502 Discussion: https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit	2024-03-25 18:11:56 -07:00
Thomas Lively	767e0c8bce	[WebAssembly] Select BUILD_VECTOR with large unsigned lane values (#85880 ) Previously we expected lane constants to be in the range of signed values for each lane size, but the included test case produced large unsigned values that fall outside that range. Allow instruction selection to proceed in this case rather than failing. Fixes #63817.	2024-03-20 08:42:42 -07:00
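
Note on commit 735852f5ab (simd128 / relaxed-simd in AsmPrinter): the missing-feature check quoted in that message is plain mask arithmetic, so the failure and the fix can be reproduced in a few lines. The following is only a sketch under stated assumptions: the bit constants and the `availableOld` / `availableNew` helpers are hypothetical stand-ins for the TableGen-generated `computeAvailableFeatures`, and plain integers are used in place of `FeatureBitset`.

```cpp
#include <cstdint>

// Hypothetical subtarget feature bits (what -mattr= sets).
constexpr std::uint32_t kFeatureSIMD128 = 1u << 0;
constexpr std::uint32_t kFeatureRelaxedSIMD = 1u << 1;

// Hypothetical AssemblerPredicate bits (what instructions require).
constexpr std::uint32_t kHasSIMD128 = 1u << 0;
constexpr std::uint32_t kHasRelaxedSIMD = 1u << 1;

// Same arithmetic as in the quoted verifyInstructionPredicates:
// the required bits that are not available.
constexpr std::uint32_t missingFeatures(std::uint32_t available,
                                        std::uint32_t required) {
  return (available & required) ^ required;
}

// Before the fix: each predicate bit mirrors exactly one feature bit.
constexpr std::uint32_t availableOld(std::uint32_t fb) {
  return ((fb & kFeatureSIMD128) ? kHasSIMD128 : 0u) |
         ((fb & kFeatureRelaxedSIMD) ? kHasRelaxedSIMD : 0u);
}

// After the fix: HasSIMD128 is an 'or' of simd128 and relaxed-simd.
constexpr std::uint32_t availableNew(std::uint32_t fb) {
  return ((fb & (kFeatureSIMD128 | kFeatureRelaxedSIMD)) ? kHasSIMD128 : 0u) |
         ((fb & kFeatureRelaxedSIMD) ? kHasRelaxedSIMD : 0u);
}

// With only relaxed-simd enabled, a simd128 instruction used to be rejected...
static_assert(missingFeatures(availableOld(kFeatureRelaxedSIMD), kHasSIMD128) ==
                  kHasSIMD128,
              "old mapping reports simd128 as missing");
// ...and is accepted once the predicate is any_of(simd128, relaxed-simd).
static_assert(missingFeatures(availableNew(kFeatureRelaxedSIMD), kHasSIMD128) == 0u,
              "new mapping reports nothing missing");
```

The point of the sketch is that the check only sees bits, never `Subtarget->hasSIMD128()`, so the implication "relaxed-simd implies simd128" has to be encoded in how the predicate bit is computed.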


1162 Commits
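
Note on commit e094abde42 ([US]CMP expansion): the identity `[us]cmp(x, y) = (x [us]> y) - (x [us]< y)` from that message can be checked directly in C++, where a comparison also yields 0 or 1. This is a minimal sketch of the arithmetic only, not of the SelectionDAG code; the function names are made up for illustration.

```cpp
#include <cstdint>

// Select-style expansion: two compares feeding two selects.
constexpr int scmpSelect(std::int32_t x, std::int32_t y) {
  return x < y ? -1 : (x > y ? 1 : 0);
}

// Arithmetic expansion: each comparison is 0 or 1, so the
// difference is -1, 0 or 1 without any select.
constexpr int scmpArith(std::int32_t x, std::int32_t y) {
  return static_cast<int>(x > y) - static_cast<int>(x < y);
}

// Unsigned variant: same formula, unsigned comparisons.
constexpr int ucmpArith(std::uint32_t x, std::uint32_t y) {
  return static_cast<int>(x > y) - static_cast<int>(x < y);
}

static_assert(scmpArith(-1, 1) == -1, "signed: -1 is less than 1");
static_assert(ucmpArith(0xFFFFFFFFu, 1u) == 1, "unsigned: same bits compare greater");
static_assert(scmpArith(7, 7) == 0, "equal inputs give 0");
static_assert(scmpSelect(-5, 3) == scmpArith(-5, 3), "both expansions agree");
```

Which form produces better machine code is target dependent, which is why the commit adds a TargetLowering hook rather than switching every target.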