llvm-project

Author	SHA1	Message	Date
yingopq	754ed95b66	[Mips] Fix compiler crash when returning fp128 after calling a function returning { i8, i128 } (#117525 ) Fixes https://github.com/llvm/llvm-project/issues/96432.	2025-01-20 16:47:40 +08:00
Sergei Barannikov	9ae92d7056	[SelectionDAG] Virtualize isTargetStrictFPOpcode / isTargetMemoryOpcode (#119969 ) With this change, targets are no longer required to put memory / strict-fp opcodes after special `ISD::FIRST_TARGET_MEMORY_OPCODE`/`ISD::FIRST_TARGET_STRICTFP_OPCODE` markers. This will also allow autogenerating `isTargetMemoryOpcode`/`isTargetStrictFPOpcode` (#119709). Pull Request: https://github.com/llvm/llvm-project/pull/119969	2024-12-21 05:29:51 +03:00
David Sherwood	8630a7ba7c	Reapply "[DAGCombiner] Add support for scalarising extracts of a vector setcc (#117566 )" (#118823 ) [Reverts d57892a2a153ab71a796f07e39d939eae6910c21] For IR like this: %icmp = icmp ult <4 x i32> %a, splat (i32 5) %res = extractelement <4 x i1> %icmp, i32 1 where there is only one use of %icmp we can take a similar approach to what we already do for binary ops such add, sub, etc. and convert this into %ext = extractelement <4 x i32> %a, i32 1 %res = icmp ult i32 %ext, 5 For AArch64 targets at least the scalar boolean result will almost certainly need to be in a GPR anyway, since it will probably be used by branches for control flow. I've tried to reuse existing code in scalarizeExtractedBinop to also work for setcc. NOTE: The optimisations don't apply for tests such as extract_icmp_v4i32_splat_rhs in the file CodeGen/AArch64/extract-vector-cmp.ll because scalarizeExtractedBinOp only works if one of the input operands is a constant. --------- Co-authored-by: Paul Walker <paul.walker@arm.com>	2024-12-09 10:56:44 +00:00
Vitaly Buka	d57892a2a1	Revert "[DAGCombiner] Add support for scalarising extracts of a vector setcc" (#118693 ) Reverts llvm/llvm-project#117566 Breaks libc++ tests with HWASAN https://lab.llvm.org/buildbot/#/builders/55/builds/3959	2024-12-04 12:36:46 -08:00
David Sherwood	4675db5f39	[DAGCombiner] Add support for scalarising extracts of a vector setcc (#117566 ) For IR like this: %icmp = icmp ult <4 x i32> %a, splat (i32 5) %res = extractelement <4 x i1> %icmp, i32 1 where there is only one use of %icmp we can take a similar approach to what we already do for binary ops such add, sub, etc. and convert this into %ext = extractelement <4 x i32> %a, i32 1 %res = icmp ult i32 %ext, 5 For AArch64 targets at least the scalar boolean result will almost certainly need to be in a GPR anyway, since it will probably be used by branches for control flow. I've tried to reuse existing code in scalarizeExtractedBinop to also work for setcc. NOTE: The optimisations don't apply for tests such as extract_icmp_v4i32_splat_rhs in the file CodeGen/AArch64/extract-vector-cmp.ll because scalarizeExtractedBinOp only works if one of the input operands is a constant.	2024-12-04 10:26:51 +00:00
Dan Gohman	c3536b263f	[WebAssembly] Define call-indirect-overlong and bulk-memory-opt features (#117087 ) This defines some new target features. These are subsets of existing features that reflect implementation concerns: - "call-indirect-overlong" - implied by "reference-types"; just the overlong encoding for the `call_indirect` immediate, and not the actual reference types. - "bulk-memory-opt" - implied by "bulk-memory": just `memory.copy` and `memory.fill`, and not the other instructions in the bulk-memory proposal. This is split out from https://github.com/llvm/llvm-project/pull/112035. --------- Co-authored-by: Heejin Ahn <aheejin@gmail.com>	2024-12-02 17:08:07 -08:00
Sam Clegg	ea58410d0f	[WebAssembly] Implement %llvm.thread.pointer intrinsic (#117817 ) We can simply use the `__tls_base` global for this which is guaranteed to be non-zero and unique per thread. Fixes: #117433	2024-11-26 17:19:14 -08:00
David Sherwood	9b76e7fc60	Revert "[DAGCombiner] Add support for scalarising extracts of a vector setcc (#116031 )" (#117556 ) This reverts commit 22ec44f509ff266b581dbb490d7b040473b7c31a.	2024-11-25 13:49:21 +00:00
David Sherwood	22ec44f509	[DAGCombiner] Add support for scalarising extracts of a vector setcc (#116031 ) For IR like this: %icmp = icmp ult <4 x i32> %a, splat (i32 5) %res = extractelement <4 x i1> %icmp, i32 1 where there is only one use of %icmp we can take a similar approach to what we already do for binary ops such add, sub, etc. and convert this into %ext = extractelement <4 x i32> %a, i32 1 %res = icmp ult i32 %ext, 5 For AArch64 targets at least the scalar boolean result will almost certainly need to be in a GPR anyway, since it will probably be used by branches for control flow. I've tried to reuse existing code in scalarizeExtractedBinop to also work for setcc. NOTE: The optimisations don't apply for tests such as extract_icmp_v4i32_splat_rhs in the file CodeGen/AArch64/extract-vector-cmp.ll because scalarizeExtractedBinOp only works if one of the input operands is a constant.	2024-11-25 09:25:01 +00:00
Kazu Hirata	43570a2841	[WebAssembly] Remove unused includes (NFC) (#116318 ) Identified with misc-include-cleaner.	2024-11-15 07:26:37 -08:00
Dan Gohman	118445841d	[WebAssembly] Protect memory.fill and memory.copy from zero-length ranges. (#112617 ) WebAssembly's `memory.fill` and `memory.copy` instructions trap if the pointers are out of bounds, even if the length is zero. This is different from LLVM, which expects that it can call `memcpy` on arbitrary invalid pointers if the length is zero. To avoid spurious traps, branch around `memory.fill` and `memory.copy` when the length is zero. --------- Co-authored-by: Heejin Ahn <aheejin@gmail.com>	2024-10-24 14:13:58 -07:00
Jordan Rupprecht	33363521ca	[NFC][WebAssembly] Inline var only used in assertion (#113507 )	2024-10-23 18:51:25 -05:00
Alex Crichton	c2293b33dd	[WebAssembly] Implement the wide-arithmetic proposal (#111598 ) This commit implements the [wide-arithmetic] proposal which has recently reached phase 2 in the WebAssembly proposals process. The goal here is to implement support in LLVM for emitting these instructions which are gated behind a new feature flag by default. A new `wide-arithmetic` feature flag is introduced which gates these four new instructions from being emitted. Emission of each instruction itself is relatively simple given LLVM's preexisting lowering rules and infrastructure. The main gotcha is that due to the multi-result nature of all of these instructions it needed the lowerings to be implemented in C++ rather than in TableGen. [wide-arithmetic]: https://github.com/WebAssembly/wide-arithmetic	2024-10-23 11:39:58 -07:00
Jeffrey Byrnes	853c43d04a	[TTI] NFC: Port TLI.shouldSinkOperands to TTI (#110564 ) Porting to TTI provides direct access to the instruction cost model, which can enable instruction cost based sinking without introducing code duplication.	2024-10-09 14:30:09 -07:00
Simon Pilgrim	f8f0a266e0	[clang][wasm] Replace the target integer sub saturate intrinsics with the equivalent generic `__builtin_elementwise_sub_sat` intrinsics (#109405 ) Remove the Intrinsic::wasm_sub_sat_signed/wasm_sub_sat_unsigned entries and just use sub_sat_s/sub_sat_u directly	2024-09-22 10:12:41 +01:00
Brendan Dahl	c076638c70	[WebAssembly] Support BUILD_VECTOR with F16x8. (#108117 ) Convert BUILD_VECTORS with FP16x8 to I16x8 since there's no FP16 scalar value to initialize v128.const.	2024-09-11 10:00:10 -07:00
Brendan Dahl	415288a2a7	[WebAssembly] Add load and store patterns for V8F16. (#108119 )	2024-09-11 09:53:53 -07:00
Brendan Dahl	5703d8572f	[WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (#106465 ) Getting this to work required a few additional changes: - Add builtins for any instructions that can't be done with plain C currently. - Add support for the saturating version of fp_to_<s,i>_I16x8. Other vector sizes supported this already. - Support bitcast of f16x8 to v128. Needed to return a __f16x8 as v128_t.	2024-08-30 08:42:37 -07:00
Sergei Barannikov	4d7a0abae8	[DataLayout] Change return type of `getStackAlignment` to `MaybeAlign` (#105478 ) Currently, `getStackAlignment` asserts if the stack alignment wasn't specified. This makes it inconvenient to use and complicates testing. This change also makes `exceedsNaturalStackAlignment` method redundant.	2024-08-27 22:59:33 +03:00
Brendan Dahl	7d373cef49	[WebAssembly] Change half-precision feature name to fp16. (#105434 ) This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.	2024-08-22 09:44:33 -07:00
Sam Parker	76c4529515	[WebAssembly] Fix assertion in LowerBUILD_VECTOR (#101961 ) The assertion was failing in the case where we were trying to lower to loadxx_zero, but lane zero was undef.	2024-08-05 14:38:12 -07:00
Sam Parker	08decd20a9	[WebAssembly] load_zero to initialise build_vector (#100610 ) Instead of splatting a single lane, to initialise a build_vector, lower to scalar_to_vector which can be selected to load_zero. Also add load_zero and load_lane patterns for f32x4 and f64x2.	2024-08-02 10:11:21 +01:00
Amara Emerson	f270a4dd66	[AArch64] Don't tail call memset if it would convert to a bzero. (#98969 ) Well, not quite that simple. We can tc memset since it returns the first argument but bzero doesn't do that and therefore we can end up miscompiling. This patch also refactors the logic out of isInTailCallPosition() into the callers. As a result memcpy and memmove are also modified to do the same thing for consistency. rdar://131419786	2024-07-17 01:31:52 -07:00
Roger Ferrer Ibáñez	05e6bb40eb	[SelectionDAG] Add an ISD::CLEAR_CACHE node to lower llvm.clear_cache (#93795 ) The current way of lowering `llvm.clear_cache` is a bit unusual. As suggested by Matt Arsenault we are better off using an ISD node. This change introduces a new `ISD::CLEAR_CACHE`, registers a new libcall by default named `__clear_cache` and the default legalisation is a libcall. This is preparatory work for a custom lowering of `ISD::CLEAR_CACHE` needed by RISC-V on some platforms.	2024-05-30 14:55:32 +02:00
Brendan Dahl	60bce6eab4	[WebAssembly] Implement all f16x8 binary instructions. (#93360 ) This reuses most of the code that was created for f32x4 and f64x2 binary instructions and tries to follow how they were implemented. add/sub/mul/div - use regular LL instructions min/max - use the minimum/maximum intrinsic, and also have builtins pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-05-28 16:33:20 -07:00
Heejin Ahn	c179d50fd3	[WebAssembly] Add exnref type (#93586 ) This adds (back) the exnref type restored in the new EH proposal adopted in Oct 2023 CG meeting: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md:x	2024-05-28 16:10:11 -07:00
Brendan Dahl	09c5525610	[WebAssembly] Implement prototype f16x8.splat instruction. (#93228 ) Adds a builtin and intrinsic for the f16x8.splat instruction. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect and will be changed to 0x120 soon.	2024-05-23 20:05:22 -07:00
Sam Clegg	39d32b238d	[WebAssembly] Use 64-bit table when targeting wasm64 (#92042 ) See https://github.com/WebAssembly/memory64/issues/51	2024-05-23 18:25:58 -07:00
Brendan Dahl	8a3277acbc	[WebAssembly] Implement prototype f32.store_f16 instruction. (#91545 ) Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 memory. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon.	2024-05-09 15:38:13 -07:00
Brendan Dahl	1a2a1fbd7c	[WebAssembly] Implement prototype f32.load_f16 instruction. (#90906 ) Adds a builtin and intrinsic for the f32.load_f16 instruction. The instruction loads an f16 value from memory and puts it in an f32. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is incorrect and will be changed to 0xFC30 soon.	2024-05-07 11:33:10 -07:00
Heejin Ahn	c921ac724f	[WebAssembly] Enable multivalue return when multivalue ABI is used (#88492 ) Multivalue feature of WebAssembly has been standardized for several years now. I think it makes sense to be able to enable it in the feature section by default for our clang/llvm-produced binaries so that the multivalue feature can be used as necessary when necessary within our toolchain and also when running other optimizers (e.g. wasm-opt) after the LLVM code generation. But some WebAssembly toolchains, such as Emscripten, do not provide both multivalue-returning and not-multivalue-returning versions of libraries. Also allowing the uses of multivalue in the features section does not necessarily mean we generate them whenever we can to the fullest, which is a different code generation / optimization option. So this makes the lowering of multivalue returns conditional on the use of 'experimental-mv' target ABI. This ABI is turned off by default and turned on by passing `-Xclang -target-abi -Xclang experimental-mv` to `clang`, or `-target-abi experimental-mv` to `clang -cc1` or `llc`. But the purpose of this PR is not tying the multivalue lowering to this specific 'experimental-mv'. 'experimental-mv' is just one multivalue ABI we currently have, and it is still experimental, meaning it is not very well optimized or tuned for performance. (e.g. it does not have the limitation of the max number of multivalue-lowered values, which can be detrimental to performance.) We may change the name of this ABI, or improve it, or add a new multivalue ABI in the future. Also I heard that WASI is planning to add their multivalue ABI soon. So the plan is, whenever any one of multivalue ABIs is enabled, we enable the lowering of multivalue returns in the backend. We currently have only 'experimental-mv' in the repo so we only check for that in this PR. Related past discussions: #82714 https://github.com/WebAssembly/tool-conventions/pull/223#issuecomment-2008298652	2024-04-23 17:48:59 +09:00
Arthur Eubanks	94c988bcfd	[NFC] Remove unused parameter from shouldAssumeDSOLocal()	2024-03-11 19:48:17 +00:00
Heejin Ahn	8506a63bf7	Revert "[WebAssembly] Disable multivalue emission temporarily (#82714 )" This reverts commit 6e6bf9f81756ba6655b4eea8dc45469a47f89b39. It turned out the multivalue feature had active outside users and it could cause some disruptions to them, so I'd like to investigate more about the workarounds before doing this.	2024-02-28 01:02:39 +00:00
Heejin Ahn	6e6bf9f817	[WebAssembly] Disable multivalue emission temporarily (#82714 ) We plan to enable multivalue in the features section soon (#80923) for other reasons, such as the feature having been standardized for many years and other features being developed (e.g. EH) depending on it. This is separate from enabling Clang experimental multivalue ABI (`-Xclang -target-abi -Xclang experimental-mv`), but it turned out we generate some multivalue code in the backend as well if it is enabled in the features section. Given that our backend multivalue generation still has not been much used nor tested, and enabling the feature in the features section can be a separate decision from how much multivalue (including none) we decide to generate for now, I'd like to temporarily disable the actual generation of multivalue in our backend. To do that, this adds an internal flag `-wasm-emit-multivalue` that defaults to false. All our existing multivalue tests can use this to test multivalue code. This flag can be removed later when we are confident the multivalue generation is well tested.	2024-02-22 19:17:15 -08:00
Alex Bradbury	197214e39b	[RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710 ) This follows on from #76708, allowing `cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just `N->getAsZExtVal();` Introduced via `git grep -l "cast<ConstantSDNode>\(.\).getZExtValue" | xargs sed -E -i 's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and then using `git clang-format` on the result.	2024-01-09 12:25:17 +00:00
Benjamin Kramer	858d6a15a0	[wasm] Don't crash on non-simple value types during shuffle combine These still exist during the DAGCombine phase.	2023-10-24 12:35:43 +02:00
Björn Pettersson	4acb96c99f	[SelectionDAG] Tidy up around endianness and isConstantSplat (#68212 ) The BuildVectorSDNode::isConstantSplat function could depend on endianness, and it takes a bool argument that can be used to indicate if big or little endian should be considered when internally casting from a vector to a scalar. However, that argument is default set to false (= little endian). And in many situations, even in target generic code such as DAGCombiner, the endianness isn't specified when using the function. The intent with this patch is to highlight that endianness doesn't matter, depending on the context in which the function is used. In DAGCombiner the code is slightly refactored. Back in the days when the code was written it wasn't possible to request a MinSplatBits size when calling isConstantSplat. Instead the code re-expanded the found SplatValue to match with the EltBitWidth. Now we can just provide EltBitWidth as MinSplatBits and remove the logic for doing the re-expand. While being at it, tidying up around isConstantSplat, this patch also adds an explicit check in BuildVectorSDNode::isConstantSplat to break out from the loop if trying to split an on VecWidth into two halves. Haven't been able to prove that there could be miscompiles involved if not doing so. There are lit tests that trigger that scenario, although I think they happen to later discard the returned SplatValue for other reasons.	2023-10-16 14:53:53 +02:00
Paulo Matos	a29e8ef1c3	[WebAssembly] Add path to PIC mode for wasm tables (#67545 ) Currently tables cannot be shared between compilation units, therefore no special treatment is needed for tables. Fixes #65191	2023-10-03 08:00:21 +02:00
Yolanda Chen	291101aa8e	[WebAssembly] Optimize vector shift using a splat value from outside block The vector shift operation in WebAssembly uses an i32 shift amount type, while the LLVM IR requires binary operator uses the same type of operands. When the shift amount operand is splated from a different block, the splat source will not be exported and the vector shift will be unrolled to scalar shifts. This patch enables the vector shift to identify the splat source value from the other block, and generate expected WebAssembly bytecode when lowering. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D158399	2023-08-25 08:13:27 -07:00
Reid Kleckner	984dc4b9cd	[WebAssembly] Create separation between MC and CodeGen layers Move WebAssemblyUtilities from Utils to the CodeGen library. It primarily deals in MIR layer types, so it really lives in the CodeGen library. Move a variety of other things around to try create better separation. See issue #64166 for more info on layering. Move llvm/include/CodeGen/WasmAddressSpaces.h back to llvm/lib/Target/WebAssembly/Utils. Differential Revision: https://reviews.llvm.org/D156472	2023-08-18 14:08:37 -07:00
Thomas Lively	4f065fcb57	[WebAssembly] Fix incorrect assertion in SIMD reduction codegen The codegen routine introduced in 18077e9fd688 did not account for vectors with more than 16 lanes. Remove the incorrect assertion and bail out of the optimization when encountering this case. Add test cases that previously triggered the assertion. Unfortunately, these test cases now have terrible codegen, but that is at least better than crashing. Fixes #63500. Differential Revision: https://reviews.llvm.org/D154124	2023-06-30 11:30:18 -07:00
xortoast	bb648c9177	[WebAssembly] Add lowering for llvm.rint and llvm.roundeven WebAssembly doesn't expose inexact exceptions, so frint can be mapped to fnearbyint. Likewise, WebAssembly always rounds ties-to-even, so froundeven can be mapped to fnearbyint. Differential Revision: https://reviews.llvm.org/D153451	2023-06-23 14:07:11 -07:00
Paulo Matos	55aeb23fe0	[clang][WebAssembly] Implement support for table types and builtins This commit implements support for WebAssembly table types and respective builtins. Tables are WebAssembly objects to store reference types. They have a large amount of semantic restrictions including, but not limited to, only being allowed to be declared at the top-level as static arrays of zero-length. Not being arguments or result of functions, not being stored to memory, etc. This commit introduces the __attribute__((wasm_table)) to attach to arrays of WebAssembly reference types. And the following builtins to manage tables: * ref __builtin_wasm_table_get(table, idx) * void __builtin_wasm_table_set(table, idx, ref) * uint __builtin_wasm_table_size(table) * uint __builtin_wasm_table_grow(table, ref, uint) * void __builtin_wasm_table_fill(table, idx, ref, uint) * void __builtin_wasm_table_copy(table, table, uint, uint, uint) This commit also enables reference-types feature at bleeding-edge. This is joint work with Alex Bradbury (@asb). Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D139010	2023-06-10 15:53:13 +02:00
Caleb Zulawski	18077e9fd6	[WebAssembly] Re-land 8392bf6000ad Correctly handle single-element vectors to fix an assertion failure. Add tests that were missing from the original commit. Differential Revision: D151782	2023-06-09 08:42:27 -07:00
Thomas Lively	100c756d96	Revert "Improve WebAssembly vector bitmask, mask reduction, and extending" This reverts commit 8392bf6000ad039bd0e55383d40a05ddf7b4af13. The commit missed some edge cases that led to crashes. Reverting to resolve downstream breakage while a fix is pending.	2023-06-08 14:36:29 -07:00
Caleb Zulawski	8392bf6000	Improve WebAssembly vector bitmask, mask reduction, and extending This is inspired by a recently filed Rust issue noting poor codegen for vector masks (https://github.com/rust-lang/portable-simd/issues/351). Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D151782	2023-06-07 10:20:22 -07:00
Thomas Lively	72a72315b0	[WebAssembly] Mark @llvm.wasm.shuffle lane indices as immediates This intrinsic is meant to lower directly to the i8x16.shuffle instruction, which takes its lane index arguments as immediates. The ISel for the intrinsic assumed that the lane index arguments were constants, so bitcode that "incorrectly" used this intrinsic with non-immediate arguments caused an assertion failure in the backend. Avoid the crash by defining the lane index arguments to be immediates, matching the underlying instruction. Update ISel accordingly. This change means that the bitcode that previously caused a crash will now fail to validate. Fixes #55559. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D149898	2023-05-05 08:12:41 -07:00
Peter Rong	3b2476910b	[WASM] Prevent casting `undef` to `ConstantSDNode` WebAssembly tries to cast an `undef` to `ConstantSDNode` during `LowerAccessVectorElement`. These operations will trigger an assertion error in cast. To avoid this issue, we prevent casting, and abort the lowering operation. A unit test is also included. This patch fixes [pr61828](https://github.com/llvm/llvm-project/issues/61828) Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D147198	2023-03-30 20:14:11 -07:00
Peter Rong	51a93828d7	[WASM] Fix legalizer for LowerBUILD_VECTOR. Constants in BUILD_VECTOR may be down cast into a smaller value that fits LaneBits, i.e., the bit width of elements in the vector. This cast didn't consider 2^N where it would be cast into -2^N, which still doesn't fit into LaneBits after casting. This will cause an assertion in later legalization. 2^N should be cast into 0, and this patch reflects such behavior. This patch also includes a test to reflect the fix. This patch fixes [issue 61780](https://github.com/llvm/llvm-project/issues/61780) Related patch: https://reviews.llvm.org/D108669 Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D147208	2023-03-30 19:20:04 -07:00
Peter Rong	163d7bb941	[WASM] Precommit WebAssemblyISelLowering.cpp format changes for D147198 Signed-off-by: Peter Rong <PeterRong96@gmail.com>	2023-03-29 22:18:53 -07:00
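
Note on the DAGCombiner entries above (#116031, #117566, #118823): the rewrite they describe rests on a lane-wise equivalence, namely that extracting one lane of a vector compare gives the same result as comparing the extracted lane. The following is a minimal sketch of that equivalence in plain C++, not the DAGCombiner code itself. The names `Vec4`, `extractOfVectorCompare`, `compareOfExtractedLane` and `formsAgree` are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a <4 x i32> value.
using Vec4 = std::array<std::uint32_t, 4>;

// Vector form: compare every lane against the splat (icmp ult), then
// extract a single boolean lane from the result.
bool extractOfVectorCompare(const Vec4 &A, std::uint32_t Splat,
                            std::size_t Lane) {
  assert(Lane < A.size() && "lane index out of range");
  std::array<bool, 4> Cmp{};
  for (std::size_t I = 0; I < A.size(); ++I)
    Cmp[I] = A[I] < Splat; // unsigned less-than, one result per lane
  return Cmp[Lane];
}

// Scalarised form: extract the lane first, then do one scalar compare.
bool compareOfExtractedLane(const Vec4 &A, std::uint32_t Splat,
                            std::size_t Lane) {
  assert(Lane < A.size() && "lane index out of range");
  return A[Lane] < Splat;
}

// Checks that both forms agree on every lane of the given input.
bool formsAgree(const Vec4 &A, std::uint32_t Splat) {
  for (std::size_t Lane = 0; Lane < A.size(); ++Lane)
    if (extractOfVectorCompare(A, Splat, Lane) !=
        compareOfExtractedLane(A, Splat, Lane))
      return false;
  return true;
}
```

The scalarised form does one compare instead of four, which is the saving the commit message is after when the compare has a single use. The sketch says nothing about when the real combine fires; per the commit message, that still depends on one operand being a constant.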

398 Commits