llvm-project

Author	SHA1	Message	Date
Pengcheng Wang	18f0f70934	[RISCV] Support llvm.masked.expandload intrinsic (#101954 ) We can use `viota`+`vrgather` to synthesize `vdecompress` and lower expanding load to `vcpop`+`load`+`vdecompress`. And if `%mask` is all ones, we can lower expanding load to a normal unmasked load. Fixes #101914.	2024-10-31 20:03:58 +08:00
Luke Lau	6da5968f5e	[RISCV] Lower scalar_to_vector for supported FP types (#114340 ) In https://reviews.llvm.org/D147608 we added custom lowering for integers, but inadvertently also marked it as custom for scalable FP vectors despite not handling it. This adds handling for floats and marks it as custom lowered for fixed-length FP vectors too. Note that this doesn't handle bf16 or f16 vectors that would need promotion, but these scalar_to_vector nodes seem to be emitted when expanding them.	2024-10-31 13:15:17 +08:00
Craig Topper	55dbacbf07	[RISCV] Remove RISCVISD::VFCVT_X(U)_F_VL by using VFCVT_RM_X(U)_F_VL with DYN rounding mode. NFC (#114306 )	2024-10-30 19:16:23 -07:00
Yingwei Zheng	cf9d1c1486	[SDAG] Simplify `SDNodeFlags` with bitwise logic (#114061 ) This patch allows using enumeration values directly and simplifies the implementation with bitwise logic. It addresses the comment in https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.	2024-10-31 08:10:07 +08:00
Luke Lau	96f5c68350	[RISCV] Lower @llvm.experimental.vector.compress for zvfhmin/zvfbfmin (#113770 ) This is a follow-up to #113291 and handles f16/bf16 with zvfhmin and zvfbfmin.	2024-10-28 09:37:06 +00:00
Pengcheng Wang	b799cc3418	[RISCV] Add lowering for @llvm.experimental.vector.compress (#113291 ) This intrinsic was introduced by #92289 and currently we just expand it for RISC-V. This patch adds custom lowering for this intrinsic and simply maps it to `vcompress` instruction. Fixes #113242.	2024-10-23 14:22:32 +08:00
Sam Elliott	9b9c2a082c	[RISCV][NFC] Move RISCVISD::TAIL beside RISCVISD::CALL	2024-10-22 11:12:58 -07:00
Craig Topper	1bc1a79a65	[RISCV] Support inline assembly 'f' constraint for Zfinx. (#112986 ) This would allow some inline assembly code to work with either F or Zfinx. This appears to match gcc behavior.	2024-10-18 18:17:23 -07:00
Sam Elliott	03dcd88c78	[RISCV][ISel] Ensure 'in X' Constraints prevent X0 (#112563 ) I'm not sure if this fix is required, but I've written the patch anyway. This does not cause test changes, but we haven't got tests that try to use all 32 registers in inline assembly. Broadly, for GPRs, we made the explicit choice that `r` constraints would never attempt to use `x0`, because `x0` isn't really usable like the other GPRs. I believe the same thing applies to `Zhinx`, `Zfinx` and `Zdinx` because they should not be allocating operands to `x0` either, so this patch introduces new `NoX0` classes for `GPRF16` and `GPRF32` registers, and uses them with inline assembly. There is also a `GPRPairNoX0` for the `Zdinx` case on rv32, avoiding use of the `x0` pair which has different behaviour to the other GPR pairs.	2024-10-18 22:33:35 +01:00
Sam Elliott	228f88fdc8	[RISCV] Inline Assembly: RVC constraint and N modifier (#112561 ) This change implements support for the `cr` and `cf` register constraints (which allocate a RVC GPR or RVC FPR respectively), and the `N` modifier (which prints the raw encoding of a register rather than the name). The intention behind these additions is to make it easier to use inline assembly when assembling raw instructions that are not supported by the compiler, for instance when experimenting with new instructions or when supporting proprietary extensions outside the toolchain. These implement part of my proposal in riscv-non-isa/riscv-c-api-doc#92. As part of the implementation, I felt there was not enough coverage of inline assembly and the "in X" floating-point extensions, so I have added more regression tests around these configurations.	2024-10-18 10:40:38 +01:00
Roger Ferrer Ibáñez	9d469b5988	[RISCV] Implement trampolines for rv64 (#96309 ) This implementation is based on what the X86 target does but emits the instructions that GCC emits for rv64. --------- Co-authored-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>	2024-10-18 08:06:47 +02:00
Jay Foad	85c17e4092	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706 ) Convert many instances of: Fn = Intrinsic::getOrInsertDeclaration(...); CreateCall(Fn, ...) to the equivalent CreateIntrinsic call.	2024-10-17 16:20:43 +01:00
Nikita Popov	255a99c29f	[APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309 ) This fixes all the places that hit the new assertion added in https://github.com/llvm/llvm-project/pull/106524 in tests. That is, cases where the value passed to the APInt constructor is not an N-bit signed/unsigned integer, where N is the bit width and signedness is determined by the isSigned flag. The fixes either set the correct value for isSigned, set the implicitTrunc flag, or perform more calculations inside APInt. Note that the assertion is currently still disabled by default, so this patch is mostly NFC.	2024-10-17 08:48:08 +02:00
Luke Lau	2b6b7f664d	[RISCV] Mark math functions as expanded for zvfhmin/zvfbfmin (#112508 ) For regular floating point types we mark these as expanded on scalable vectors so they're not legal in the cost model, so this does the same for f16 w/ zvfhmin and bf16.	2024-10-16 21:40:37 +01:00
Luke Lau	e88bcc1204	[RISCV] Lower vector_splice on zvfhmin/zvfbfmin (#112579 ) Similar to other permutation ops, we can just reuse the existing lowering.	2024-10-16 21:40:18 +01:00
Luke Lau	f6c23222a4	[RISCV] Promote fixed-length bf16 arith vector ops with zvfbfmin (#112393 ) The aim is to have the same set of promotions on fixed-length bf16 vectors as on fixed-length f16 vectors, and then deduplicate them similarly to what was done for scalable vectors. It looks like fneg/fabs/fcopysign end up getting expanded because fsub is now legal, and the default operation action must be expand.	2024-10-15 22:49:05 +01:00
YunQiang Su	c01ddbe916	RISC-V: Select FCANONICALIZE (#112083 ) We can use `FMIN.x OP,OP` to canonicalize a float.	2024-10-14 14:12:36 +08:00
Jim Lin	dba54fb074	[RISCV] Add support for inline asm constraint vd (#111653 ) It constrains vector registers excluding v0. Refer to https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html RISC-V part. This patch also adds a testcase for constraints vr, vd and vm.	2024-10-14 10:47:59 +08:00
Craig Topper	902520256b	[RISCV] Make (sext_inreg X, i1) legal for XTHeadBb to cover the existing isel pattern. I just happened to notice the untested isel pattern.	2024-10-11 16:16:07 -07:00
Daniel Mokeev	26b832a9ec	[RISCV] Add DAG combine to turn (sub (shl X, 8-Y), (shr X, Y)) into orc.b (#111828 ) This patch generalizes the DAG combine for `(sub (shl X, 8), X) => (orc.b X)` into the more general form of `(sub (shl X, 8 - Y), (srl X, Y)) => (orc.b X)`. Alive2 generalized proof: https://alive2.llvm.org/ce/z/dFcf_n Related issue: https://github.com/llvm/llvm-project/issues/96595 Related PR: https://github.com/llvm/llvm-project/pull/96680	2024-10-11 20:41:47 +08:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Luke Lau	a3cd269fbe	[RISCV] Remove {s,u}int_to_fp custom op action for f16/bf16 (#111471 ) It turns out that {s,u}int_to_fp nodes get their operation action from their operand's type, not the result type, so we don't need to set it for fp16 or bf16. vp_{s,u}int_to_fp uses the result type though so we need to keep it. This also means that we can lower int_to_fp for fixed length bf16 vectors already, so this adds tests for that. The cost model test changes are due to BasicTTIImpl's getCastInstrCost not taking into account that int_to_fp needs its legal type swapped. This can be fixed in a later patch, but it's worth noting that the affected types in the tests currently crash when lowered anyway (due to them needing split at LMUL > 8).	2024-10-10 14:40:24 +01:00
Jeffrey Byrnes	853c43d04a	[TTI] NFC: Port TLI.shouldSinkOperands to TTI (#110564 ) Porting to TTI provides direct access to the instruction cost model, which can enable instruction cost based sinking without introducing code duplication.	2024-10-09 14:30:09 -07:00
Yingwei Zheng	9cf8c094c7	[RISCV][DAGCombine] Combine `sext_inreg (shl X, Y), i32` into `sllw X, Y` (#111101 ) Alive2: https://alive2.llvm.org/ce/z/ncf36D	2024-10-04 16:03:09 +08:00
Luke Lau	487686b82e	[SDAG][RISCV] Don't promote VP_REDUCE_{FADD,FMUL} (#111000 ) In https://reviews.llvm.org/D153848, promotion was added for a variety of f16 ops with zvfhmin, including VP reductions. However I don't believe it's correct to promote f16 fadd or fmul reductions to f32 since we need to round the intermediate results. Today if we lower @llvm.vp.reduce.fadd.nxv1f16 on RISC-V, we'll get two different results depending on whether we compiled with +zvfh or +zvfhmin, for example with a 3 element reduction: ; v9 = [0.1563, 5.97e-8, 0.00006104] ; zvfh vsetivli x0, 3, e16, m1, ta, ma vmv.v.i v8, 0 vfredosum.vs v8, v9, v8 vfmv.f.s fa0, v8 ; fa0 = 0.1563 ; zvfhmin vsetivli x0, 3, e16, m1, ta, ma vfwcvt.f.f.v v10, v9 vsetivli x0, 3, e32, m1, ta, ma vmv.v.i v8, 0 vfredosum.vs v8, v10, v8 vfmv.f.s fa0, v8 fcvt.h.s fa0, fa0 ; fa0 = 0.1564 This same thing happens with reassociative reductions e.g. vfredusum.vs, and this also applies for bf16. I couldn't find anything in the LangRef for reductions that suggest the excess precision is allowed. There may be something we can do in Clang with -fexcess-precision=fast, but I haven't looked into this yet. I presume the same precision issue occurs with fmul, but not with fmin/fmax/fminimum/fmaximum. I can't think of another way of lowering these other than scalarizing, and we can't scalarize scalable vectors, so this just removes the promotion and adjusts the cost model to return an invalid cost. (It looks like we also don't currently cost fmul reductions, so presumably they also have an invalid cost?) I think this should be enough to stop the loop vectorizer or SLP from emitting these intrinsics.	2024-10-04 00:17:45 +08:00
Keith Packard	ca57e8f23f	[RISCV] Support -mstack-protector-guard=tls (#108942 ) Add support for using a thread-local variable with a specified offset for holding the stack guard canary value. Closes: #46685	2024-10-02 16:33:31 -07:00
Luke Lau	1fa4a74d53	[RISCV] Lower insert_vector_elt on zvfhmin/zvfbfmin (#110221 ) This is the dual of #110144, but doesn't handle the case when the scalar type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.	2024-10-02 15:26:25 +08:00
Luke Lau	30f58ab17f	[RISCV] Lower vector_reverse for zvfhmin/zvfbfmin (#110218 ) Previously we crashed because we had no lowering for f16/bf16 scalable vectors. Because the lowering uses vrgather_vv_vl, we need to add bf16 patterns for it.	2024-10-02 14:25:15 +08:00
Craig Topper	8ed18eded9	[RISCV] Add correct MachinePointerInfo when putting arguments on the stack. (#110140 ) Previously we used an empty MachinePointerInfo. I checked a few other targets like X86, ARM, and AArch64 and they all appear to use correct MachinePointerInfo.	2024-09-27 13:48:01 -07:00
Jesse Huang	9bdcf7aa18	[RISCV] Software guard direct calls in large code model (#109377 ) Support for the large code model was added recently, and semantically direct calls are lowered to an indirect branch with a constant pool target. By default it does not use the x7 register, which is suboptimal with Zicfilp because it introduces a landing pad check that is unnecessary, since the constant pool is read-only and unlikely to be tampered with. Change direct calls and tail calls to use x7 as the scratch register (a.k.a. software guarded branch in the CFI spec).	2024-09-27 13:04:16 +08:00
Luke Lau	2b84ef06ac	[RISCV] Handle f16/bf16 extract_vector_elt when scalar type is legal (#110144 ) When the scalar type is illegal, it gets softened during type legalization and gets lowered as an integer. However with zfhmin/zfbfmin the type is now legal and it passes through type legalization where it crashes because we didn't have any custom lowering or patterns for it. This handles said case via the existing custom lowering to a vslidedown and vfmv.f.s. It also handles the case where we only have zvfhmin/zvfbfmin and don't have vfmv.f.s, in which case we need to extract it to a GPR and then use fmv.h.x. Fixes #110126	2024-09-27 08:00:59 +08:00
Craig Topper	bd592b11c3	[RISCV] Minor cleanups to lowerInterleaveIntrinsicToStore and lowerDeinterleaveIntrinsicToLoad. NFC -Reduce the scope of some variables. -Use getArgOperand instead of getOperand to get intrinsic operands. -Use initializer_list instead of a SmallVector. -Remove wide VectorType variable that is only used to check fixed vs scalable. We can use the narrow VectorType for that.	2024-09-25 21:37:37 -07:00
Craig Topper	cf1de0a7b4	[RISCV] Reuse Factor variable instead of hardcoding 2 in other places. NFC	2024-09-25 16:36:18 -07:00
Luke Lau	f172c31a57	[RISCV] Lower memory ops and VP splat for zvfhmin and zvfbfmin (#109387 ) We can lower f16/bf16 memory ops without promotion through the existing custom lowering. Some of the zero strided VP loads get combined to a VP splat, so we need to also handle the lowering for that for f16/bf16 w/ zvfhmin/zvfbfmin. This patch copies the lowering from ISD::SPLAT_VECTOR over to lowerScalarSplat which is used by the VP splat lowering.	2024-09-26 01:47:46 +08:00
Craig Topper	3c348bf543	[RISCV] Fold (fmv_x_h/w (load)) to an integer load. (#109900 )	2024-09-25 10:29:44 -07:00
Kazu Hirata	cd53c8429e	[RISCV] Fix a warning This patch fixes: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:10479:12: error: variable 'SubRegIdx' set but not used [-Werror,-Wunused-but-set-variable]	2024-09-24 16:22:53 -07:00
Craig Topper	1f9ca89798	[RISCV] Don't create insert/extract subreg during lowering. (#109754 ) Create the equivalent INSERT_SUBVECTOR/EXTRACT_SUBVECTOR instead. When we tried porting this to global isel, we noticed that subreg operations are created early. We aren't able to do this until instruction selection in global isel. For SelectionDAG, it makes sense to use insert/extract_subvector as the canonical form for these operations pre-isel. If it had come into SelectionDAG as an insert/extract_subvector we would have kept it in that form.	2024-09-24 15:54:49 -07:00
Craig Topper	e64673d317	[RISCV] Treat insert_subvector into undef with index==0 as legal. (#109745 ) Regardless of fixed and scalable type. We can always use subreg ops. We don't need to do any container conversion.	2024-09-24 09:49:32 -07:00
Piotr Fusik	cc7b24a4d1	[NFC] Fix typos in comments (#109765 )	2024-09-24 11:19:56 +02:00
Craig Topper	23558afaf2	[RISCV] Hoist duplicate code in lowerINSERT_SUBVECTOR. NFC (#109733 )	2024-09-23 19:32:33 -07:00
Craig Topper	af1cf699f0	[RISCV] Move OrigIdx == 0 check to start of lowerEXTRACT_SUBVECTOR. NFC (#109731 ) Allows us to remove a separate check of OrigIdx != 0 for the mask case.	2024-09-23 18:21:59 -07:00
Craig Topper	079f31c11f	[RISCV] Move the rest of Zfa FLI instruction handling to lowerConstantFP. (#109217 ) We already moved the fneg case. This moves the rest so we can drop the custom isel.	2024-09-19 15:16:10 -07:00
Craig Topper	8e4909aa19	[RISCV] Remove unnecessary vand.vi from vXi1 and nxvXi1 VECTOR_REVERSE codegen. (#109071 ) Use a setne with 0 instead of a trunc. We know we zero extended the node so we can get by with a non-zero check only. The truncate lowering doesn't know that we zero extended so has to mask the lsb. I don't think DAG combine sees the trunc before we lower it to RISCVISD nodes so we don't get a chance to use computeKnownBits to remove the AND.	2024-09-18 09:43:48 -07:00
Luke Lau	737f56fdf7	[RISCV] Deduplicate zvfhmin and zvfbfmin operation actions. NFC After #108937 fp16 w/o zvfh and bf16 are now in sync and should have the same lowering.	2024-09-18 18:07:11 +08:00
Luke Lau	edac1b2d63	[RISCV] Promote bf16 ops to f32 with zvfbfmin (#108937 ) For f16 with zvfhmin, we promote most ops and VP ops to f32. This does the same for bf16 with zvfbfmin, so the two fp types should now be in sync. There are a few places in the custom lowering where we need to check for a LMUL 8 f16/bf16 vector that can't be promoted and must be split, this extracts that out into isPromotedOpNeedingSplit. In a follow up NFC we can deduplicate the code that sets up the promotions.	2024-09-18 17:39:40 +08:00
Luke Lau	8d7d4c25cb	[RISCV] Split fp rounding ops with zvfhmin nxv32f16 (#108765 ) This adds zvfhmin test coverage for fceil, ffloor, fnearbyint, frint, fround and froundeven and splits them at nxv32f16 to avoid crashing, similarly to what we do for other nodes that we promote. This also sets ftrunc to promote which was previously missing. We already promote the VP version of it, vp_froundtozero. Marking it as promoted affects some of the cost model tests since they're no longer expanded.	2024-09-18 16:36:13 +08:00
Mikhail R. Gadelha	d2125e1db6	[RISCV] Support STRICT_UINT_TO_FP and STRICT_SINT_TO_FP (#102503 ) This patch adds support for the missing STRICT_UINT_TO_FP and STRICT_SINT_TO_FP for riscv and adds a test case for rv32 which was previously crashing. The code is in line with how other strict_* nodes are handled (e.g., getting op(1) instead of op(0) when it's a strict node, as op(0) in a strict node is the entry token).	2024-09-17 11:21:52 -03:00
Luke Lau	6af2f225a0	[RISCV] Restrict combineOp_VLToVWOp_VL w/ bf16 to vfwmadd_vl with zvfbfwma (#108798 ) We currently make sure to check that if folding an op to an f16 widening op that we have zvfh. We need to do the same for bf16 vectors, but with the further restriction that we can only combine vfmadd_vl to vfwmadd_vl (to get vfwmaccbf16.v{v,f}). The added test case currently crashes because we try to fold an add to a bf16 widening add, which doesn't exist in zvfbfmin or zvfbfwma. This moves the checks into the extension support checks to keep it in one place.	2024-09-17 13:35:25 +08:00
Kazu Hirata	1e4e1ceeeb	[Target] Avoid repeated hash lookups (NFC) (#108677 )	2024-09-14 07:39:09 -07:00
Craig Topper	ee4582f9c8	[RISCV] Use CCValAssign::getCustomReg for fixed vector arguments/returns with RVV. (#108470 ) We need to insert an insert_subvector or extract_subvector which feels pretty custom. This should make it easier to support fixed vector arguments for GISel.	2024-09-13 07:23:44 -07:00
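
Note on 487686b82e (#111000): the precision difference described in that commit message can be reproduced outside the compiler. The following is a minimal sketch, not LLVM code. It assumes an ordered reduction that rounds after every addition (as vfredosum.vs does) and models f16/f32 rounding with Python's struct formats 'e' and 'f'. The helper names round_to and ordered_sum are made up for illustration.

```python
import struct


def round_to(fmt, x):
    """Round a Python float to IEEE half ('e') or single ('f') precision."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def ordered_sum(values, fmt):
    """Sum left to right from 0, rounding to `fmt` after every addition."""
    acc = 0.0
    for v in values:
        acc = round_to(fmt, acc + round_to(fmt, v))
    return acc


# The three f16 inputs quoted in the commit message.
values = [round_to("e", v) for v in (0.1563, 5.97e-8, 0.00006104)]

# zvfh-style: every intermediate result is rounded to f16.
native_f16 = ordered_sum(values, "e")

# zvfhmin-style promotion: accumulate in f32, round to f16 once at the end.
promoted = round_to("e", ordered_sum(values, "f"))

print(native_f16, promoted)
```

Under these assumptions the two results differ by one f16 ulp: the f16 accumulation gives exactly 0.15625, while the promoted path gives 0.1563720703125. The commit message shows these rounded as 0.1563 and 0.1564.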


1752 Commits