llvm-project

Author	SHA1	Message	Date
Yeting Kuo	87af9ee870	[RISCV] Use experimental.vp.splat to splat specific vector length elements. (#101329) Previously, it was hard to create a scalable vector splat with a specific vector length in LLVM IR, so we used riscv.vmv.v.x and riscv.vmv.v.f to do this work. But the two RVV intrinsics have strict type constraints and cannot support fixed vector types or illegal vector types. Using vp.splat preserves the old functionality and also generates more optimized code for vector types and illegal vectors. This patch also fixes a crash caused by getEVT not handling ptr types.	2024-08-01 09:37:42 +08:00
Piotr Fusik	f0ac8903ea	[RISCV][NFC] Fix intrinsic misspelled in a comment (#98998)	2024-07-17 07:07:33 +02:00
Luke Lau	563ae62095	[RISCV] Don't expand zero stride vp.strided.load if SEW>XLEN (#98924) A splat of a <n x i64> on RV32 will get lowered as a zero strided load anyway (and won't match any .vx splat patterns), so don't expand it to a scalar load + splat to avoid writing it to the stack.	2024-07-16 10:29:53 +08:00
Luke Lau	d5f4f084d2	[RISCV] Always expand zero strided vp.strided.load (#98901) This patch makes zero strided VP loads always be expanded to a scalar load and splat even if +optimized-zero-stride-load is present. Expanding it allows more .vx splat patterns to be matched, which is needed to prevent regressions in #98111. If the feature is present, RISCVISelDAGToDAG will combine it back to a zero strided load. The RV32 test diff also shows how we need to emit a zero strided load either way after expanding an SEW=64 strided load. We could maybe fix this in a later patch by not doing the expand if SEW>XLEN.	2024-07-15 23:54:00 +08:00
Yeting Kuo	94279ae4ca	[RISCV] Recommit "Expand vp.stride.load to splat of a scalar load." (#98579) This is a recommit of #98140. The old commit should be rebased on #98205, which changes the feature of hardware zero stride optimization. It's a similar patch to a214c521f8763b36dd400b89017f74ad5ae4b6c7, for vp.stride.load. Some targets prefer the pattern (vmv.v.x (load)) instead of vlse with zero stride.	2024-07-15 16:09:56 +08:00
Nico Weber	cea7bad732	Revert "[RISCV] Expand vp.stride.load to splat of a scalar load." (#98422) Reverts llvm/llvm-project#98140. Breaks tests, see comments on the PR.	2024-07-10 20:54:41 -04:00
Yeting Kuo	cda245a339	[RISCV] Expand vp.stride.load to splat of a scalar load. (#98140) It's a similar patch to a214c521f8763b36dd400b89017f74ad5ae4b6c7, for vp.stride.load. Some targets prefer the pattern (vmv.v.x (load)) instead of vlse with zero stride. It's the IR version of #97798.	2024-07-11 08:38:31 +08:00
Nikita Popov	9df71d7673	[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919) Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.	2024-06-28 08:36:49 +02:00
Craig Topper	9396891271	[RISCV] Don't look for sext in RISCVCodeGenPrepare::visitAnd. We want to know the upper 33 bits of the And Input are zero. SExt only guarantees they are the same. We originally checked for SExt or ZExt when we were using isImpliedByDomCondition because a ZExt may have been changed to SExt before we visited the And. We are no longer using isImpliedByDomCondition so we can only look for zext with the nneg flag. While here, switch to PatternMatch to simplify the code. Fixes #78783	2024-01-19 14:44:47 -08:00
Luke Lau	15b0fabb21	[RISCV] Vectorize phi for loop carried @llvm.vector.reduce.fadd (#78244) LLVM vector reduction intrinsics return a scalar result, but on RISC-V vector reduction instructions write the result in the first element of a vector register. So when a reduction in a loop uses a scalar phi, we end up with unnecessary scalar moves: loop: vfmv.s.f v10, fa0 vfredosum.vs v8, v8, v10 vfmv.f.s fa0, v8 This mainly affects ordered fadd reductions, which have a scalar accumulator operand. This tries to vectorize any scalar phis that feed into a fadd reduction in RISCVCodeGenPrepare, converting: loop: %phi = phi <float> [ ..., %entry ], [ %acc, %loop] %acc = call float @llvm.vector.reduce.fadd.nxv4f32(float %phi, <vscale x 2 x float> %vec) to loop: %phi = phi <vscale x 2 x float> [ ..., %entry ], [ %acc.vec, %loop] %phi.scalar = extractelement <vscale x 2 x float> %phi, i64 0 %acc = call float @llvm.vector.reduce.fadd.nxv4f32(float %x, <vscale x 2 x float> %vec) %acc.vec = insertelement <vscale x 2 x float> poison, float %acc.next, i64 0 Which eliminates the scalar -> vector -> scalar crossing during instruction selection.	2024-01-18 16:15:20 +07:00
Yingwei Zheng	d64d5ea102	[RISCV][CodeGenPrepare] Remove duplicated transform for zext. NFC. (#72053) After #71534 and #72052, the transform `zext -> zext nneg` in `RISCVCodeGenPrepare` is redundant.	2023-11-13 22:45:33 +08:00
Philip Reames	784a2cd561	[RISCV] Rewrite RISCVCodeGenPrepare using zext nneg [nfc-ish] (#70739) This stacks on #70725. Once we have lowering for zext nneg, we can rewrite all of the existing RISCVCodeGenPrepare logic in terms of zext nneg instead of sext. The change isn't NFC from the perspective of the individual pass, but should be from the perspective of codegen as a whole. As noted in the TODO, one piece can be moved to instcombine, but I'll leave that to a separate commit.	2023-10-30 16:35:30 -07:00
Craig Topper	0f4c9c016c	[RISCV] Replace RISCV->RISC-V in strings. To be consistent with RISC-V branding guidelines https://riscv.org/about/risc-v-branding-guidelines/ Think we should be using RISC-V where possible. D146449 already updated comments. Strings may have more user impact. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D146451	2023-03-27 09:50:17 -07:00
Craig Topper	29463612d2	[RISCV] Replace RISCV -> RISC-V in comments. NFC To be consistent with RISC-V branding guidelines https://riscv.org/about/risc-v-branding-guidelines/ Think we should be using RISC-V where possible. More patches will follow. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D146449	2023-03-27 09:50:17 -07:00
Craig Topper	37db283362	[RISCV] isImpliedByDomCondition returns an Optional<bool> not a bool. We were incorrectly checking that it returned an implication result, not that the implication result itself was true.	2022-08-12 22:21:05 -07:00
Craig Topper	f19497f7b0	[RISCV] Use InstVisitor in RISCVCodeGenPrepare. NFC Makes it easy to add new instructions to look at without dispatching manually.	2022-08-02 21:19:30 -07:00
Craig Topper	1db6d6dcd8	[RISCV] Teach RISCVCodeGenPrepare to optimize (zext (abs(i32 X, i1 1))). (abs(i32 X, i1 1)) always produces a positive result. The 'i1 1' means INT_MIN input produces poison. If the result is sign extended, InstCombine will convert it to zext. This does not produce ideal code for RISCV. This patch reverses the zext back to sext which can be folded into a subw or negw. Ideally we'd do this in SelectionDAG, but we lose the INT_MIN poison flag when llvm.abs becomes ISD::ABS. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D130412	2022-07-25 09:36:41 -07:00
Craig Topper	8cc483099a	[RISCV] Teach RISCVCodeGenPrepare to optimize (i64 (and (zext/sext (i32 X), C1))) If X is known positive by a dominating condition, we can fill in ones into the upper bits of C1 if that would allow it to become an simm12 allowing the use of ANDI. This pattern often occurs in unrolled loops where the induction variable has been widened. To get the best benefit from this, I had to move the pass above ConstantHoisting which is in addIRPasses. Otherwise the AND constant is often hoisted away from the AND. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D129888	2022-07-17 11:00:56 -07:00
Craig Topper	73f766ca9a	[RISCV] Remove unnecessary use of IRBuilder from RISCVCodeGenPrepare. We're creating a single instruction to replace another instruction. We can insert using the InsertBefore operand of the constructor. Then copy the debug location.	2022-07-17 10:59:54 -07:00
Craig Topper	1a8468ba61	[RISCV] Add a RISCV specific CodeGenPrepare pass. Initial optimization is to convert (i64 (zext (i32 X))) to (i64 (sext (i32 X))) if the dominating condition for the basic block guaranteed the sign bit of X is zero. This frequently occurs in loop preheaders where a signed induction variable that can never be negative has been widened. There will be a dominating check that the 32-bit trip count isn't negative or zero. The check here is not restricted to that specific case though. An i32->i64 sext is cheaper than zext on RV64 without the Zba extension. Later optimizations can often remove the sext from the preheader basic block because the dominating block also needs a sext to evaluate the greater than 0 check.	Reviewed By: asb Differential Revision: https://reviews.llvm.org/D129732	2022-07-14 10:20:59 -07:00

20 Commits
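
The constant rewrite described in commit 8cc483099a (filling ones into the upper bits of C1 so the mask becomes an simm12 usable by ANDI) can be illustrated with a small arithmetic sketch. This is not the pass's code: the helper names `to_signed64`, `fits_simm12`, and `widen_and_mask` are hypothetical, and the choice to set exactly the upper 33 bits is an assumption taken from the 9396891271 message, which says the upper 33 bits of the And input must be zero.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
UPPER_33 = 0xFFFFFFFF80000000  # bits 31..63, assumed zero in the And input


def to_signed64(value: int) -> int:
    """Interpret the low 64 bits of value as a signed 64-bit integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def fits_simm12(value: int) -> bool:
    """True if value is encodable as a 12-bit signed immediate."""
    return -2048 <= value <= 2047


def widen_and_mask(c1: int):
    """Return a replacement mask usable with ANDI, or None (hypothetical helper).

    Assumes the other And operand has its upper 33 bits clear, so setting
    those bits in the mask cannot change the result of the And.
    """
    if not 0 <= c1 <= 0xFFFFFFFF:
        return None  # only 32-bit masks are considered in this sketch
    if fits_simm12(c1):
        return None  # already encodable, nothing to gain
    widened = to_signed64(c1 | UPPER_33)
    return widened if fits_simm12(widened) else None


# Example: 0xFFFFFFF0 does not fit in 12 signed bits, but the widened mask is -16.
x = 0x7FFFFFFF  # a non-negative i32 value, zero-extended to 64 bits
new_mask = widen_and_mask(0xFFFFFFF0)
assert new_mask == -16
assert (x & 0xFFFFFFF0) == (x & (new_mask & MASK64))
```

The final assertion checks the property the transform relies on: when the input's upper 33 bits are zero, the original and widened masks produce the same And result.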