llvm-project

Author	SHA1	Message	Date
Craig Topper	e94dc58dff	[RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven. This avoids the call overhead as well as the save/restore of fflags and the snan handling in the libm function. The save/restore of fflags and snan handling are needed to be correct for -ftrapping-math. I think we can ignore them in the default environment. The inline sequence will generate an invalid exception for nan and an inexact exception if fractional bits are discarded. I've used a custom inserter to explicitly create the control flow around the float->int->float conversion. We can probably avoid the final fsgnj after the conversion for no signed zeros FMF, but I'll leave that for future work. Note the comparison constant is slightly different than glibc uses. They use 1<<53 for double, I'm using 1<<52. I believe either is valid. Numbers >= 1<<52 can't have any fractional bits. It's ok to do the float->int->float conversion on numbers between 1<<53 and 1<<52 since they will all fit in i64. We only have a problem if the double can't fit in i64. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D136508	2022-10-26 14:36:49 -07:00
Craig Topper	0a03240fb4	[RISCV] Add tests for fixed vector sshl_sat/ushl_sat. NFC	2022-10-26 14:15:47 -07:00
Piyou Chen	7d7940fd77	[RISCV] add svinval extension 1. Add the svinval extension support 2. Add the svinval Predicates for its instruction Note: the svinval instructions defined in https://reviews.llvm.org/D117654 Reviewed By: reames Differential Revision: https://reviews.llvm.org/D136571	2022-10-26 09:45:30 -07:00
Craig Topper	a61b74889f	[RISCV] Use vslide1down for i64 insertelt on RV32. Instead of using vslide1up, use vslide1down and build the other direction. This avoids the overlap constraint early clobber of vslide1up. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D136735	2022-10-26 09:43:12 -07:00
Craig Topper	a54f3347e8	[RISCV] Add shift amount operands of shift, rotate, and Zbs instructions to hasAllNBitUsers.	2022-10-24 22:07:22 -07:00
Craig Topper	223f466f4f	[RISCV] Add ORI to hasAllNBitUsers. If the immediate is negative with sufficient leading ones, then the upper bits of the other operand aren't demanded.	2022-10-24 21:33:17 -07:00
Craig Topper	1fa8fd4c33	Recommit "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant." This reverts commit 65aaecca8842dec30d03734a7fe8ce33c5afec81. There was an ordering problem in the calculation of the partial remainder. Original commit message: If the divisor is even, we can first shift the dividend and divisor right by the number of trailing zeros. Now the divisor is odd and we can do the original algorithm to calculate a remainder. Then we shift that remainder left by the number of trailing zeros and add the bits that were shifted out of the dividend. Differential Revision: https://reviews.llvm.org/D135541	2022-10-24 10:08:50 -07:00
Craig Topper	65aaecca88	Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant." This reverts commit f6a7b47820904c5e69cc4f133d382c74a87c44e8. I received a report that this fails on 32-bit X86.	2022-10-24 07:12:54 -07:00
Piyou Chen	f8b8426861	[RISCV] Add Svnapot extension Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D136570	2022-10-24 01:27:04 -07:00
Craig Topper	a41c1f3168	[RISCV] Make selectShiftMask look for negate opportunities after looking through AND. Previously we would only look for an AND or a negate. But it's possible there is a negate after looking through the AND.	2022-10-23 14:23:13 -07:00
Craig Topper	f6a7b47820	[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant. If the divisor is even, we can first shift the dividend and divisor right by the number of trailing zeros. Now the divisor is odd and we can do the original algorithm to calculate a remainder. Then we shift that remainder left by the number of trailing zeros and add the bits that were shifted out of the dividend. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135541	2022-10-22 23:35:33 -07:00
Craig Topper	db25f51e37	Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))" This reverts commit e8b3ffa532b8ebac5dcdf17bb91b47817382c14d. The AMDGPU/mad_64_32.ll seems to fail on some of the build bots but passes locally. I'm really confused.	2022-10-22 22:50:43 -07:00
Craig Topper	e8b3ffa532	[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y)) (sra X, BW-1) is either 0 or -1. So the multiply is a conditional negate of Y. This pattern shows up when type legalizing wide multiplies involving a sign extended value. Fixes PR57549. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D133399	2022-10-22 21:51:45 -07:00
Craig Topper	00816714f9	[DAGCombiner][RISCV] Make foldBinOpIntoSelect work correctly with opaque constants. The CanFoldNonConst doesn't work correctly with opaque constants because getNode won't constant fold constants if one is opaque. Even if the operation is AND/OR. This can lead to infinite loops. This patch does the folding manually in the DAGCombine. Alternatively, we could improve getNode but that seemed likely to have bigger impact and possibly increase compile time for the additional checks. We wouldn't want to directly constant fold because we need to preserve the opaque flag. Fixes PR58511. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D136472	2022-10-22 19:10:33 -07:00
Craig Topper	4830fa18aa	[RISCV] Make sure we always call tryShrinkShlLogicImm for ISD:AND during isel. There was an early out that prevented us from calling this for (and (sext_inreg (shl X, C1), i32), C2).	2022-10-22 14:30:13 -07:00
Anton Sidorenko	14a5b9cdae	[MachineCombiner][RISCV] Relax optimization level requirement Enable Machine Combiner for O1/O2/O3 optimization levels. It makes RISCV consistent with other targets running Machine Combiner. Originally it was enabled only for -O3, however I looked through time reports and usually it takes 0.1%-0.4% of total time, and never takes more than 1.0%. Differential Revision: https://reviews.llvm.org/D136339	2022-10-21 13:25:28 +03:00
Craig Topper	2c82080f09	[MachineFrameInfo][RISCV] Call ensureStackAlignment for objects created with scalable vector stack id. This is an alternative to fix PR57939 for RISC-V. It definitely can be argued that the stack temporaries for RISC-V are being created with an unnecessarily large alignment. But ignoring the alignment in MachineFrameInfo also seems bad. Looking at the test updates that go with the current ID==0 check, it was intending to exclude things like the NoAlloc stackid. So I'm not sure if scalable vectors are intentionally being excluded. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135913	2022-10-20 14:05:46 -07:00
Yeting Kuo	55ae180a4c	[VP] Teach isVPBinaryOp to recognize vp.smin/smax/umin/umax/minnum/maxnum. Those vp intrinsics should be vp binary operations. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D135753	2022-10-20 07:21:13 +08:00
Craig Topper	061566d954	[RISCV] Remove -enable-unsafe-fp-math from machine combiner tests. NFC The optimization is using fast math flags on the instructions instead.	2022-10-19 15:54:33 -07:00
Craig Topper	e0afb72e82	[RISCV] Add more check prefixes to extractelt-fp.ll to fix a conflicting case. The existing prefix conflicted and the script silently dropped the checks.	2022-10-19 12:12:39 -07:00
luxufan	82c820b95c	[RISCV] Enable the LocalStackSlotAllocation pass support For RISC-V, load/store instructions (excluding vector load/store) only have a 12-bit immediate operand. If the offset is out of range, a temporary register must be used to make up the offset. If several such offsets differ from one another by a small (IsInt<12>) relative offset, the LocalStackSlotAllocation pass can pick a value to hold in a frame base register and replace each original offset with that register's value plus the relative offset. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98101	2022-10-19 16:15:14 +08:00
Han-Kuan Chen	615af94dc2	[RISCV] Lower VECTOR_SHUFFLE to VSLIDEDOWN_VL. Differential Revision: https://reviews.llvm.org/D136136	2022-10-18 08:58:39 -07:00
Han-Kuan Chen	8d0246a926	[RISCV] Pre-commit tests for lowering VECTOR_SHUFFLE to VSLIDEDOWN_VL. Differential Revision: https://reviews.llvm.org/D136135	2022-10-18 08:58:38 -07:00
Anton Sidorenko	1978b4d968	[MachineCombiner][RISCV] Enable MachineCombiner for RISCV Initial implementation to match basic FP reassociation patterns. Differential Revision: https://reviews.llvm.org/D135264	2022-10-18 18:56:32 +03:00
Anton Afanasyev	e175f99c49	Revert "[MachineCombiner][RISCV] Enable MachineCombiner for RISCV" This reverts commit 3112cf3b00fe45a0911ec0c2e6706ef1f8a9b972. Test breakage: https://lab.llvm.org/buildbot/#/builders/16/builds/36631	2022-10-18 15:57:11 +03:00
Anton Sidorenko	3112cf3b00	[MachineCombiner][RISCV] Enable MachineCombiner for RISCV Initial implementation to match basic FP reassociation patterns. Differential Revision: https://reviews.llvm.org/D135264	2022-10-18 15:31:03 +03:00
Anton Sidorenko	bd6bf3499f	[MachineCombiner][RISCV] Precommit test for D135264	2022-10-18 12:53:07 +03:00
LiaoChunyu	7b970290c0	[RISCV] Optimize SELECT_CC when the true value of select is Constant (select (setcc lhs, rhs, CC), constant, falsev) -> (select (setcc lhs, rhs, InverseCC), falsev, constant) This patch removes unnecessary copies Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D129757	2022-10-18 09:24:17 +08:00
Craig Topper	2b32e4f98b	[RISCV] Add basic support for the sifive-7-series short forward branch optimization. sifive-7-series has macrofusion support to convert a branch over a single instruction into a conditional instruction. This can be an improvement if the branch is hard to predict. This patch adds support for the most basic case, a branch over a move instruction. This is implemented as a pseudo instruction so we can hide the control flow until all code motion passes complete. I've disabled a recent select optimization if this feature is enabled in the subtarget. Related gcc patch for the same optimization https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg211045.html Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135814	2022-10-17 13:56:22 -07:00
Craig Topper	30305d7948	[TargetLowering][RISCV][Sparc] Don't emit zero check in CTTZTableLookup for CTTZ_ZERO_UNDEF. The code incorrectly checked for CTLZ_ZERO_UNDEF instead of CTTZ_ZERO_UNDEF. While I was there I flipped the condition into an early out. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D136010	2022-10-17 10:15:39 -07:00
Peter Rong	c2e7c9cb33	[CodeGen] Using ZExt for extractelement indices. In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`. This is because IRTranslator uses SExt for indices. In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt. This change includes both documentation, SelectionDAG and IRTranslator. We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86 This patch fixes issue #57452. Differential Revision: https://reviews.llvm.org/D132978	2022-10-15 15:45:35 -07:00
Philip Reames	d91b0d6816	[RISCV] Merge rv32 and rv64 fixed vector stepvector tests	2022-10-14 14:54:37 -07:00
Craig Topper	e68b0d5875	[RISCV] Match (select C, -1, X)->(or -C, X) during lowerSelect Same with (select C, X, -1), (select C, 0, X), and (select C, X, 0). There's a DAGCombine after we turn the select into select_cc, but that may introduce a setcc that didn't previously exist. We could add more DAGCombines to remove the extra setcc, but this seemed lower effort. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135833	2022-10-13 09:06:12 -07:00
Philip Reames	1c41d0cb62	[RISCV] Use branchless form for selects with 0 in either arm Continuing the theme of adding branchless lowerings for simple selects, this time handle the 0 arm case. This is very common for various umin idioms, etc. Differential Revision: https://reviews.llvm.org/D135600	2022-10-12 13:51:52 -07:00
Yeting Kuo	2749b942e9	[RISCV] Add isel patterns for vmacc, vnmsac. The patch selects VSELECT/VP_MERGE_VL which uses fmadd/fnmsub as true operand and the addend of the fmadd/fnmsub as false operand. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D135330	2022-10-12 09:19:01 +08:00
Craig Topper	1bdf21d55c	[RISCV] Use mask/tail agnostic if tied source is IMPLICIT_DEF regardless of the policy operand. If the source is implicit_def, the register allocator won't have any constraint on what register it picks for the destination. This doesn't give the user much control of what register is being used. So in my mind that means the only reason to honor the policy operand is to control what policy is used in vsetvli to maybe avoid a vtype change. Given the other optimizations we do on the policy field, I don't think allowing the user this control is reliable. Therefore, I think we should use agnostic policies if the source is undef. This should give better performance on some CPUs for VP intrinsics where there is no merge operand and the backend adds IMPLICIT_DEF to the instruction. Differential Revision: https://reviews.llvm.org/D135396	2022-10-11 16:40:16 -07:00
Craig Topper	ac9209751a	Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))" This reverts commit 0148df8157f05ecf3b1064508e6f012aefb87dad. Getting lit test failures on AMDGPU but I can't reproduce them so far. Reverting to investigate.	2022-10-11 16:30:40 -07:00
Craig Topper	0148df8157	[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y)) (sra X, BW-1) is either 0 or -1. So the multiply is a conditional negate of Y. This pattern shows up when type legalizing wide multiplies involving a sign extended value. Fixes PR57549. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D133399	2022-10-11 16:20:55 -07:00
Craig Topper	0121b1a4ac	Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant." This reverts commit d4facda414b6b9b8b1a34bc7e6b7c15172775318. This has been reported to cause failures. Reverting while I investigate.	2022-10-10 14:53:29 -07:00
Craig Topper	d4facda414	[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant. If the divisor is even, we can first shift the dividend and divisor right by the number of trailing zeros. Now the divisor is odd and we can do the original algorithm to calculate a remainder. Then we shift that remainder left by the number of trailing zeros and add the bits that were shifted out of the dividend. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135541	2022-10-10 11:02:22 -07:00
LiaoChunyu	a835b92e6c	[RISCV] Use hasAllWUsers to recover XORI/ORI reference 0fbe71e91f44. Also add testcase for addi. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D135538	2022-10-10 14:16:50 +08:00
Craig Topper	b0c2f90453	[RISCV] Merge more rv32/rv64 vector intrinsic tests that contain the same content.	2022-10-08 18:30:40 -07:00
Craig Topper	39532ea073	[RISCV] Add signext attribute to i32 arguments in some tests. NFC	2022-10-08 10:50:16 -07:00
Craig Topper	9f67047cf0	[VP][RISCV] Add vp.smax/smin/umax/umin intrinsics Differential Revision: https://reviews.llvm.org/D135418	2022-10-07 17:14:31 -07:00
eopXD	dbc681c98e	[VP][RISCV] Add vp.roundtozero and its RISC-V support The scalar instruction of this is `llvm.trunc`. However the naming of ISD::VP_TRUNC is already taken by `trunc` of the LLVM IR. Naming this as `vp.ftrunc` would likely cause confusion with `vp.fptrunc`. So adding `vp.roundtozero` that will look similar to `vp.roundeven`. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D135233	2022-10-07 02:15:23 -07:00
Craig Topper	3b20765cf7	[RISCV] Use mask agnostic policy for isel patterns where the merge operand is IMPLICIT_DEF. I tend to think we should ignore the policy bit in vsetvli insertion if the tied operand is IMPLICIT_DEF. But that raises questions about what the policy operand on RVV intrinsics means if you also pass vundefined(). This change at least fixes some cases. I'll post a separate patch for vsetvli insertion for discussion. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135386	2022-10-06 15:44:39 -07:00
Philip Reames	79f0413e5e	[RISCV] Use branchless form for selects with -1 in either arm We can lower these as an or with the negative of the condition value. This appears to result in significantly less branch-y code on multiple common idioms (as seen in tests). Differential Revision: https://reviews.llvm.org/D135316	2022-10-06 15:18:43 -07:00
Craig Topper	dc2b8fb965	[RISCV] Use fixed vector types in fixed-vectors-vfnmsac-vp.ll. NFC	2022-10-06 11:02:13 -07:00
Craig Topper	3d6c63d413	[RISCV] Cleanup some vector tests. NFC Some tests had scalable vector intrinsic names with fixed vector types. Some had types in the wrong order. Remove scalable vector test from fixed vector files. Also replace insert+shuffle constexprs with fixed constant vectors.	2022-10-06 10:51:39 -07:00
Ivan Tetyushkin	0e6c1576e6	[RISCV] Optimization for using compressed beqz and bnez PR#56391 If there is a pattern ``` br_cc val1 constval eq/neq place select_cc val1 constval eq/neq trueval falseval ``` and constval does not fit in the compressed imm format (6 bits) but does fit in the imm format (12 bits), we can replace it with a non-compressed sub and a compressed c.beqz/c.bnez: ``` addi val val -constval c.beqz val place ``` Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D132839	2022-10-06 09:33:32 -07:00
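
The even-divisor handling described in commits f6a7b47820 and 1fa8fd4c33 (expandDIVREMByConstant) can be sketched in a few lines. This is only an illustration of the arithmetic in the commit message, not the LLVM implementation: the function name `urem_by_constant` is hypothetical, and Python's `%` stands in for the "original algorithm" that the real code applies to the odd divisor.

```python
def urem_by_constant(dividend: int, divisor: int) -> int:
    """Unsigned remainder by a constant, with the even-divisor step spelled out."""
    assert dividend >= 0 and divisor > 0

    # Number of trailing zero bits in the divisor (0 if it is already odd).
    trailing_zeros = (divisor & -divisor).bit_length() - 1

    # Shift the divisor right so it becomes odd.
    odd_divisor = divisor >> trailing_zeros

    # Remember the bits that are shifted out of the dividend.
    shifted_out = dividend & ((1 << trailing_zeros) - 1)

    # Partial remainder against the odd divisor; `%` is a stand-in (assumption)
    # for the odd-divisor expansion that already existed.
    partial_rem = (dividend >> trailing_zeros) % odd_divisor

    # Shift the partial remainder back and add the shifted-out bits.
    return (partial_rem << trailing_zeros) + shifted_out
```

The result is always below the divisor because the partial remainder is at most `odd_divisor - 1` and the shifted-out bits are below `1 << trailing_zeros`.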


2060 Commits
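
The inline rounding sequence described in commit e94dc58dff can be modeled roughly as follows. This is a sketch under stated assumptions, not the RISC-V lowering itself: `inline_round` and `NO_FRACTION_LIMIT` are hypothetical names, the `to_int` callback stands in for the float->int conversion with a given rounding mode, `math.copysign` plays the role of the final fsgnj, and floating-point exception flags are not modeled at all.

```python
import math

# Doubles with magnitude >= 1<<52 have no fractional bits (the commit's constant).
NO_FRACTION_LIMIT = float(1 << 52)


def inline_round(x: float, to_int=math.floor) -> float:
    """Round x to an integral double via a float->int->float round trip."""
    # NaN, infinities, and large values skip the conversion and pass through.
    if math.isnan(x) or abs(x) >= NO_FRACTION_LIMIT:
        return x

    # float->int: safe because |x| < 1<<52 always fits in a 64-bit integer.
    as_int = to_int(x)

    # int->float, then copy the original sign so results such as -0.0 survive.
    return math.copysign(float(as_int), x)
```

Passing `math.ceil` or `math.trunc` as `to_int` gives the corresponding variants; for example, `inline_round(-0.5, math.ceil)` keeps the negative sign on its zero result.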