llvm-project

Author	SHA1	Message	Date
Yeting Kuo	cdc0392669	[RISCV] Update implies for subtarget feature. (#75824 ) PR #75576 and #75735 update some implies in llvm/lib/Support/RISCVISAInfo.cpp, but both of them miss the subtarget feature part. This patch still preserves the predicates HasStdExtZfhOrZfhmin and HasStdExtZhinxOrZhinxmin, since they can make error messages more readable. (Users might not know that zfh implies zfhmin.)	2023-12-19 09:47:46 +08:00
Kazu Hirata	96d0a3b564	[llvm] Stop including optional (NFC) Identified with clangd.	2023-12-02 00:31:07 -08:00
Philip Reames	129440728c	[RISCV] Partially move doPeepholeMaskedRVV into RISCVFoldMasks (#72441 ) This change is motivated by a point of confusion on https://github.com/llvm/llvm-project/pull/71764. I hadn't fully understood why doPeepholeMaskedRVV needed to be part of the same change. As indicated in the fixme in this patch, the reason is that performCombineVMergeAndVOps doesn't know how to deal with the true side of the merge being an all-ones masked instruction. This change removes one of two calls to the routine in RISCVISELDAGToDAG, and adds a clarifying comment on the precondition for the remaining call. The post-ISEL code is tested by the cases where we can form an unmasked instruction after folding the vmerge back into true. I don't really care if we actually land this patch, or leave it rolled into https://github.com/llvm/llvm-project/pull/71764. I'm posting it mostly to clarify the confusion.	2023-11-27 08:33:03 -08:00
Sander de Smalen	81b7f115fb	[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979 ) It seems TypeSize is currently broken in the sense that: TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8) without failing its assert that explicitly tests for this case: assert(LHS.Scalable == RHS.Scalable && ...); The reason this fails is that `Scalable` is a static method of class TypeSize, and LHS and RHS are both objects of class TypeSize. So this is evaluating if the pointer to the function Scalable == the pointer to the function Scalable, which is always true because LHS and RHS have the same class. This patch fixes the issue by renaming `TypeSize::Scalable` -> `TypeSize::getScalable`, as well as `TypeSize::Fixed` to `TypeSize::getFixed`, so that it no longer clashes with the variable in FixedOrScalableQuantity. The new methods now also better match the coding standard, which specifies that: * Variable names should be nouns (as they represent state) * Function names should be verb phrases (as they represent actions)	2023-11-22 08:52:53 +00:00
Nemanja Ivanovic	563720c3be	[RISCV] Fix lowering of negative zero with Zdinx 32-bit (#71869 ) The compiler currently abends with an impossible reg-to-reg copy when producing a negative zero FP immediate on RV32 with -Zdinx. This is because we emit a negation that uses FP registers. Emit the right node to produce correct code.	2023-11-13 07:38:14 +01:00
Craig Topper	a93dfb589d	[RISCV] Peek through zext in selectShiftMask. This improves the code for -riscv-experimental-rv64-legal-i32	2023-11-10 19:02:14 -08:00
Craig Topper	aae30f9e2c	[RISCV] Use Align(8) for the stack temporary created for SPLAT_VECTOR_SPLIT_I64_VL. The value needs to be read as an 8 byte vector element which requires the pointer to be 8 byte aligned according to the vector spec. Fixes #71787	2023-11-09 20:43:22 -08:00
Wang Pengcheng	e179b125fb	[RISCV][NFC] Pass MCSubtargetInfo instead of FeatureBitset in RISCVMatInt (#71770 ) The use of `hasFeature` is more descriptive and the callers of `RISCVMatInt` have no need to call `getFeatureBits()` any more.	2023-11-09 15:15:23 +08:00
Yeting Kuo	a5c1ecada2	[RISCV] Disable performCombineVMergeAndVOps for PseudoVIOTA_M. (#71483 ) This transformation might be illegal for `PseudoVIOTA_M`. The value of `viota.m vd, vs2` is the prefix sum of vs2, and adding a mask to it may produce a wrong prefix sum. For example, the result of the following expression is `{5, 5, 5, 3}`, ``` ; v4 = {1, 1, 1, 1} viota.m v1, v4 ; v0 = {0, 0, 0, 1}, v1 = {0, 1, 2, 3}, v8 = {5, 5, 5, 5} vmerge.vvm v8, v8, v1, v0.t ; v8 = {5, 5, 5, 3} ``` but if we merge them to `viota.m v8, v4, v0.t`, then the result is `{5, 5, 5, 0}`. Also, we still do `performCombineVMergeAndVOps` for `viota.m` when the mask of `vmerge.vvm` is a true mask.	2023-11-07 16:21:35 +08:00
Craig Topper	8912200966	[RISCV] Add experimental support for making i32 a legal type on RV64 in SelectionDAG. (#70357 ) This will select i32 operations directly to W instructions without custom nodes. Hopefully this can allow us to be less dependent on hasAllNBitUsers to recover i32 operations in RISCVISelDAGToDAG.cpp. This support is enabled with a command line option that is off by default. Generated code is still not optimal. I've duplicated many test cases for this, but it's not complete. Enabling this runs all existing lit tests without crashing.	2023-11-01 09:36:41 -07:00
Min-Yih Hsu	87f671756d	[RISCV] Use FLI + FNEG to materialize some negative FP constants (#70825 ) Most of the FP constants supported by FLI are positive. For negative FP constants X whose positive value is supported by FLI, we can use `(FNEG (FLI -X))` to materialize X.	2023-10-31 17:52:50 -07:00
Luke Lau	72e6c1c70d	[RISCV] Begin moving post-isel vector peepholes to a MF pass (#70342 ) We currently have three postprocess peephole optimisations for vector pseudos: 1) Masked pseudo with all ones mask -> unmasked pseudo 2) Merge vmerge pseudo into operand pseudo's mask 3) vmerge pseudo with all ones mask -> vmv.v.v pseudo This patch aims to move these peepholes out of SelectionDAG and into a separate RISCVFoldMasks MachineFunction pass. There are a few motivations for doing this: * The current SelectionDAG implementation operates on MachineSDNodes, which are essentially MachineInstrs but require a bunch of logic to reason about chain and glue operands. The RISCVII::hasOp helper functions also don't exactly line up with the SDNode operands. Mutating these pseudos and their operands in place becomes a good bit easier at the MachineInstr level. For example, we would no longer need to check for cycles in the DAG during performCombineVMergeAndVOps. Although it's further down the line, moving this code out of SelectionDAG allows it to be reused by GlobalISel later on. * In performCombineVMergeAndVOps, it may be possible to commute the operands to enable folding in more cases (see test/CodeGen/RISCV/rvv/vmadd-vp.ll). There is existing machinery to commute operands in TII::commuteInstruction, but it's implemented on MachineInstrs. The pass runs straight after ISel, before any of the other machine SSA optimization passes run. This is so that dead-mi-elimination can mop up any vmsets that are no longer used (but if preferred we could try and erase them from inside RISCVFoldMasks itself). This also means that these peepholes are no longer run at codegen -O0, so this patch isn't strictly NFC. Only the performVMergeToVMv peephole is refactored in this patch, the remaining two would be implemented later. And as noted by @preames, it should be possible to move doPeepholeSExtW out of SelectionDAG as well.	2023-10-30 15:17:00 +00:00
Wang Pengcheng	a316f14fdd	[RISCV][NFC] Move getRVVMCOpcode to RISCVInstrInfo (#70637 ) To simplify more code.	2023-10-30 19:03:04 +08:00
Craig Topper	109aa586f0	[RISCV] Add an experimental pseudoinstruction to represent a rematerializable constant materialization sequence. (#69983 ) Rematerialization during register allocation is currently limited to a single instruction with no inputs. This patch introduces a pseudoinstruction that represents the materialization of a constant. I've started with a sequence of 2 instructions for now, which covers at least the common LUI+ADDI(W) case. This instruction will be expanded into real instructions immediately after register allocation using a new pass. This gives the post-RA scheduler a chance to separate the 2 instructions to improve ILP. I believe this matches the approach used by AArch64. Unfortunately, this loses some CSE opportunities when an LUI value is used by multiple constants with different LSBs. This feature is off by default and a new backend command line option is added to enable it for testing. This avoids the spill and reloads reported in #69586.	2023-10-25 17:20:32 -07:00
Craig Topper	51446d945a	[RISCV] Only check for scalar VT at depth 0 in hasAllNBitUsers. VTs on already selected instructions can be arbitrary. Reviewing the isel table I see i32 used for instructions that are part of multiple instruction output patterns. Looks like tblgen just picks the lowest numbered MVT that is legal for the destination register class of the instruction. Seems better to just not check types for already selected nodes.	2023-10-23 15:05:38 -07:00
Craig Topper	0b3f6ff3c4	[RISCV] Disable hasAllNBitUsers for vector types. RISCVGenDAGISel.inc can call this before it checks the node type. Ensure the type is scalar before wasting time to do the more computationally expensive checks. This also avoids an assertion if we hit a VMV_X_S instruction which doesn't have a VL operand which vectorPseudoHasAllNBitUsers expects.	2023-10-22 21:59:31 -07:00
Wang Pengcheng	f24d9490e5	[RISCV] Match prefetch address with offset (#66072 ) A new ComplexPattern `AddrRegImmLsb00000` is added, which is like `AddrRegImm` except that if the least significant 5 bits aren't all zeros, we will fall back to offset 0.	2023-10-20 14:22:48 +08:00
Shao-Ce SUN	f48dab5237	Add RV64 constraint to SRLIW (#69416 ) Fixes #69408	2023-10-18 15:01:17 +08:00
Luke Lau	e577e7025d	[RISCV] Move vector pseudo hasAllNBitUsers switch into RISCVInstrInfo. NFC (#67593 ) The handling for vector pseudos in hasAllNBitUsers is duplicated across RISCVISelDAGToDAG and RISCVOptWInstrs. This deduplicates it between the two, with the common denominator between the two call sites being the opcode and SEW: We need to handle extracting these separately since one operates at the SelectionDAG level and the other at the MachineInstr level.	2023-10-03 12:24:11 +01:00
Craig Topper	3c0990c188	[RISCV] Generalize the (ADD (SLLI X, 32), X) special case in constant materialization. (#66931 ) We don't have to limit ourselves to a shift amount of 32. We can support other shift amounts that make the upper 32 bits line up.	2023-10-02 13:03:06 -07:00
Alex Bradbury	0b0ed8f76a	[RISCV] Add missing hunk to #67889 to fix test failures Without this, various CodeGen tests fail because a RISCV::FCVT_D_W[_IN32X] machine node is created without the rounding mode operand. The relevant PR was committed as bf94ba39b65d1212ea84d5783b393280e1ce7478	2023-10-01 11:34:57 +01:00
Craig Topper	e6b2525daf	[RISCV] Fix -Wsign-compare warning. NFC	2023-09-27 13:41:06 -07:00
Luke Lau	5ffbdd9ed5	[RISCV] Handle .vx pseudos in hasAllNBitUsers (#67419 ) Vector pseudos with scalar operands only use the lower SEW bits (or less in the case of shifts and clips). This patch accounts for this in hasAllNBitUsers for both SDNodes in RISCVISelDAGToDAG. We also need to handle this in RISCVOptWInstrs otherwise we introduce slliw instructions that are less compressible than their original slli counterpart. This is a reland of aff6ffc8760b99cc3d66dd6e251a4f90040c0ab9 with the refactoring omitted.	2023-09-27 19:53:50 +01:00
Philip Reames	487dd5f1e3	Revert "[RISCV] Handle .vx pseudos in hasAllNBitUsers (#67419 )" This reverts commit aff6ffc8760b99cc3d66dd6e251a4f90040c0ab9. Version landed differs from version reviewed in (stylistic) manner worthy of separate review.	2023-09-27 11:24:49 -07:00
Luke Lau	aff6ffc876	[RISCV] Handle .vx pseudos in hasAllNBitUsers (#67419 ) Vector pseudos with scalar operands only use the lower SEW bits (or less in the case of shifts and clips). This patch accounts for this in hasAllNBitUsers for both SDNodes in RISCVISelDAGToDAG. We also need to handle this in RISCVOptWInstrs otherwise we introduce slliw instructions that are less compressible than their original slli counterpart.	2023-09-27 18:12:29 +01:00
Craig Topper	65eb46877c	[RISCV] Explicitly create IMPLICIT_DEF instead of UNDEF for vectors i… (#67369 ) …n RISCVDAGToDAGISel::Select. UNDEF needs to go through isel itself. All of the nodes have been topologically sorted so that instruction selection proceeds from root to entry node. If we create a new node that needs to go through isel, we have to insert it into the correct place in the topological sort. If we don't, it might not get selected at all in some cases. Some targets have a function like X86's insertDAGNode to sort newly created nodes. To avoid introducing such a function on RISC-V, we can directly emit the IMPLICIT_DEF node that UNDEF would get selected to.	2023-09-26 08:37:24 -07:00
Philip Reames	233b6ef66c	[RISCV] Handle EltType > XLEN case in VMV_V_X_VL to VMV_S_X_VL fold I'd guarded this case in D158874 to avoid regressions, and decided to go investigate what was going on. The solution turns out to be a generic splat matching extension to handle INSERT_SUBVECTOR. In theory, we could see these from other sources as well, but for some reason we only seem to see the i64 extract on rv32 case in practice. Not sure why that is to be honest. Differential Revision: https://reviews.llvm.org/D159230	2023-09-22 13:43:43 -07:00
Luke Lau	3510552df6	[RISCV] Check for COPY_TO_REGCLASS in usesAllOnesMask (#67037 ) Sometimes with mask vectors that have been widened, there is a CopyToRegClass node in between the VMSET and the CopyToReg. This is a resurrection of https://reviews.llvm.org/D148524, and is needed to remove the mask operand when it's extracted from a subvector as planned in https://github.com/llvm/llvm-project/pull/66267#discussion_r1331998919	2023-09-22 16:30:43 +01:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295 ) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Nick Desaulniers	86735a4353	reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66264 ) reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) This reverts commit ee643b706be2b6bef9980b25cc9cc988dab94bb5. Fix up build failures in targets I missed in #66003 Kept as 3 commits for reviewers to see better what's changed. Will squash when merging. - reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) - fix all the targets I missed in #66003 - fix off by one found by llvm/test/CodeGen/SystemZ/inline-asm-addr.ll	2023-09-13 13:31:24 -07:00
Reid Kleckner	ee643b706b	Revert "[InlineAsm] wrap ConstraintCode in enum class NFC (#66003 )" This reverts commit 2ca4d136124d151216aac77a0403dcb5c5835bcd. Also revert the followup, "[InlineAsm] fix botched merge conflict resolution" This reverts commit 8b9bf3a9f715ee5dce96eb1194441850c3663da1. There were SystemZ and Mips build errors, too many to fix forward.	2023-09-13 09:58:02 -07:00
Nick Desaulniers	2ca4d13612	[InlineAsm] wrap ConstraintCode in enum class NFC (#66003 ) Similar to commit 2fad6e69851e ("[InlineAsm] wrap Kind in enum class NFC") Fix the TODOs added in commit 93bd428742f9 ("[InlineAsm] refactor InlineAsm class NFC (#65649)")	2023-09-13 08:48:09 -07:00
Craig Topper	319aba645f	[RISCV] Teach MatInt to use (ADD_UW X, (SLLI X, 32)) to materialize some constants. If the high and low 32 bits are the same, we try to use (ADD X, (SLLI X, 32)) but that only works if bit 31 is clear since the low 32 bits will be sign extended. If we have Zba we can use add.uw to zero the sign extended bits. Reviewed By: reames, wangpc Differential Revision: https://reviews.llvm.org/D159253	2023-08-31 20:24:34 -07:00
Philip Reames	079c968eb9	[RISCV] Form vmv.s.f/x from single element splats via DAG combine This re-implements the special casing we had in lowerScalarSplat as a DAG combine. As can be seen in the tests, this ends up triggering in a bunch more cases. The semantically interesting bit of this change is the use of the implicit truncate semantics for when XLEN > SEW. We'd already been doing this for vmv.v.x, but this change extends e.g. the constant matching to make the same assumption about vmv.s.x. Per my reading of the specification, this should be fine, and if anything, is more obviously true of vmv.s.x than vmv.v.x. Differential Revision: https://reviews.llvm.org/D158874	2023-08-30 12:44:36 -07:00
Luke Lau	18c7bf0b85	[RISCV] Refactor selectVSplat. NFCI This patch shares the logic between the various splat ComplexPatterns to help the diff in some upcoming patches. It's worth noting that the uimm splat pattern now takes into account the implicit truncation + sign extend semantics of vmv_v_x_vl, but that doesn't seem to affect the result since it always took the sext value anyway. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158741	2023-08-29 15:42:23 +01:00
Luke Lau	007b41b393	[RISCV] Don't relax policy to ta when vmerge's VL shrinks during folding When folding a vmerge into its operands, if the resulting VL is smaller than what the vmerge had originally then what was previously in its body then gets moved to the tail. In that case, we can't relax the tail policy to agnostic when the merge operand is undefined, since we need to preserve these elements past the new VL. Fixes https://github.com/llvm/llvm-project/issues/64754 Reviewed By: craig.topper, reames Differential Revision: https://reviews.llvm.org/D158161	2023-08-22 10:39:22 +01:00
Philip Reames	a63bd7e99b	[RISCV] Use NoReg in place of IMPLICIT_DEF for undefined passthru operands In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295), I greatly increased the number of IMPLICIT_DEF operands to our vector instructions. This has turned out to have an unexpected negative impact because MachineCSE does not CSE IMPLICIT_DEFs, and thus does not CSE any instruction with an IMPLICIT_DEF operand. SelectionDAG does CSE the same case, but that only covers the same block case, not the cross block case. This led to the performance regression reported in https://github.com/llvm/llvm-project/issues/64282. This change is a slightly ugly hack to sidestep the issue. Instead of fixing the root cause (lack of CSE for IMPLICIT_DEF) or undoing the operand changes, we leave the extra operand in place, and use NoReg in place of IMPLICIT_DEF. I then convert back to IMPLICIT_DEF just before register allocation so that ProcessImplicitDefs and TwoAddressInstructions can do the normal transforms to Undef tied registers. We may end up backporting this into the 17.x release branch. Given how late in the release cycle this is landing, that's much less likely now, but still a possibility. Differential Revision: https://reviews.llvm.org/D156909	2023-08-14 12:57:38 -07:00
Craig Topper	a8c502a589	[RISCV] Add bf16 to isFPImmLegal. Part of this test file was stolen from D156895. We should merge them when committing. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D156926	2023-08-03 08:27:38 -07:00
Craig Topper	de7fa3ab9a	[RISCV] Copy memoperands in some of the post isel peepholes. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D156830	2023-08-02 09:16:14 -07:00
Jianjian GUAN	b7408ebbb7	[RISCV] Use x0 in vsetvli when avl is equal to vlmax. We could use x0 form in vsetvli when we already know the vlmax and avl is equal to it. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156404	2023-07-31 09:49:40 +08:00
Luke Lau	ce8f094da8	[RISCV] Add patterns for vnsrl.vx where shift amount is truncated Similar to D155698 where the shift amount is extended, this patch extends the ComplexPattern to handle the case where the shift amount has been truncated. Truncations are custom lowered to truncate_vector_vl, and in cases like i64 -> i16 they are truncated by one power of two at a time, so we need to unravel nested layers of them. The pattern can also be reused for Zvbb's vwsll.vx in an upcoming patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D155928	2023-07-26 20:26:32 +01:00
Luke Lau	33a83c5486	[RISCV] Add SDNode patterns for vrol.[vv,vx] and vror.[vv,vx,vi] These correspond to ROTL/ROTR nodes Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D155439	2023-07-21 10:22:46 +01:00
Luke Lau	24628a14c4	[RISCV] Add patterns for vnsr[a,l].wx where shift amount has different type than vector element We're currently only matching scalar shift amounts where the type is the same as the vector element type. But because only the bottom log2(2*SEW) bits are used, only 7 bits will be used at most so we can use any scalar type >= i8. This patch adds patterns for the case above, as well as for when the shift amount type is the same as the widened element type and doesn't need to be extended. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D155698	2023-07-21 10:13:28 +01:00
Simon Pilgrim	73f09814ee	Fix MSVC "'GetVMSetForLMul': not all control paths return a value" warning. NFC.	2023-07-19 18:55:37 +01:00
Luke Lau	efedcbeeb8	[RISCV] Fold ops into vmv.v.v as vmerge with all-ones mask A vmv.v.v shares the same encoding as a vmerge that isn't masked, so we can also fold it into its operands if we treat it as a vmerge with an all-ones mask. We take care here not to actually transform the existing vmv into a vmerge, otherwise things like True.hasOneUse() become inaccurate. Instead this just returns an equivalent list of operands. This is an alternative to D153351. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D155101	2023-07-19 17:24:42 +01:00
Luke Lau	0f277ab361	[RISCV] Fold vmerge into its ops with smaller VL if known Currently when folding vmerge into its operands, we stop if the VLs aren't identical. However since the body of (vmerge (vop)) is the intersection of vmerge and vop's bodies, we can use the smaller of the two VLs if we know it ahead of time. This patch relaxes the constraint on VL if they are both constants, or if either of them are VLMAX. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D155071	2023-07-19 17:24:40 +01:00
Philip Reames	8e024283bd	[RISCV] Minor style cleanups in post ISEL combines	2023-07-18 12:26:36 -07:00
Piyou Chen	7ce4e933ea	[RISCV] Implement prefetch locality by NTLH We add the MemOperand, then the backend will generate NTLH automatically. ``` __builtin_prefetch(ptr, 0 /* rw==read */, 0 /* locality */); => ntl.all + prefetch.r (ptr) __builtin_prefetch(ptr, 0 /* rw==read */, 1 /* locality */); => ntl.pall + prefetch.r (ptr) __builtin_prefetch(ptr, 0 /* rw==read */, 2 /* locality */); => ntl.p1 + prefetch.r (ptr) __builtin_prefetch(ptr, 0 /* rw==read */, 3 /* locality */); => prefetch.r (ptr) ``` Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D154691	2023-07-16 20:32:46 -07:00
Craig Topper	d09109aa1e	[RISCV] Use isScalarInteger instead of isInteger. NFC The type should only be scalar here and the isScalarInteger should be a simpler check.	2023-07-15 22:52:43 -07:00
Philip Reames	b8e29dbe54	[RISCV] Common remaining operand logic in performCombineVMergeAndVOps [nfc] We can share the code for both the unmasked and masked cases, and add a missing consistency assert in the process. This is a subset of Luke's D155063. I'm splitting pieces and landing them in the process of convincing myself all the individual transforms are in fact correct. This is the last major piece.	2023-07-13 11:27:16 -07:00
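
The `viota.m` example in commit a5c1ecada2 can be checked with a small model. The sketch below is not LLVM code: `viota` and `vmerge` are hypothetical Python helpers that assume a masked `viota.m` counts only active elements of its source and leaves inactive destination elements unchanged. Under those assumptions it reproduces the two results quoted in the commit message.

```python
# Toy model of the example in commit a5c1ecada2; helper names are hypothetical.

def viota(src_mask, active=None, passthru=None):
    """Model viota.m: each active element receives the number of set bits in
    src_mask at lower-indexed active positions; inactive elements keep passthru."""
    n = len(src_mask)
    active = [1] * n if active is None else active
    out = [0] * n if passthru is None else list(passthru)
    count = 0
    for i in range(n):
        if active[i]:
            out[i] = count
            count += src_mask[i]
    return out


def vmerge(false_vals, true_vals, mask):
    """Model vmerge.vvm: pick true_vals[i] where mask[i] is set, else false_vals[i]."""
    return [t if m else f for f, t, m in zip(false_vals, true_vals, mask)]


v4 = [1, 1, 1, 1]  # source mask of viota.m
v0 = [0, 0, 0, 1]  # mask of vmerge.vvm
v8 = [5, 5, 5, 5]  # false operand / passthru

unfused = vmerge(v8, viota(v4), v0)        # unmasked viota.m, then vmerge.vvm
fused = viota(v4, active=v0, passthru=v8)  # folded form: viota.m v8, v4, v0.t

print(unfused)  # [5, 5, 5, 3]
print(fused)    # [5, 5, 5, 0]
```

With an all-ones `v0` the two forms agree in this model, which matches the commit's note that the fold is still performed when the mask is a true mask.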

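The arithmetic behind commit 319aba645f can be sketched the same way. The helpers below are hypothetical and only model 64-bit register arithmetic. They assume X is the sign-extended low 32 bits of the constant, and that Zba's `add.uw` zero-extends the low 32 bits of its first source before adding.

```python
MASK64 = (1 << 64) - 1


def sext32(value):
    """Sign-extend the low 32 bits of value into a 64-bit register image."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    return value & MASK64


def via_add(lo32):
    """(ADD X, (SLLI X, 32)) where X holds the sign-extended low half."""
    x = sext32(lo32)
    return (x + ((x << 32) & MASK64)) & MASK64


def via_add_uw(lo32):
    """(ADD_UW X, (SLLI X, 32)): the low 32 bits of X are zero-extended first."""
    x = sext32(lo32)
    return ((x & 0xFFFFFFFF) + ((x << 32) & MASK64)) & MASK64


print(hex(via_add(0x12345678)))     # 0x1234567812345678 (bit 31 clear: ADD works)
print(hex(via_add(0x87654321)))     # 0x8765432087654321 (bit 31 set: high half is off by one)
print(hex(via_add_uw(0x87654321)))  # 0x8765432187654321
```

When bit 31 is set, the sign-extended upper bits of X act as -1 in the high half, which is why the plain ADD form is limited to constants with bit 31 clear.
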

401 Commits