llvm-project

Author	SHA1	Message	Date
Eric Biggers	09058654f6	[RISCV] Remove experimental from Vector Crypto extensions (#74213) The RISC-V vector crypto extensions have been ratified. This patch updates the Clang and LLVM support for these extensions to be non-experimental, while leaving the C intrinsics as experimental since the C intrinsics are not yet standardized. Co-authored-by: Brandon Wu <brandon.wu@sifive.com>	2023-12-18 22:04:22 -08:00
Philip Reames	8624075105	[RISCV] Strip W suffix from ADDIW (#68425) The motivation of this change is simply to reduce test duplication. As can be seen in the (massive) test delta, we have many tests whose output differs only due to the use of addi on rv32 vs addiw on rv64 when the high bits are don't care. As an aside, we don't need to worry about the non-zero immediate restriction on the compressed variants because we're not directly forming the compressed variants. If we happen to get a zero immediate for the ADDI, then either a later optimization will strip the useless instruction or the encoder is responsible for not compressing the instruction.	2023-10-06 10:28:01 -07:00
Jay Foad	e0919b189b	[CodeGen] Renumber slot indexes before register allocation (#66334) RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps. This also means that enabling -early-live-intervals, which runs the SlotIndexes analysis earlier, will not cause large amounts of churn due to different register allocator decisions.	2023-09-19 11:18:12 +01:00
Craig Topper	398c855457	[RISCV] Improve splatPartsI64WithVL for vlmax scalable vector constants where Hi and Lo are the same. We can use a 32-bit splat and bitcast to i64 vector. This only handles the case where we are using vlmax so that the new vl is cheap to compute. This could be generalized to double the VL. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D158879	2023-08-25 14:15:41 -07:00
Philip Reames	a63bd7e99b	[RISCV] Use NoReg in place of IMPLICIT_DEF for undefined passthru operands In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295), I greatly increased the number of IMPLICIT_DEF operands to our vector instructions. This has turned out to have an unexpected negative impact because MachineCSE does not CSE IMPLICIT_DEFs, and thus does not CSE any instruction with an IMPLICIT_DEF operand. SelectionDAG does CSE the same case, but that only covers the same block case, not the cross block case. This led to the performance regression reported in https://github.com/llvm/llvm-project/issues/64282. This change is a slightly ugly hack to sidestep the issue. Instead of fixing the root cause (lack of CSE for IMPLICIT_DEF) or undoing the operand changes, we leave the extra operand in place, and use NoReg in place of IMPLICIT_DEF. I then convert back to IMPLICIT_DEF just before register allocation so that ProcessImplicitDefs and TwoAddressInstructions can do the normal transforms to Undef tied registers. We may end up backporting this into the 17.x release branch. Given how late in the release cycle this is landing, that's much less likely now, but still a possibility. Differential Revision: https://reviews.llvm.org/D156909	2023-08-14 12:57:38 -07:00
Simon Pilgrim	ae60706da0	[DAG] SimplifyDemandedBits - call ComputeKnownBits for constant non-uniform ISD::SRL shift amounts We only attempted to determine KnownBits for uniform constant shift amounts, but ComputeKnownBits is able to handle some non-uniform cases as well that we can use as a fallback.	2023-07-21 14:52:57 +01:00
Luke Lau	55e2772e9f	[RISCV] Add initial SDNode patterns for unary zvbb instructions This patch adds pseudos and SDNode patterns for vbrev.v, vrev8.v, vclz.v, vctz.v and vcpop.v. I've only added them for integer element types so far since we're lacking tests for floats. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D155216	2023-07-13 19:39:04 +01:00
Philip Reames	403261eafd	[RISCV] Remove legacy TA/TU pseudo distinction for load instructions This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. This change targets all the pseudos used in loads (unit, strided, segmented, fault first, and their combinations). As with previous changes in the series, we replace the existing TA and TU forms with a single unified pseudo with a passthru (which may be implicit_def) and a policy operand. One quirk is that I went ahead and treated the unmasked mask load instruction (vlm) the same way. We need the pass thru operand to model tail undefined, but since the instruction is unconditionally agnostic and the instruction has no mask, the policy operand is arguably unneeded. I kept it mostly for consistency's sake. Another quirk worth highlighting is that segment loads require a bit of dedicated handling. Surprisingly, we don't have IMPLICIT_DEF nodes of the right types, and attempting to use them results in some odd looking codegen and a few crashes. Instead, I left the REG_SEQUENCE form, and extended InsertVSETVLI to recognize the complex undefs. Arguably, we should probably revisit the handling of undef reg_sequence nodes here, but I'm hoping to sidestep that in this patch. As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions. I did have to delete one register allocation regression test as I couldn't figure out how to meaningfully update it. I spent a significant amount of time trying, and finally gave up. Differential Revision: https://reviews.llvm.org/D154141	2023-07-05 13:11:58 -07:00
Yunze Zhu	9d22b54d6b	[RISCV] Use temporary stack in expanding SPLAT_VECTOR_SPLIT_I64_VL node See the issue at https://github.com/llvm/llvm-project/issues/63515. The problem arises because, when expanding a SPLAT_VECTOR_SPLIT_I64_VL node, only the memoperand is used to create the dependency. However, in ScheduleDAGNodes, dependencies are checked using the chain only, which breaks the order of the store/load instructions. I think that in the llvm.bitreverse.nxv2i64 intrinsic the SPLAT_VECTOR_SPLIT_I64_VL nodes are processed in parallel, so no chain should be added to these nodes. Using a temporary stack slot when expanding a SPLAT_VECTOR_SPLIT_I64_VL node ensures that the vlse instruction gets the correct value even if the order of the store instructions changes. Differential Revision: https://reviews.llvm.org/D153743	2023-06-29 16:45:16 +08:00
Philip Reames	95697deff3	[RISCV] Make all vector binops use the _TU pseudo form This continues towards the goal spelled out in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. This patch switches all the binary operations (no widen, no narrow, but both int and FP) to use the _TU + implicit_def passthrough form. Change is mechanical. This only changes the unmasked variants. Masked variants will still go through doPeepholeMaskedRVV and end up in the unsuffixed/TA form. Fixing that will be a separate change. Differential Revision: https://reviews.llvm.org/D152940	2023-06-16 16:28:19 -07:00
Florian Mayer	38f7c7eb1a	Revert "Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C)."" Revert broke even more stuff. This reverts commit d5fbec30939f2c9f82475cf42c638619514b5c67.	2023-06-06 17:39:05 -07:00
Florian Mayer	d5fbec3093	Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C)." Triggers UBSan error. This reverts commit 58b2d652af49ee9d9ff2af6edd7f67f23b26bfee.	2023-06-06 17:30:07 -07:00
Craig Topper	58b2d652af	[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C). Where C is a simm32. This costs an extra temporary register, but avoids a constant pool. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D152236	2023-06-06 11:59:12 -07:00
Craig Topper	f8c681227f	[RISCV] Enable the Machine Late Cleanup pass. Believe the bug has been fixed with D139169 Reviewed By: asb Differential Revision: https://reviews.llvm.org/D139753	2022-12-11 20:55:05 -08:00
Craig Topper	f1fd5c9b36	[RISCV] Remove pseudos for whole register load, store, and move. The MC layer instructions have the correct register classes, and the pseudos don't have any additional operands. So there doesn't seem to be any reason for them to exist. The pseudos were incorrectly going through code in RISCVMCInstLower that converted LMUL>1 register classes to LMUL1 register class. This makes the MCInst technically malformed, and prevented the vl2r.v, vl4r.v, and vl8r.v InstAliases from matching. This accounts for all of the .ll test diffs. Differential Revision: https://reviews.llvm.org/D139511	2022-12-07 10:19:58 -08:00
Sergey Kachkov	132dc442ba	[RISCV] Generate .cfi_def_cfa_expression for RVV stack adjustment The canonical frame address after RVV stack adjustment is sp + StackSize + RVVStackSize * vlenb. Since vlenb is unknown at compile time (although it is constant for a particular HW implementation), emit .cfi_def_cfa_expression so that libunwind can read the VLENB CSR at run time and obtain the correct frame address. Fixes https://github.com/llvm/llvm-project/issues/58356 (but additional run-time support for reading the CSR may be required) Differential Revision: https://reviews.llvm.org/D136263	2022-12-06 12:45:59 +03:00
Jonas Paulsson	8ef4632681	Revert "[CodeGen] Add new pass for late cleanup of redundant definitions." Temporarily revert and fix buildbot failure. This reverts commit 6d12599fd4134c1da63198c74a25490d28c733f6.	2022-12-01 13:29:24 -05:00
Jonas Paulsson	6d12599fd4	[CodeGen] Add new pass for late cleanup of redundant definitions. A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameInde(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-01 13:21:35 -05:00
Craig Topper	f387918dd8	[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion. We can reuse constants if we use SRL followed by AND and AND followed by SHL. Similar was done to bitreverse previously. Differential Revision: https://reviews.llvm.org/D138045	2022-11-15 14:36:01 -08:00
Philip Reames	d89d45ca9a	[RISCV][InsertVSETVLI] Default to MA not MU This changes the default value used for mask policy from mask undisturbed to mask agnostic. In hardware, there may be a minor preference for ta/ma, but since this is only going to apply to instructions which don't use the mask policy bit, this is functionally mostly a nop. The main value is to make future changes to using MA when legal for masked instructions easier to review by reducing test churn. The prior code was motivated by a desire to minimize state transitions between masked and unmasked code. This patch achieves the same effect using the demanded field logic (landed in afb45ff), and there are no regressions I spotted in the test diffs. (Given the size, I have only been able to skim.) I do want to call out that regressions are possible here; the demanded analysis only works on a block local scope right now, so e.g. a tight loop mixing masked and unmasked computation might see an extra vsetvli or two. Differential Revision: https://reviews.llvm.org/D133803	2022-10-06 07:59:39 -07:00
luxufan	c06d0b4d02	[RISCV] Add ADDI instr for computing FrameIndex address RVV doesn't have an immediate field for memory addressing. Currently we build MachineInstructions in PEI to compute the stack offset for RVV load/store instructions. These instructions are added too late to be optimized by passes such as CSE and LICM. This patch prevents FrameIndex SDNodes from being matched in the RVV load/store instruction selection patterns, so that the FrameIndex SDNodes are selected as `ADDI GPR, targetframeindex`. This change has two advantages: 1. Stack object address computation can be optimized by machine function passes. 2. Since the ADDI instruction's destination register can be used as a temporary register, we can save an emergency spill slot. Differential Revision: https://reviews.llvm.org/D128187	2022-07-04 22:13:35 +08:00
eopXD	3cf15af2da	[RISCV] Remove experimental prefix from rvv-related extensions. Extensions affected: +v, +zve, +zvl Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117860	2022-01-22 20:18:40 -08:00
wangpc	41454ab256	[RISCV] Use constant pool for large integers For large integers (for example, magic numbers generated by TargetLowering::BuildSDIV when dividing by a constant), we may need about 4 to 8 instructions to build them. By contrast, it takes just two instructions to load a constant (with extra cycles to access memory), so it may be profitable to put these integers into the constant pool. Reviewed By: asb, craig.topper Differential Revision: https://reviews.llvm.org/D114950	2021-12-31 14:48:48 +08:00
wangpc	af0ecfccae	[RISCV] Generate pseudo instruction li Add an alias of `addi [x], zero, imm` to generate the pseudo instruction li, which makes assembly much more readable. Existing tests can be updated by running the script `llvm/utils/update_llc_test_checks.py`. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D112692	2021-11-22 14:01:37 +08:00
Craig Topper	ada5458521	[RISCV] Expand scalable vector bswap. Fix crash for bitreverse. Fix LegalizeVectorOps to not try shuffle or unrolling expansions for scalable vectors. Differential Revision: https://reviews.llvm.org/D112236	2021-10-31 10:01:27 -07:00
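
The constant reuse described in commit f387918dd8 (BSWAP expansion) can be sketched for the 32-bit case as follows. This is an illustrative Python model, not the TargetLowering code: the function name bswap32 and the explicit 32-bit truncation are assumptions made for the example. Pairing SRL-then-AND with AND-then-SHL lets both middle bytes share the single mask 0xFF00, where masking both bytes in place would need 0xFF00 and 0xFF0000.

```python
def bswap32(x: int) -> int:
    """Byte-swap a 32-bit value using a single AND mask constant (hypothetical helper)."""
    x &= 0xFFFFFFFF              # model a 32-bit register
    mask = 0xFF00                # the only mask constant needed

    b0 = (x << 24) & 0xFFFFFFFF  # lowest byte moves to the top; truncate to 32 bits
    b1 = (x & mask) << 8         # AND followed by SHL: byte 1 moves to byte 2
    b2 = (x >> 8) & mask         # SRL followed by AND: byte 2 moves to byte 1
    b3 = x >> 24                 # top byte moves to the bottom

    return b0 | b1 | b2 | b3


if __name__ == "__main__":
    print(hex(bswap32(0x11223344)))
```

On a target such as RISC-V, where each distinct wide constant may cost extra instructions to materialize, sharing one mask is plausibly where the saving comes from; the commit message itself only states that the constants can be reused.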

25 Commits
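
Commit 58b2d652af describes constants that can be created with (ADD (SLLI C, 32), C), where C is a simm32. The check this implies can be sketched in Python as below. This is a minimal model assuming 64-bit wraparound arithmetic; the names sign_extend32 and split_add_slli are hypothetical, and the sketch does not model how selectImm decides whether this sequence is cheaper than the alternatives. Because SLLI by 32 clears the low 32 bits, C must equal the sign-extended low half of the constant, so there is only one candidate to test.

```python
MASK64 = (1 << 64) - 1


def sign_extend32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def split_add_slli(value: int):
    """Return C if value == ((C << 32) + C) mod 2**64 for a simm32 C, else None."""
    value &= MASK64
    c = sign_extend32(value)          # the only possible candidate for C
    if ((c << 32) + c) & MASK64 == value:
        return c
    return None


if __name__ == "__main__":
    print(split_add_slli(0x1234567812345678))  # matches with C = 0x12345678
    print(split_add_slli(0x0000000100000002))  # no match
```

Python integers do not overflow, so the masking with MASK64 stands in for register wraparound. In C or C++ the same computation would need unsigned arithmetic, since left-shifting a negative signed value or overflowing a signed addition is the kind of operation UBSan reports.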