The `RISCVIndirectBranchTracking` pass inserts the `lpad` instruction and
can change basic block alignment, so it should not run after branch
relaxation, as the adjusted offsets may exceed the branch range.
This is meant as preparation for PR #130988 "[AMDGPU] Implement IR
expansion for frem instruction", which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given that change, and quite reasonable even without it.
This is a follow-up to #125026 that keeps mask operands in virtual
register form for as long as possible throughout the backend.
The diffs in this patch are from MachineCSE/MachineSink/RISCVVLOptimizer
kicking in.
The invariant that the mask COPY never has a subreg no longer holds
after MachineCSE (it coalesces some copies), so it needed to be relaxed.
This is another attempt at #88496 to keep mask operands in SSA after
instruction selection.
Previously we selected the mask operands into vmv0, a singleton register
class with exactly one register, V0.
But the register allocator doesn't really support singleton register
classes and we ran into errors like "ran out of registers during
register allocation in function".
This patch avoids that by introducing a pass just before register
allocation that converts any use of vmv0 into a copy to $v0, i.e. what
isel does today (sketched after the list below).
That way the register allocator doesn't need to deal with the singleton
register class, but we get the benefits of having the mask registers in
SSA throughout the backend:
- This allows RISCVVLOptimizer to reduce the VLs of instructions that
define mask registers
- It enables CSE and code sinking in more places
- It removes the need to peek through mask copies in RISCVISelDAGToDAG
and keep track of V0 defs in RISCVVectorPeephole
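For illustration, a rough MIR-level sketch of the rewrite the new pass
performs (the pseudo names are real masked pseudos, but the operand
lists are abbreviated and illustrative):

    # Before: the mask stays in a vmv0-constrained virtual register, in SSA.
    %mask:vmv0 = PseudoVMSEQ_VV_M1 ...
    %res:vrnov0 = PseudoVADD_VV_M1_MASK %passthru, %x, %y, %mask, ...
    # After the elimination pass: each vmv0 use becomes a copy through $v0,
    # which is what isel emits today.
    %mask:vr = PseudoVMSEQ_VV_M1 ...
    $v0 = COPY %mask
    %res:vrnov0 = PseudoVADD_VV_M1_MASK %passthru, %x, %y, $v0, ...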
This patch initially eliminates uses of vmv0 after RISCVVectorPeephole
to keep the diff to a minimum, and a follow-up patch will move it past
the other MachineInstr SSA passes.
Note that it doesn't try to remove any defs of vmv0, as no instructions
should produce vmv0 outputs.
As a further follow up, we can move the elimination pass to after phi
elimination and outside of SSA, which would unblock the pre-RA scheduler
around masked pseudos. This might also help the issue that
RISCVVectorMaskDAGMutation tries to solve.
A recent atomics ABI change/fix requires that, for the "A6C" and "A6S"
atomics ABIs (i.e. both of those currently supported by LLVM), an
additional fence is inserted for an atomic_compare_exchange with seq_cst
failure ordering.
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/445>
This isn't trivial to support through the hooks used by AtomicExpandPass,
because that pass assumes that when fences are inserted, the original
atomic ordering information can be removed from the instruction. Rather
than try to change and complicate that API, this patch implements the
needed fence insertion through a small special-purpose pass.
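For reference, a minimal IR example of the case the new pass handles
(names are illustrative):

    define i1 @cas(ptr %p, i32 %expected, i32 %desired) {
      %pair = cmpxchg ptr %p, i32 %expected, i32 %desired seq_cst seq_cst
      %ok = extractvalue { i32, i1 } %pair, 1
      ret i1 %ok
    }

Here the second (failure) ordering is seq_cst, which is what triggers the
additional fence under the updated ABI.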
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for `this` pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend-lifetimes flag is intended to
eventually be enabled by `-Og`, as discussed in the RFC
here:
https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850
This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or `this`) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).
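A minimal IR sketch of the intrinsic in use (the declaration shape shown
here is illustrative; the authoritative form is in the patch itself):

    declare void @llvm.fake.use(...)

    define i32 @f(i32 %x) {
      %y = add i32 %x, 1
      ; %x has no further real uses, but the fake use keeps it live to here
      call void (...) @llvm.fake.use(i32 %x)
      ret i32 %y
    }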
Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
This patch implements simple landing pad labels ([pr]). When Zicfilp is
enabled, this patch inserts `lpad 0` at the beginning of basic blocks
that may be reached by indirect jumps.
This patch also supports the option riscv-landing-pad-label to let users
set a nonzero fixed label. Using a nonzero fixed label forces setting t2
before indirect jumps. It's less portable but stricter than the original
implementation.
[pr]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/417
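A small assembly sketch of the two schemes (the label value 7 is
illustrative):

    # Default (label 0): any indirect jump may land here.
    func:
        lpad    0
        ...
    # With -riscv-landing-pad-label=7, the caller must load the expected
    # label into t2 before the indirect call, and the landing pad checks it:
        lui     t2, 7
        jalr    ra, 0(a0)
    ...
    callee:
        lpad    7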
Currently, the LowerConstantIntrinsics pass does an RPO traversal of
every function... only to find that many functions don't have constant
intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is
already a pre-isel intrinsic lowering pass, which iterates over
intrinsic declarations and lowers all users. Call
lowerConstantIntrinsics from this pass to avoid the extra iteration over
the entire IR and the RPO traversal.
(Reapplying with corrected commit message)
We recently moved RISCVInsertVSETVLI from before vector register allocation
to after vector register allocation. When doing so, we added an unconditional
dependency on LiveIntervals - even at O0, where LiveIntervals hadn't previously
run. As reported in #93587, this was apparently not safe to do.
This change makes LiveIntervals optional, and adjusts all the update code to
only run when LiveIntervals is present. The only real tricky part of this
change is the abstract state tracking in the dataflow. We need to represent
a "register w/unknown definition" state - but only when we don't have
LiveIntervals.
This adjusts the abstract state definition so that the AVLIsReg state can
represent either a register + valno, or a register + unknown definition.
With LiveIntervals, we have an exact definition for each AVL use. Without
LiveIntervals, we treat the definition of a register AVL as being unknown.
The key semantic change is that we now have a state in the lattice for which
something is known about the AVL value, but for which two identical lattice
elements do *not* necessarily represent the same AVL value at runtime.
Previously, the only case which could result in such an unknown AVL was the
fully unknown state (where VTYPE is also fully unknown). This requires a
small adjustment to hasSameAVL and lattice state equality to draw this
important distinction.
The net effect of this patch is that we remove the LiveIntervals dependency
at O0, and O0 code quality will regress for cases involving register AVL values.
In practice, this means we pessimize code written with intrinsics at O0.
This patch is an alternative to #93796 and #94340. It is very directly
inspired by review conversation around them, and thus should be considered
coauthored by Luke.
Stacked on https://github.com/llvm/llvm-project/pull/94658.
We recently moved RISCVInsertVSETVLI from before vector register
allocation to after vector register allocation. When doing so, we added
an unconditional dependency on LiveIntervals - even at O0 where
LiveIntervals hadn't previously run. As reported in #93587, this was
apparently not safe to do.
This change makes LiveIntervals optional, and adjusts all the update
code to only run when LiveIntervals is present. The only real tricky
part of this change is the abstract state tracking in the dataflow. We
need to represent a "register w/unknown definition" state - but only
when we don't have LiveIntervals.
This adjusts the abstract state definition so that the AVLIsReg state can
represent either a register + valno, or a register + unknown definition.
With LiveIntervals, we have an exact definition for each AVL use.
Without LiveIntervals, we treat the definition of a register AVL as
being unknown.
The key semantic change is that we now have a state in the lattice for
which something is known about the AVL value, but for which two
identical lattice elements do *not* necessarily represent the same AVL
value at runtime. Previously, the only case which could result in such
an unknown AVL was the fully unknown state (where VTYPE is also fully
unknown). This requires a small adjustment to hasSameAVL and lattice
state equality to draw this important distinction.
The net effect of this patch is that we remove the LiveIntervals
dependency at O0, and O0 code quality will regress for cases involving
register AVL values.
This patch is an alternative to
https://github.com/llvm/llvm-project/pull/93796 and
https://github.com/llvm/llvm-project/pull/94340. It is very directly
inspired by review conversation around them, and thus should be
considered coauthored by Luke.
We no longer need to separate the passes now that #70549 is landed and
this will unblock #89089.
It's not strictly NFC because it will move coalescing before register
allocation when -riscv-vsetvl-after-rvv-regalloc is disabled. But this
makes it closer to the original behaviour.
This reverts commit 8cc8e5d6c6ac9bfc888f3449f7e424678deae8c2.
This reverts commit dae55c89835347a353619f506ee5c8f8a2c136a7.
Causes major compile-time regressions for unoptimized builds.
Prior to this patch, when using -fthinlto-index=, the ObjCARCContractPass isn't run prior to CodeGen, and instruction selection fails on IR containing ARC intrinsics. This patch is motivated by that use case.
The pass was previously added in the various places codegen is performed. This patch adds the pass to the default codegen pipeline, makes sure it bails immediately if no ARC intrinsics are found, and removes the ad hoc scheduling of the pass.
Co-authored-by: Nuri Amari <nuriamari@fb.com>
This patch tries to get rid of the vsetvli implicit vl/vtype def-use chain
and improve register allocation quality by moving the vsetvli insertion
pass after RVV register allocation.
This gains the following benefits:
1. Unblocks the scheduler's constraints by removing the vl/vtype def-use chain
2. Supports RVV re-materialization
3. Supports partial spill
This patch adds a new option `-riscv-vsetvl-after-rvv-regalloc=<1|0>` to
control this feature; it defaults to disabled.
Split off from #70549, this patch moves RISCVInsertVSETVLI to after phi
elimination where we exit SSA and need to move to LiveVariables.
The motivation for splitting this off is to avoid the large scheduling
diffs from moving completely to after regalloc, and instead focus on
converting the pass to work on LiveIntervals.
The two main changes required are updating VSETVLIInfo to store VNInfos
instead of MachineInstrs, which allows us to still check for PHI defs in
needVSETVLIPHI, and fixing up the live intervals of any AVL operands
after inserting new instructions.
On O3 the pass is inserted after the register coalescer, otherwise we
end up with a bunch of COPYs around eliminated PHIs that trip up
needVSETVLIPHI.
Co-authored-by: Piyou Chen <piyou.chen@sifive.com>
This further splits off #91440 to inch RISCVInsertVSETVLI closer to post
vector regalloc.
As noted in #91440, most of the diffs are from moving vsetvli insertion
after the vxrm/csr insertion passes, but these are getting conflated
with the changes from moving to LiveIntervals.
One idea was that we could try and remove some of these diffs by
manually moving back the vsetvlis past the vxrm/csr instructions. But
this meant having to touch up the LiveIntervals again which seemed to
lead to even more diffs.
This instead just moves RISCVInsertVSETVLI after RISCVInsertReadWriteCSR
and RISCVInsertWriteVXRM so we can isolate those changes.
The original commit was calling shrinkToUses on an interval for a virtual
register whose def was erased. This fixes it by calling shrinkToUses first
and removing the interval if we erase the old VL def.
This patch splits off part of the work to move vsetvli insertion to post
regalloc in #70549.
doLocalPostpass operates outside of RISCVInsertVSETVLI's dataflow,
so we can move it to its own pass. We can then move it to post vector
regalloc which should be a smaller change.
A couple of things that are different from #70549:
- This manually fixes up the LiveIntervals rather than recomputing it
via createAndComputeVirtRegInterval. I'm not sure if there's much of a
difference with either.
- For the postpass it's sufficient to just check isUndef() in
hasUndefinedMergeOp, i.e. we don't need to look up the def in VNInfo.
Running on llvm-test-suite and SPEC CPU 2017 there aren't any changes in
the number of vsetvlis removed. There are some minor scheduling diffs, as
well as extra spills in some cases and fewer in others (caused by
transient vsetvlis existing between RISCVInsertVSETVLI and
RISCVCoalesceVSETVLI when vector regalloc happens), but they are minor
and should go away once we finish moving the rest of RISCVInsertVSETVLI.
We could also potentially turn off this pass for unoptimised builds.
When using Greedy Register Allocation, there are times when
early-clobber values are ignored and assigned the same register. This is
illegal behaviour for these instructions. To get around this, using
pseudo instructions for early-clobber registers gives them a definition
and allows Greedy to assign them to a different register. This then
conforms to the ARM Architecture Reference Manual and matches the
defined behaviour.
This patch takes the existing RISC-V patch and makes it target
independent, then adds support for the ARM architecture. Doing this will
ensure early-clobber constraints are followed when targeting ARM. Making
the pass target independent also opens up the possibility of adding
support for other architectures in the future.
This patch makes riscv-split-regalloc true by default.
It will not affect the codegen result if vector register allocation
doesn't occur. If vector register allocation does occur, it may affect
the segments/weights of the non-RVV registers' LiveIntervals, which can
make the allocation happen in a different order.
This adds a new pass to insert VXRM writes for vector instructions, with
the goal of avoiding redundant writes.
The pass runs two dataflow analyses. The first is a forward dataflow to
calculate where a VXRM value is available. The second is a backwards
dataflow to determine where a VXRM value is anticipated.
Finally, we use the results of these two dataflows to insert VXRM writes
where a value is anticipated, but not available.
The pass does not split critical edges, so we aren't always able to
eliminate all redundancy.
The pass will only insert vxrm writes on paths that always require it.
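As a small example of the redundancy being targeted (the rounding-mode
value 0 is illustrative):

    # Conservative placement writes vxrm in front of every fixed-point
    # instruction:
        csrwi    vxrm, 0
        vaadd.vv v8, v8, v9
        csrwi    vxrm, 0
        vaadd.vv v10, v10, v11
    # With this pass, the second write is dropped because vxrm=0 is already
    # available on every path reaching the second vaadd:
        csrwi    vxrm, 0
        vaadd.vv v8, v8, v9
        vaadd.vv v10, v10, v11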
Rematerialization during register allocation is currently limited to a
single instruction with no inputs.
This patch introduces a pseudoinstruction that represents the
materialization of a constant. I've started with a sequence of 2
instructions for now, which covers at least the common LUI+ADDI(W) case.
This instruction will be expanded into real instructions immediately
after register allocation using a new pass. This gives the post-RA
scheduler a chance to separate the 2 instructions to improve ILP.
I believe this matches the approach used by AArch64.
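For example, the common two-instruction case looks like this once the
pseudo is expanded after regalloc (the constant is chosen for
illustration):

    # materialize 0x12345678
    lui   a0, 0x12345        # a0 = 0x12345000
    addiw a0, a0, 0x678      # a0 = 0x12345678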
Unfortunately, this loses some CSE opportunities when an LUI value is used
by multiple constants with different LSBs.
This feature is off by default and a new backend command line option is
added to enable it for testing.
This avoids the spill and reloads reported in #69586.
In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295), I greatly increased the number of IMPLICIT_DEF operands to our vector instructions. This has turned out to have an unexpected negative impact because MachineCSE does not CSE IMPLICIT_DEFs, and thus does not CSE any instruction with an IMPLICIT_DEF operand. SelectionDAG *does* CSE the same case, but that only covers the same block case, not the cross block case. This led to the performance regression reported in https://github.com/llvm/llvm-project/issues/64282.
This change is a slightly ugly hack to side step the issue. Instead of fixing the root cause (lack of CSE for IMPLICIT_DEF) or undoing the operand changes, we leave the extra operand in place, and use NoReg in place of IMPLICIT_DEF. I then convert back to IMPLICIT_DEF just before register allocation so that ProcessImplicitDefs and TwoAddressInstructions can do the normal transforms to Undef tied registers.
We may end up backporting this into the 17.x release branch. Given how late in the release cycle this is landing, that's much less likely now, but still a possibility.
Differential Revision: https://reviews.llvm.org/D156909
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits
"kcfi" operand bundles to indirect call instructions. Similarly to
the target-specific lowering added in D119296, implement KCFI operand
bundle lowering for RISC-V.
This patch disables the generic KCFI pass for RISC-V in Clang, and
adds the KCFI machine function pass in `RISCVPassConfig::addPreSched`
to emit target-specific `KCFI_CHECK` pseudo instructions before calls
that have KCFI operand bundles. The machine function pass also bundles
the instructions to ensure we emit the checks immediately before the
calls, which is not possible with the generic pass.
`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a
contiguous code sequence that traps if the expected hash in the
operand bundle doesn't match the hash before the target function
address. This patch emits an `ebreak` instruction for error handling
to match the Linux kernel's `BUG()` implementation. Just like for X86,
we also emit trap locations to a `.kcfi_traps` section to support
error handling, as we cannot embed additional information into the trap
instruction itself.
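Roughly, the emitted check has the following shape (registers, offset,
and the hash value are illustrative, not the exact lowering):

    lw    t1, -4(a0)          # type hash stored in front of the callee
    lui   t2, 0x12345         # expected hash from the operand bundle,
    addiw t2, t2, 0x678       # materialized into a scratch register
    beq   t1, t2, .Lcheck_ok  # hashes match: proceed with the call
    ebreak                    # mismatch: trap (recorded in .kcfi_traps)
    .Lcheck_ok:
    jalr  ra, 0(a0)           # the original indirect call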
Relands commit 62fa708ceb027713b386c7e0efda994f8bdc27e2 with fixed
tests.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D148385
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits
"kcfi" operand bundles to indirect call instructions. Similarly to
the target-specific lowering added in D119296, implement KCFI operand
bundle lowering for RISC-V.
This patch disables the generic KCFI pass for RISC-V in Clang, and
adds the KCFI machine function pass in `RISCVPassConfig::addPreSched`
to emit target-specific `KCFI_CHECK` pseudo instructions before calls
that have KCFI operand bundles. The machine function pass also bundles
the instructions to ensure we emit the checks immediately before the
calls, which is not possible with the generic pass.
`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a
contiguous code sequence that traps if the expected hash in the
operand bundle doesn't match the hash before the target function
address. This patch emits an `ebreak` instruction for error handling
to match the Linux kernel's `BUG()` implementation. Just like for X86,
we also emit trap locations to a `.kcfi_traps` section to support
error handling, as we cannot embed additional information into the trap
instruction itself.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D148385
Depends on D151395.
This is the 2nd patch of the patch-set. For the cover letter of the
patch-set, please check out D151395. This patch originates from
D121376.
This commit models vxrm by adding an immediate operand to the intrinsics
and machine instructions of the RVV fixed-point instructions `vaadd`,
`vaaddu`, `vasub`, and `vasubu`. This commit only covers intrinsics of
the four instructions; subsequent patches of the patch-set will do the
same to other RVV fixed-point instructions.
The current naive approach is to have a write to vxrm inserted before
every fixed-point instruction. This is done by the newly added pass
`RISCVInsertReadWriteCSR`. The pass is named in more general terms
because we will also model the rounding mode for the RVV floating-point
instructions. The approach will be improved in the future by applying
partial redundancy elimination algorithms to it.
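For illustration, the naive insertion produces a vxrm write in front of
each fixed-point instruction, matching the rounding-mode immediate
carried by its intrinsic (the mode values below are illustrative):

    csrwi    vxrm, 0          # rnu requested by the following vaadd
    vaadd.vv v8, v9, v10
    csrwi    vxrm, 2          # rdn requested by the following vasub
    vasub.vv v11, v12, v13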
The original LLVM intrinsics and machine instructions that do not model
the rounding mode (take `vaadd` as an example) are not removed in this
patch. That is, `int.riscv.vaadd.*` co-exists with
`int.riscv.vaadd.rm.*` after this patch. The next patch will add C
intrinsics of vaadd with an additional operand that models the control
of the rounding mode; in that patch, `int.riscv.vaadd.rm.*` will
replace `int.riscv.vaadd.*`.
Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Co-Authored-by: eop Chen <eop.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D151396
Rather than having a separate pass to add the hint instructions,
emit them directly into the streamer during asm printing.
Reviewed By: BeMg, kito-cheng
Differential Revision: https://reviews.llvm.org/D149511
Follow-up for 4ece50737d5385fb80cfa23f5297d1111f8eed39 (D142027).
Assignment Tracking Analysis now always runs and is skipped internally if
assignment tracking is disabled. Update these tests to expect to see the
pass run.
Buildbot failure: https://lab.llvm.org/buildbot/#/builders/57/builds/24094
Issue #58168 describes the difficulty diagnosing stack size issues
identified by -Wframe-larger-than. For simple code, it's easy to
understand the stack layout and where space is being allocated, but in
more complex programs, where code may be heavily inlined, unrolled, and
have duplicated code paths, it is no longer easy to manually inspect the
source program and understand where stack space can be attributed.
This patch implements a machine function pass that emits remarks with a
textual representation of stack slots, and also outputs any available
debug information to map source variables to those slots.
The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout`
to the compiler invocation. Like other remarks, the diagnostic
information can be saved to a file in a machine-readable format by
adding -fsave-optimization-record.
Fixes: #58168
Reviewed By: nickdesaulniers, thegameg
Differential Revision: https://reviews.llvm.org/D135488
Issue #58168 describes the difficulty diagnosing stack size issues
identified by -Wframe-larger-than. For simple code, it's easy to
understand the stack layout and where space is being allocated, but in
more complex programs, where code may be heavily inlined, unrolled, and
have duplicated code paths, it is no longer easy to manually inspect the
source program and understand where stack space can be attributed.
This patch implements a machine function pass that emits remarks with a
textual representation of stack slots, and also outputs any available
debug information to map source variables to those slots.
The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout`
to the compiler invocation. Like other remarks, the diagnostic
information can be saved to a file in a machine-readable format by
adding -fsave-optimization-record.
Fixes: #58168
Reviewed By: nickdesaulniers, thegameg
Differential Revision: https://reviews.llvm.org/D135488
Currently per-function metadata consists of:
(start-pc, size, features)
This adds a new UAR feature and, if it's set, an additional element:
(start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
As stated in
https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528,
this implementation is very similar to ExpandLargeDivRem, which expands
`fptoui .. to`, `fptosi .. to`, `uitofp .. to`, and `sitofp .. to`
instructions with a bitwidth above a threshold into auto-generated
functions. This is useful for targets like x86_64 that cannot lower fp
conversions with more than 128 bits. The expanded code is modeled on the
IR generated by `compiler-rt/lib/builtins/floattidf.c`,
`compiler-rt/lib/builtins/fixdfti.c`, etc.
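A minimal example of an instruction this pass expands (the bit width is
chosen for illustration):

    define i256 @wide_fptoui(double %x) {
      %r = fptoui double %x to i256
      ret i256 %r
    }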
Corner cases:
1. For fp16: as there are no related builtins in compiler-rt, the
implementation mainly uses the fp32 <-> fp16 libcalls.
2. For fp80: this pass is soft-fp emulation and no fp80 instructions can
help here, so I recommend users avoid this usage. For now, the
implementation uses fp128 as the temporary conversion type and inserts
fptrunc/ext at the top/end of the function.
3. For bf16: as the clang FE currently doesn't support bf16 arithmetic
operations (convert to int, float, +, -, *, ...), this patch doesn't
consider bf16 for now.
4. For unsigned FPToI: since both the default hardware behavior and
libgcc ignore the "returns 0 for negative input" spec, this pass follows
the old way and ignores it for unsigned FPToI. See this example:
https://gcc.godbolt.org/z/bnv3jqW1M
The end-to-end tests are uploaded at https://reviews.llvm.org/D138261
Reviewed By: LuoYuanke, mgehre-amd
Differential Revision: https://reviews.llvm.org/D137241
Currently per-function metadata consists of:
(start-pc, size, features)
This adds a new UAR feature and, if it's set, an additional element:
(start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
Originally, `OptLevel` wasn't passed into the `MachineFunctionPass`.
This let the default parameter of `SelectionDAGISel`, which is
`CodeGenOpt::Default`, be passed in. OptLevelChanger captures the
optimization level from the parameter rather than the value within
`TargetMachine`, so the optimization level could be unintentionally
overwritten if a value other than `CodeGenOpt::Default` was passed.
This patch fixes this by passing the optimization level rather
than using the default value.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D126641
When optimizing for size, this pass searches for instructions that are
prevented from being compressed by one of the following:
1. The use of a single uncompressed register.
2. A base register + offset where the offset is too large to be
compressed and the base register may or may not already be compressed.
In the first case, if there is a compressed register available, then the
uncompressed register is copied to the compressed register and its uses
replaced. This is only done if there are enough uses that code size
would be improved.
In the second case, if a compressed register is available, then the
original base register is copied and adjusted such that:
new_base_register = base_register + adjustment
base_register + large_offset = new_base_register + small_offset
and the uses of the base register are replaced with the new base
register. Again this is only done if there are enough uses for code size
to be improved.
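A sketch of the second case (registers, offsets, and the adjustment are
illustrative):

    # Before: the offsets are too large for the compressed encodings, so
    # none of these loads compress.
        lw   a0, 400(sp)
        lw   a1, 404(sp)
        lw   a2, 408(sp)
    # After: copy and adjust the base so the remaining offsets fit
    # (new_base_register = sp + 384).
        addi a5, sp, 384
        c.lw a0, 16(a5)
        c.lw a1, 20(a5)
        c.lw a2, 24(a5)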
This pass was authored by Lewis Revill, with large offset optimization
added by Craig Blackmore.
Differential Revision: https://reviews.llvm.org/D92105