llvm-project

Author	SHA1	Message	Date
Craig Topper	5cf0fb4317	[StackSlotColoring] Ignore non-spill objects in RemoveDeadStores. (#80242 ) The stack slot coloring pass is concerned with optimizing spill slots. If any change is made, a pass is run over the function to remove stack stores that use the same register and stack slot as an immediately preceding load. The register check is too simple for constant registers like AArch64 and RISC-V's zero register. This register can be used as the result of a load if we want to discard the result but still have the memory access performed, as for a volatile or atomic load. If the code sees a load into the zero register followed by a store of the zero register to the same stack slot, the pass mistakenly believes the store isn't needed. Since the main stack coloring optimization is only concerned with spill slots, it seems reasonable that RemoveDeadStores should only be concerned with spills. Since we never generate a reload of x0, this avoids the issue seen by RISC-V. Test case concept is adapted from pr30821.mir from X86. That test had to be updated to mark the stack slot as a spill slot. Fixes #80052.	2024-02-01 13:25:15 -08:00
Jiahan Xie	10c2d5ff7c	[RISCV][GISel] RegBank select and instruction select for vector G_ADD, G_SUB (#74114 ) RegisterBank Selection for scalable vector G_ADD and G_SUB by creating new mappings for different types of vector register banks. Then implement Instruction Selection for the same operations by choosing the correct RISC-V vector register class.	2024-02-01 15:06:43 -05:00
Quentin Dian	112fba974c	[MIRPrinter] Don't print line break when there is no instructions (NFC) (#80147 ) Per #80143, we can remove the extra line break when there is no instruction.	2024-02-01 22:10:52 +08:00
Kazu Hirata	39fa304866	[llvm] Use StringRef::starts_with (NFC)	2024-01-31 23:54:07 -08:00
wangpc	995d21bc6f	[SelectOpt] Print instruction instead of pointer Pull Request: https://github.com/llvm/llvm-project/pull/80125	2024-02-01 13:10:52 +08:00
Zaara Syeda	a03a6e9964	[AIX] [XCOFF] Add support for common and local common symbols in the TOC (#79530 ) This patch adds support for common and local common symbols in the TOC for AIX. Note that we need to update isVirtualSection: a common symbol in the TOC will have the symbol type XTY_CM and will be initialized when placed in the TOC, so sections with this type are no longer virtual. --------- Co-authored-by: Zaara Syeda <syzaara@ca.ibm.com>	2024-01-31 16:34:21 -05:00
Jay Foad	baf1b19763	[CodeGen] Use regunits instead of MCRegUnitIterator in RegisterClassInfo. NFC.	2024-01-31 16:27:54 +00:00
Jay Foad	e34fd2e193	[CodeGen] Simplify RegisterClassInfo BitVector comparisons. NFC.	2024-01-31 16:25:19 +00:00
Nikita Popov	f2df4bfe54	[AsmParser] Support non-consecutive global value numbers (#80013 ) https://github.com/llvm/llvm-project/pull/78171 added support for non-consecutive local value numbers. This extends the support for global value numbers (for globals and functions). This means that it is now possible to delete an unnamed global definition/declaration without breaking the IR. This is a lot less common than unnamed local values, but it seems like something we should support for consistency. (Unnamed globals are used a lot in Rust though.)	2024-01-31 17:04:30 +01:00
Quentin Dian	b7738e275d	[MIRPrinter] Don't print space when there is no successor (#80143 ) Extra space causes the checks generated by update_mir_test_checks to be unavailable. ``` # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4 # RUN: llc -mtriple=x86_64-- -o - %s -run-pass=none -verify-machineinstrs -simplify-mir \| FileCheck %s --- name: foo body: \| ; CHECK-LABEL: name: foo ; CHECK: bb.0: ; CHECK-NEXT: successors: ; CHECK-NEXT: {{ $}} ; CHECK-NEXT: {{ $}} ; CHECK-NEXT: bb.1: ; CHECK-NEXT: RET 0, $eax bb.0: successors: bb.1: RET 0, $eax ... ``` The failure log is as follows: ``` llvm/test/CodeGen/MIR/X86/unreachable-block-print.mir:9:16: error: CHECK-NEXT: is on the same line as previous match ; CHECK-NEXT: {{ $}} ^ <stdin>:21:13: note: 'next' match was here successors: ^ <stdin>:21:13: note: previous match ended here successors: ```	2024-01-31 22:35:41 +08:00
Simon Pilgrim	912cdd2179	[DAG] AddNodeIDCustom - call ShuffleVectorSDNode::getMask once instead of repeated getMaskElt calls. Use a simpler for-range loop to append all shuffle mask elements	2024-01-31 12:01:01 +00:00
Jay Foad	942cc9a222	Revert "[CodeGen] Don't include aliases in RegisterClassInfo::IgnoreCSRForAllocOrder (#80015 )" This reverts commit f8525030004f907cd108e7c18df255a6d3b23124. It was supposed to speed things up but llvm-compile-time-tracker.com showed a slight slow down.	2024-01-31 10:25:51 +00:00
Jay Foad	f852503000	[CodeGen] Don't include aliases in RegisterClassInfo::IgnoreCSRForAllocOrder (#80015 ) Previously we called ignoreCSRForAllocationOrder on every alias of every CSR which was expensive on targets like AMDGPU which define a very large number of overlapping register tuples. On such targets it is simpler and faster to call ignoreCSRForAllocationOrder once for every physical register. Differential Revision: https://reviews.llvm.org/D146735	2024-01-31 08:16:06 +00:00
Oskar Wirga	ff4636a4ab	Refactor recomputeLiveIns to converge on added MachineBasicBlocks (#79940 ) This is a fix for the regression seen in https://github.com/llvm/llvm-project/pull/79498 > Currently, the way that recomputeLiveIns works is that it will recompute the livein registers for that MachineBasicBlock but it matters what order you call recomputeLiveIn which can result in incorrect register allocations down the line. Now we do not recompute the entire CFG, but we do ensure that the newly added MBBs reach convergence.	2024-01-30 19:33:04 -08:00
PiJoules	a356e6ccad	[SelectionDAG] Expand fixed point multiplication into libcall (#79352 ) 32-bit ARMv6 with thumb doesn't support MULHS/MUL_LOHI as legal/custom nodes during expansion, which causes fixed point multiplication of _Accum types to fail. Prior to this, we just happened to use fixed point multiplication on platforms that happen to support MULHS/MUL_LOHI. This patch attempts to check if the multiplication can be done via libcalls, which are provided by the arm runtime. These libcall attempts are made elsewhere, so this patch refactors that libcall logic into its own functions, and the fixed point expansion calls and reuses that logic.	2024-01-30 13:58:55 -08:00
Jay Foad	77e5136ce4	[CodeGen] Use RegUnits in RegisterClassInfo::getLastCalleeSavedAlias (#79996 ) Change the implementation of getLastCalleeSavedAlias to use RegUnits instead of register aliases. This is much faster on targets like AMDGPU which define a very large number of overlapping register tuples. No functional change intended. If PhysReg overlaps multiple CSRs then getLastCalleeSavedAlias(PhysReg) could conceivably return a different arbitrary one, but currently it is only used for some debug printing anyway. Differential Revision: https://reviews.llvm.org/D146734	2024-01-30 14:06:45 +00:00
Liao Chunyu	45188c64db	[DAGCombiner] Use generalized pattern matcher in foldBoolSelectToLogic (#79101 ) support vp.select TODO: Possibly other functions could be supported, eg: SimplifySelect()	2024-01-30 10:26:51 +08:00
chuongg3	2c552d319a	[AArch64][GlobalISel] Legalize G_ABS for Larger/Smaller Vectors (#79117 ) Legalize G_ABS for larger/smaller width vectors with legal element sizes. Falls back for the smaller width vector tests because it is unable to legalize G_ANYEXT for smaller width vectors.	2024-01-28 20:21:38 +00:00
David Green	f297d0bc6d	[AArch64][GlobalISel] More FCmp legalization. (#78734 ) This fills out the fcmp handling to be more like the other instructions, adding better support for fp16 and some larger vectors. Select of f16 values is still not handled optimally in places as the select is only legal for s32 values, not s16. This would be correct for integer but not necessarily for fp. It is as if we need to do legalization -> regbankselect -> extra legalization -> selection.	2024-01-28 15:42:36 +00:00
Simon Pilgrim	b13d5df84c	[DAG] ComputeKnownBits - use KnownBits::usub_sat instead of a custom variant KnownBits::usub_sat is already exhaustively tested in the unit tests	2024-01-28 13:06:57 +00:00
Kazu Hirata	faf555f93f	Revert "[DAGCombiner] Use SmallDenseMap (NFC) (#79681 )" This reverts commit 863b2c84c0fbcfb02d969fa36af4932d410a827b. A compile-time regression has been reported: https://github.com/llvm/llvm-project/pull/79681#issuecomment-1913325915	2024-01-27 19:29:47 -08:00
Kazu Hirata	863b2c84c0	[DAGCombiner] Use SmallDenseMap (NFC) (#79681 ) The use of SmallDenseMap saves 0.48% of heap allocations during the compilation of a large preprocessed file, namely X86ISelLowering.cpp, for the X86 target. During the experiment, the maximum size of WorklistMap was 24 or less 74% of the time. (Note that DenseMap has the maximum occupancy rate of 3/4.)	2024-01-27 08:46:02 -08:00
Kazu Hirata	f2e69d2e85	[CodeGen] Use a range-based for loop (NFC)	2024-01-26 23:46:27 -08:00
Nikita Popov	07a1925b8b	Revert "Refactor recomputeLiveIns to operate on whole CFG (#79498 )" This reverts commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b. Introduces a major compile-time regression.	2024-01-26 22:33:17 +01:00
Oskar Wirga	59bf60519f	Refactor recomputeLiveIns to operate on whole CFG (#79498 ) Currently, the way that recomputeLiveIns works is that it will recompute the livein registers for that MachineBasicBlock but it matters what order you call recomputeLiveIn which can result in incorrect register allocations down the line. This PR fixes that by simply recomputing the liveins for the entire CFG until convergence is achieved. This makes it harder to introduce subtle bugs which alter liveness.	2024-01-26 11:25:36 -08:00
David Green	7f518ee9ea	[DAG] Add a one-use check to concat -> scalar_to_vector fold. (#79510 ) Without this we can end up with multiple copies from gpr->fpr.	2024-01-26 18:17:17 +00:00
Kai Nacke	f2d0bba874	[GISel] Lower scalar G_SELECT in LegalizerHelper (#79342 ) The LegalizerHelper only has support to lower G_SELECT with vector operands. The approach is the same for scalar arguments, which this PR adds.	2024-01-26 09:11:29 -05:00
Shengchen Kan	550f0eb2ce	[NFC] Rename TargetInstrInfo::FoldImmediate to TargetInstrInfo::foldImmediate and simplify implementation for X86	2024-01-26 20:50:58 +08:00
Nico Weber	184ca39529	[llvm] Move CodeGenTypes library to its own directory (#79444 ) Finally addresses https://reviews.llvm.org/D148769#4311232 :) No behavior change.	2024-01-25 12:01:31 -05:00
paperchalice	12a8bc09ca	[CodeGen] Port FreeMachineFunction to new pass manager (#79421 ) This pass should be the last machine function pass in pipeline, also ignore `PI.runAfterPass(*P, MF, PassPA);` to avoid accessing a dangling reference.	2024-01-25 17:24:05 +08:00
paperchalice	e390c229a4	[Pass] Add hyphen to some pass names (#74287 ) Here is the list of the renamed passes: - `callbrprepare` -> `callbr-prepare` - `dwarfehprepare` -> `dwarf-eh-prepare` - `flattencfg` -> `flatten-cfg` - `loweratomic` -> `lower-atomic` - `lowerinvoke` -> `lower-invoke` - `lowerswitch` -> `lower-switch` - `winehprepare` -> `win-eh-prepare` - `targetir` -> `target-ir` - `targetlibinfo` -> `target-lib-info` Legacy passes are not affected.	2024-01-25 16:05:54 +08:00
Kazu Hirata	a13b7df7f2	[CodeGen] Use llvm::successors (NFC)	2024-01-24 22:11:56 -08:00
Michael Maitland	d2d42dcfde	[CodeGen][MISched] Rename instance of Cycle -> ReleaseAtCycle b1ae461a5358932851de42b66ffde8748da51a83 renamed Cycle -> ReleaseAtCycle. 7e09239e24b339f45f63a670e2e831150826bf70 was committed without rebasing but used the old Cycle syntax. This caused a build failure when 7e09239e24b339f45f63a670e2e831150826bf70 was squash-and-merged. This patch fixes this problem.	2024-01-24 10:54:14 -08:00
Michael Maitland	7e09239e24	[CodeGen][MISched] Handle empty sized resource usage. (#75951 ) TargetSchedule.td explicitly allows the usage of a ProcResource for zero cycles, in order to represent that the ProcResource must be available but is not consumed by the instruction. On the other hand, ResourceSegments explicitly does not allow for a zero sized interval. In order to remedy this, this patch handles the special case of when there is an empty interval usage of a resource by not adding an empty interval. We ran into this issue downstream, but it makes sense to have this upstream since it is explicitly allowed by TargetSchedule.td.	2024-01-24 13:40:23 -05:00
Petar Avramovic	c46109d0d7	Revert "AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis" (#79274 ) Reverts llvm/llvm-project#78482	2024-01-24 12:18:34 +01:00
Petar Avramovic	91ddcba83a	AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis (#78482 ) Implement PhiLoweringHelper for GlobalISel in DivergenceLoweringHelper. Use machine uniformity analysis to find divergent i1 phis and select them as lane mask phis in same way SILowerI1Copies select VReg_1 phis. Note that divergent i1 phis include phis created by LCSSA and all cases of uses outside of cycle are actually covered by "lowering LCSSA phis". GlobalISel lane masks are registers with sgpr register class and S1 LLT. TODO: General goal is that instructions created in this pass are fully instruction-selected so that selection of lane mask phis is not split across multiple passes. patch 3 from: https://github.com/llvm/llvm-project/pull/73337	2024-01-24 11:58:32 +01:00
paperchalice	7251243315	[CodeGen][Passes] Move `CodeGenPassBuilder.h` to Passes (#79242 ) `CodeGenPassBuilder` is not very tightly coupled to CodeGen, it may need to reference some method in pass builder in future, so move `CodeGenPassBuilder.h` to Passes.	2024-01-24 11:29:18 +08:00
paperchalice	7e50f006f7	[NewPM][CodeGen][llc] Add NPM support (#70922 ) Add new pass manager support to `llc`. Users can use `--passes=pass1,pass2...` to run mir passes, and use `--enable-new-pm` to run default codegen pipeline. This patch is taken from [D83612](https://reviews.llvm.org/D83612), the original author is @yuanfang-chen. --------- Co-authored-by: Yuanfang Chen <455423+yuanfang-chen@users.noreply.github.com>	2024-01-24 09:27:25 +08:00
Aiden Grossman	b1778c7d7b	[AsmPrinter] Remove mbb-profile-dump flag (#76595 ) Now that the work embedding PGO information in SHT_LLVM_BB_ADDR_MAP ELF sections has landed, there is no longer a need to keep around the mbb-profile-dump flag.	2024-01-23 16:48:10 -08:00
Paul Kirth	03a61d34eb	[RISCV] Support TLSDESC in the RISC-V backend (#66915 ) This patch adds basic TLSDESC support in the RISC-V backend. Specifically, we add new relocation types for TLSDESC, as prescribed in https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373, and add a new pseudo instruction to simplify code generation. This patch does not try to optimize the local dynamic case, which can be improved in separate patches. Linker side changes will also be handled separately. The current implementation is only enabled when passing the new `-enable-tlsdesc` codegen flag.	2024-01-23 16:16:07 -08:00
Kazu Hirata	8ed1291d96	[MachineCopyPropagation] Make a SmallVector larger (NFC) (#79106 ) This patch makes a SmallVector slightly larger. We encounter quite a few instructions with 3 or 4 defs but very few beyond that on X86. This saves 0.39% of heap allocations during the compilation of a large preprocessed file, namely X86ISelLowering.cpp, for the X86 target.	2024-01-23 09:27:18 -08:00
Simon Pilgrim	e1aa5b1fd1	[DAG] visitSCALAR_TO_VECTOR - don't fold scalar_to_vector(bin(extract(x),extract(y)) -> bin(x,y) if extracts have other uses Fixes #78897 - although the test case still has a number of poor codegen issues (in particular for i686 triples) that will need addressing (combining the nodes in topological order should help).	2024-01-23 16:28:43 +00:00
Jeremy Morse	087172258a	[DebugInfo][RemoveDIs] Handle non-instr debug-info in GlobalISel (#75228 ) The RemoveDIs project is aiming to eliminate debug intrinsics like dbg.value and dbg.declare from LLVM, and replace them with DPValue objects attached to instructions. ISel is one of the "terminals" where that information needs to be converted into MIR format: this patch implements support for that in GlobalISel. We aim for the output of LLVM to be identical with/without RemoveDIs debug-info. This patch should be NFC, as we're handling the same data about variables stored in a different format -- it now appears in a DPValue object rather than as an intrinsic. To that end, I've refactored the handling of dbg.values into a dedicated function, and call it whenever a dbg.value or a DPValue is encountered. dbg.declare is handled in a similar way. Testing: adding the --try-experimental-debuginfo-iterators switch to llc causes it to try and convert to the "new" debug-info format if it's built in (LLVM_EXPERIMENTAL_DEBUGINFO_ITERATORS=On), and it'll be covered by our buildbot. One test has a few extra wildcard-regexes added: this is because there's some extra data printed about attached debug-info, which is safe to ignore.	2024-01-23 15:04:08 +00:00
Stephen Tozer	30845e8ab4	[RemoveDIs][DebugInfo] Handle DPVAssigns in Assignment Tracking excluding lowering (#78982 ) This patch adds support for DPVAssigns across all of AssignmentTrackingAnalysis except for AssignmentTrackingLowering, which is implemented in a separate patch. This patch includes handling DPValues in MemLocFragFill, the removal of redundant DPValues as part of AssignmentTrackingAnalysis (which is different to the version in `BasicBlockUtils.cpp`), and preventing the DPVAssigns from being directly emitted in SelectionDAG (just as we don't emit llvm.dbg.assigns directly, but receive a set of locations from AssignmentTrackingAnalysis' output).	2024-01-23 14:27:01 +00:00
Anatoly Trosinenko	10bd69a4f7	[MachineOutliner] Refactor iterating over Candidate's instructions (#78972 ) Make Candidate's front() and back() functions return references to MachineInstr and introduce begin() and end() returning iterators, the same way it is usually done in other container-like classes. This makes it possible to iterate over the instructions contained in Candidate the same way one can iterate over MachineBasicBlock (note that begin() and end() return bundled iterators, just like MachineBasicBlock does, but no instr_begin() and instr_end() are defined yet).	2024-01-23 17:21:40 +03:00
Stephen Tozer	5266543284	[RemoveDIs][DebugInfo] Handle DPVAssigns in AssignmentTrackingLowering (#78980 ) Following on from the previous patch 6aeb7a7, this patch adds the necessary code to process the DPV equivalents of llvm.dbg.assign intrinsics. Most of the content of this patch is simply duplicating existing functionality, using generic code for simple functions and PointerUnions where storage is required. The most complex changes are in the places that iterate over instructions, as iterating over DPValues between instructions is different to iterating over instructions that may or may not be debug intrinsics; this is most complex in `AssignmentTrackingLowering::process`, where I've added some comments to explain the state of the program at each key point depending on whether we are operating on intrinsics or DPValues.	2024-01-23 12:32:24 +00:00
Yi Kong	3ea92ea2f9	Fix MFS warning format WithColor::warning() does not append new line automatically.	2024-01-23 17:01:23 +09:00
Eli Friedman	a6065f0fa5	Arm64EC entry/exit thunks, consolidated. (#79067 ) This combines the previously posted patches with some additional work I've done to more closely match MSVC output. Most of the important logic here is implemented in AArch64Arm64ECCallLowering. The purpose of the AArch64Arm64ECCallLowering is to take "normal" IR we'd generate for other targets, and generate most of the Arm64EC-specific bits: generating thunks, mangling symbols, generating aliases, and generating the .hybmp$x table. This is all done late for a few reasons: to consolidate the logic as much as possible, and to ensure the IR exposed to optimization passes doesn't contain complex arm64ec-specific constructs. The other changes are supporting changes, to handle the new constructs generated by that pass. There's a global llvm.arm64ec.symbolmap representing the .hybmp$x entries for the thunks. This gets handled directly by the AsmPrinter because it needs symbol indexes that aren't available before that. There are two new calling conventions used to represent calls to and from thunks: ARM64EC_Thunk_X64 and ARM64EC_Thunk_Native. There are a few changes to handle the associated exception-handling info, SEH_SaveAnyRegQP and SEH_SaveAnyRegQPX. I've intentionally left out handling for structs with small non-power-of-two sizes, because that's easily separated out. The rest of my current work is here. I squashed my current patches because they were split in ways that didn't really make sense. Maybe I could split out some bits, but it's hard to meaningfully test most of the parts independently. Thanks to @dpaoliello for extensive testing and suggestions. (Originally posted as https://reviews.llvm.org/D157547 .)	2024-01-22 21:28:07 -08:00
Simeon K	58cfd56356	[VP][RISCV] Introduce llvm.vp.minimum/maximum intrinsics (#74840 ) Although there are predicated versions of minnum/maxnum, the ones for minimum/maximum are currently missing. This patch introduces these intrinsics and implements their lowering to RISC-V.	2024-01-22 16:46:39 -08:00
David Green	a2d68b4bec	[SelectOpt] Add handling for Select-like operations. (#77284 ) Some operations behave like selects. For example `or(zext(c), y)` is the same as `select(c, y\|1, y)` and instcombine can canonicalize the select to the or form. These operations can still be worthwhile converting to a branch as opposed to keeping as a select or an `or` instruction. This patch attempts to add some basic handling for them, creating a SelectLike abstraction in the select optimization pass. The backend can opt into handling `or(zext(c),x)` as a select if it could be profitable, and the select optimization pass attempts to handle them in much the same way as a `select(c, x\|1, x)`. The Or(x, 1) may need to be added as a new instruction, generated as the or is converted to branches. This helps fix a regression from selects being converted to or's recently.	2024-01-22 23:46:58 +00:00
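
The identity cited in the last entry above, `or(zext(c), y)` being the same as `select(c, y|1, y)`, can be checked exhaustively for a small bit width. The following is a minimal Python sketch, not LLVM code: the helper names `zext`, `or_form`, `select_form` and `forms_agree` are hypothetical and exist only for this illustration, and the 8-bit default width is an arbitrary assumption.

```python
def zext(c: bool) -> int:
    """Zero-extend a one-bit condition to an integer (0 or 1)."""
    return 1 if c else 0


def or_form(c: bool, y: int) -> int:
    """The 'or' form: or(zext(c), y)."""
    return zext(c) | y


def select_form(c: bool, y: int) -> int:
    """The 'select' form: select(c, y | 1, y)."""
    return (y | 1) if c else y


def forms_agree(bits: int = 8) -> bool:
    """Compare both forms for every condition and every unsigned value of the given width."""
    return all(
        or_form(c, y) == select_form(c, y)
        for c in (False, True)
        for y in range(1 << bits)
    )


if __name__ == "__main__":
    print(forms_agree())
```

When `c` is false both forms return `y` unchanged; when `c` is true both set the lowest bit of `y`. The two forms are therefore interchangeable, which is why either one may appear in the IR for the same source-level choice.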

35262 Commits