llvm-project

Author	SHA1	Message	Date
Jim Lin	f46ff4c204	[NFC][regalloc] Fix typo in llvm/lib/CodeGen/AllocationOrder.h.	2025-05-02 10:11:34 +08:00
Rahul Joshi	f24606376d	[NFC][LLVM][CodeGen] Refactor MIR Printer (#137361 ) - Move `MIPrinter` class to anonymous namespace, and remove it as a friend of `MachineBasicBlock`. - Move `canPredictBranchProbabilities` to `MachineBasicBlock` and change it to use the new `BranchProbability::normalizeProbabilities` function that accepts a range, and also to use `llvm::equal()` to check equality of the two vectors. - Use `ListSeparator` to print comma-separated lists instead of manual code to do that.	2025-05-01 10:00:54 -07:00
Philip Reames	2bb2f8ab49	[CodeGen] Remove experimental deferred spilling from GreedyRegAlloc (#137850 ) This experimental option was introduced in 2015 via commit 1192294, and the target hook was added in 2020 via commit 99e865b6. There does not appear to have ever been a use of this target hook in tree. This code is complicating one of the most complicated and hard to understand parts of our code base, and was an experiment introduced nearly 10 years ago. Let's get rid of it. Note that the idea described in the original patch is not necessarily a bad one, and we might return to it someday.	2025-05-01 08:11:51 -07:00
Rahul Joshi	64f552cefa	[NFC][LLVM][CodeGen] Refactor MachineInstr operand accessors (#137261 ) - Change MachineInstr operand accessors to use `ArrayRef` internally to slice the operand array into sub-arrays. - Minor: remove unnecessary {} on `MachineInstrBuilder::add`.	2025-05-01 07:45:22 -07:00
Nicholas Guy	b6f65f07bc	[SelectionDAG] Improve type legalisation for PARTIAL_REDUCE_MLA (#130935 ) Implement proper splitting functions for PARTIAL_REDUCE_MLA ISD nodes. This makes the udot_8to64 and sdot_8to64 tests generate dot product instructions for when the new ISD nodes are used. --------- Co-authored-by: James Chesterman <james.chesterman@arm.com>	2025-05-01 15:08:46 +01:00
David Green	9b1051281e	[DAG] Use SDValue for PatFrag checks (#137519 ) If the SDNode is used it can pick up the wrong results number, for example looking at the known bits of the first result where it should be looking at the second. The SDValue is already present as the SelectCodeCommon checks move from parent to child, pass the SDValue through to CheckNodePredicate as Op so that it can use it if necessary. SDNode *N is still generated, keeping most PatFrags the same. Fixes #137274	2025-05-01 08:58:59 +01:00
Jonathan Thackray	6e49f73825	Reland [llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#137701 ) This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.` and `llvm.minimum.` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details. Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.	2025-04-30 22:06:37 +01:00
mssefat	7495f92f08	[AMDGPU] Fix undefined scc register in successor block of SI_KILL terminators (#134718 ) Fix issue 131298 where an undefined $scc register causes verifier errors when using SI_KILL_F32_COND_IMM_TERMINATOR instructions. The problem occurs because the $scc register defined in a comparison before the kill terminator is used in successor blocks, but was not properly marked as live-in. This patch: - Adds code to check if SCC is used in the successor block - Adds SCC as a live-in to successor blocks - Handles both explicit and implicit uses of SCC With this patch the machine verifier no longer reports undefined $scc errors in blocks following a kill terminator instruction. Fixes #131298 --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-04-30 09:02:45 -05:00
Jie Fu	6e43cdbc25	[CodeGen] Remove unused variable 'ID' (NFC) /llvm-project/llvm/lib/CodeGen/VirtRegMap.cpp:225:15: error: unused variable 'ID' [-Werror,-Wunused-variable] static char ID; ^ 1 error generated.	2025-04-30 19:15:27 +08:00
Stephen Tozer	92195f6fc8	Reapply "[DLCov] Implement DebugLoc coverage tracking (#107279 )" Reapplied after fixing the config issue that was causing issues following the previous merge. This reverts commit fdbf073a86573c9ac4d595fac8e06d252ce1469f.	2025-04-30 11:39:29 +01:00
Akshat Oke	e91cbd4f29	[CodeGen][NPM] Port VirtRegRewriter to NPM (#130564 )	2025-04-30 14:10:46 +05:30
YunQiang Su	db859db74d	Revert "CodeGen: Add ISD::AssertNoFPClass (#135946 )" This reverts commit f0c61d2242bbc7576ca5e4137a5ea8f63e4859a9.	2025-04-30 16:16:26 +08:00
Vikram Hegde	53a8b89003	[CodeGen][NewPM] Port "ShrinkWrap" pass to NPM (#129880 )	2025-04-30 13:11:17 +05:30
paperchalice	159628cc22	[CodeGen] Port MachineUniformityAnalysis to new pass manager (#137578 ) - Add new pass manager version of `MachineUniformityAnalysis`. - Query `TargetTransformInfo` in new pass manager version. - Use `printAsOperand` when printing machine function name	2025-04-30 10:44:06 +08:00
Sergei Barannikov	becd418626	[CGP] Despeculate ctlz/cttz with "illegal" integer types (#137197 ) The code below the removed check looks generic enough to support arbitrary integer widths. This change helps 32-bit targets avoid expensive expansion/libcalls in the case of zero input. Pull Request: https://github.com/llvm/llvm-project/pull/137197	2025-04-29 22:33:40 +03:00
Tobias Stadler	0b5daeb2e5	[GlobalISel] Fix miscompile when narrowing vector loads/stores to non-byte-sized types (#136739 ) LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized types, because this would require (un)packing of bits across byte boundaries. Precommit tests: #134904	2025-04-29 12:36:34 +01:00
Vikram Hegde	86d8e8d9a6	[CodeGen][NewPM] Port "PrologEpilogInserter" to NPM (#130550 )	2025-04-29 13:13:45 +05:30
weiguozhi	b25b51eb63	[InlineSpiller] Check rematerialization before folding operand (#134015 ) Current implementation tries to fold the operand before rematerialization because it can reduce one register usage. But if there is a physical register available we can still rematerialize it without causing high register pressure. This patch does this check to find the better choice. Then we can produce `xorps %xmm1, %xmm1; ucomiss %xmm1, %xmm0` instead of `ucomiss LCPI0_1(%rip), %xmm0`	2025-04-28 09:52:03 -07:00
Jonathan Thackray	7ee0097b48	Revert "[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions" (#137657 ) Reverts llvm/llvm-project#136759 due to bad interaction with c792b25e4	2025-04-28 16:53:36 +01:00
Jonathan Thackray	ba420d8122	[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#136759 ) This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.` and `llvm.minimum.` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details. Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.	2025-04-28 15:31:44 +01:00
Paul Walker	be82be281d	[LLVM][GlobalISel] Ensure G_{F}CONSTANT only store references to scalar Constant{Int,FP}. (#137319 )	2025-04-28 11:40:39 +01:00
John Brawn	dd87127f4e	[DAGCombiner] Eliminate fp casts if we have the right fast math flags (#131345 ) When floating-point operations are legalized to operations of a higher precision (e.g. f16 fadd being legalized to f32 fadd) then we get narrowing then widening operations between each operation. With the appropriate fast math flags (nnan ninf contract) we can eliminate these casts.	2025-04-28 11:21:51 +01:00
Craig Topper	e17f07c4de	[SelectionDAG] Reduce code duplication between getStore, getTruncStore, and getIndexedStore. (#137435 ) Create an extra overload of getStore that can handle all 3 types of stores. This is similar to how getLoad/getExtLoad/getIndexedLoad are structured.	2025-04-27 22:32:53 -07:00
Owen Rodley	d3d856ad84	Clean up external users of GlobalValue::getGUID(StringRef) (#129644 ) See https://discourse.llvm.org/t/rfc-keep-globalvalue-guids-stable/84801 for context. This is a non-functional change which just changes the interface of GlobalValue, in preparation for future functional changes. This part touches a fair few users, so is split out for ease of review. Future changes to the GlobalValue implementation can then be focused purely on that class. This does the following: * Rename GlobalValue::getGUID(StringRef) to getGUIDAssumingExternalLinkage. This is simply making explicit at the callsite what is currently implicit. * Where possible, migrate users to directly calling getGUID on a GlobalValue instance. * Otherwise, where possible, have them call the newly renamed getGUIDAssumingExternalLinkage, to make the assumption explicit. There are a few cases where neither of the above are possible, as the caller saves and reconstructs the necessary information to compute the GUID themselves. We want to migrate these callers eventually, but for this first step we leave them be.	2025-04-28 11:09:43 +10:00
Kazu Hirata	5cfd81b0cc	[llvm] Use range constructors of *Set (NFC) (#137552 )	2025-04-27 15:59:57 -07:00
Kazu Hirata	8210cdd764	[llvm] Use llvm::replace (NFC) (#137481 )	2025-04-26 18:18:09 -07:00
Kazu Hirata	8ba3a232d1	[llvm] Use llvm::copy (NFC) (#137470 )	2025-04-26 15:50:38 -07:00
Sergei Barannikov	bb1765179e	[TTI] Simplify implementation (NFCI) (#136674 ) Replace "concept based polymorphism" with simpler PImpl idiom. This pursues two goals: * Enforce static type checking. Previously, target implementations hid base class methods and type checking was impossible. Now that they override the methods, the compiler will complain on mismatched signatures. * Make the code easier to navigate. Previously, if you asked your favorite LSP server to show a method (e.g. `getInstructionCost()`), it would show you methods from `TTI`, `TTI::Concept`, `TTI::Model`, `TTIImplBase`, and target overrides. Now it is two less :) There are three commits to hopefully simplify the review. The first commit removes `TTI::Model`. This is done by deriving `TargetTransformInfoImplBase` from `TTI::Concept`. This is possible because they implement the same set of interfaces with identical signatures. The first commit makes `TargetTransformImplBase` polymorphic, which means all derived classes should `override` its methods. This is done in second commit to make the first one smaller. It appeared infeasible to extract this into a separate PR because the first commit landed separately would result in tons of `-Woverloaded-virtual` warnings (and break `-Werror` builds). The third commit eliminates `TTI::Concept` by merging it with the only derived class `TargetTransformImplBase`. This commit could be extracted into a separate PR, but it touches the same lines in `TargetTransformInfoImpl.h` (removes `override` added by the second commit and adds `virtual`), so I thought it may make sense to land these two commits together. Pull Request: https://github.com/llvm/llvm-project/pull/136674	2025-04-26 15:25:40 +03:00
David Green	b9e32749d2	[GlobalISel] Clear nsw flags when converting sub to add. (#137288 ) As shown in https://alive2.llvm.org/ce/z/PVwcTL we need to clear the nsw flags too when converting a sub to an add if the constant is INT_MIN. Fixes #137254	2025-04-26 11:00:53 +01:00
Sergei Barannikov	2ae9a74bf1	[CodeGen] Use `TRI::regunits()` (NFC) (#137356 )	2025-04-26 08:49:17 +03:00
Craig Topper	c27018b35a	[SelectionDAG] Use getExtLoadVP in PromoteIntRes_VP_LOAD. NFC	2025-04-25 22:31:10 -07:00
Ulrich Weigand	be7ef6c52b	[MachineLICM] Recognize registers clobbered at EH landing pad entry (#122446 ) EH landing pad entry implicitly clobbers target-specific exception pointer and exception selector registers. The post-RA MachineLICM pass needs to take these into account when deciding whether to hoist an instruction out of the loop that initializes one of these registers. Fixes: https://github.com/llvm/llvm-project/issues/122315	2025-04-25 22:27:27 +02:00
Diana Picus	5bad5d84a1	Reland [AMDGPU] Support block load/store for CSR #130013 (#137169 ) Add support for using the existing SCRATCH_STORE_BLOCK and SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It does not include WWM registers - those will be saved and restored individually, just like before. This patch does not change the ABI. Use of this feature may lead to slightly increased stack usage, because the memory is not compacted if certain registers don't have to be transferred (this will happen in practice for calling conventions where the callee and caller saved registers are interleaved in groups of 8). However, if the registers at the end of the block of 32 don't have to be transferred, we don't need to use a whole 128-byte stack slot - we can trim some space off the end of the range. In order to implement this feature, we need to rely less on the target-independent code in the PrologEpilogInserter, so we override several new methods in SIFrameLowering. We also add new pseudos, SI_BLOCK_SPILL_V1024_SAVE/RESTORE. One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the SCRATCH_LOAD_BLOCK instructions will have all the registers that are not transferred added as implicit uses. This is done in order to inform LiveRegUnits that those registers are not available before the restore (since we're not really restoring them - so we can't afford to scavenge them). Unfortunately, this trick doesn't work with the save, so before the save all the registers in the block will be unavailable (see the unit test). This was reverted due to failures in the builds with expensive checks on, now fixed by always updating LiveIntervals and SlotIndexes in SILowerSGPRSpills.	2025-04-25 11:29:27 +02:00
Jie Fu	46f91173c5	[CodeGen] Fix -Wunused-variable in SelectionDAG.cpp (NFC) /llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7502:17: error: unused variable 'NoFPClass' [-Werror,-Wunused-variable] FPClassTest NoFPClass = static_cast<FPClassTest>(N2->getAsZExtVal()); ^ 1 error generated.	2025-04-25 14:03:09 +08:00
YunQiang Su	f0c61d2242	CodeGen: Add ISD::AssertNoFPClass (#135946 ) It is used to mark a value that we are sure is not of some FP class. Examples include: * An argument of a function that is marked with nofpclass * The output value of an intrinsic that is known not to be of some FP class This lets the following operations make some assumptions. --------- Co-authored-by: Your Name <you@example.com>	2025-04-25 09:12:41 +08:00
Stephen Tozer	fdbf073a86	Revert "[DLCov] Implement DebugLoc coverage tracking (#107279 )" This reverts commit a9d93ecf1f8d2cfe3f77851e0df179b386cff353. Reverted due to the commit including a config in LLVM headers that is not available outside of the llvm source tree.	2025-04-25 00:36:28 +01:00
Vladislav Dzhidzhoev	bea3b9214e	Revert "Revert "[DebugInfo][DWARF] Emit DW_AT_abstract_origin for concrete/inlined DW_TAG_lexical_blocks"" (#137243 ) Reverts llvm/llvm-project#137237, as the problem was fixed with 92dc18b6df043d788d77b4a98e5afa3954a44cb0.	2025-04-24 21:49:55 +02:00
Peter Collingbourne	4ed8bfd0c3	LiveRangeShrink: Early exit when encountering a code motion barrier. Without this, we end up with quadratic behavior affecting functions with large numbers of code motion barriers, such as CFI jump tables. As a drive-by cleanup, remove a redundant store to SawStore in this pass as it is also done by isSafeToMove. Reviewers: arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/136806	2025-04-24 12:44:51 -07:00
David Blaikie	dd9f92c886	Revert "[DebugInfo][DWARF] Emit DW_AT_abstract_origin for concrete/inlined DW_TAG_lexical_blocks" (#137237 ) Reverts llvm/llvm-project#136205 Breaks buildbots, probably something about needing to restrict the test to running on a specific target or the like - I haven't looked closely. Co-authored-by: Vladislav Dzhidzhoev <dzhidzhoev@gmail.com>	2025-04-24 12:14:51 -07:00
Craig Topper	f261f1406d	[SelectionDAG][RISCV] Teach computeKnownBits to use range metadata for atomic_load. (#137119 ) And teach SelectionDAGBuilder to get the range metadata in visitAtomicLoad. This allows us to recognize that sign extending a byte load of a boolean value from memory will produce zeros for the extended bits. This allows us to remove an AND on RISC-V. Tests copied from #136502 with range metadata added to i1 cases. Some of the test effects overlap with #136502, but that patch can't handle the acquire or seq_cst cases with the Zalasr extension. We only have sign extending versions of those loads.	2025-04-24 12:14:05 -07:00
Stephen Tozer	a9d93ecf1f	[DLCov] Implement DebugLoc coverage tracking (#107279 ) This is part of a series of patches that tries to improve DILocation bug detection in Debugify; see the review for more details. This is the patch that adds the main feature, adding a set of `DebugLoc::get<Kind>` functions that can be used for instructions with intentionally empty DebugLocs to prevent Debugify from treating them as bugs, removing the currently-pervasive false positives and allowing us to use Debugify (in its original DI preservation mode) to reliably detect existing bugs and regressions. This patch does not add uses of these functions, except for once in Clang before optimizations, and in `Instruction::dropLocation()`, since that is an obvious case that immediately removes a set of false positives.	2025-04-24 19:41:25 +01:00
Simon Pilgrim	10f6c3e270	[DAG] visitCONCAT_VECTORS - relax legality checks (#137210 ) We can fold combineConcatVectorOfConcatVectors/combineConcatVectorOfExtracts until after vector legalization	2025-04-24 19:08:06 +01:00
Vladislav Dzhidzhoev	1143a04f34	[DebugInfo][DWARF] Emit DW_AT_abstract_origin for concrete/inlined DW_TAG_lexical_blocks (#136205 ) During the discussion under https://github.com/llvm/llvm-project/pull/119001, it was noticed that concrete DW_TAG_lexical_blocks should refer to corresponding abstract DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid ambiguity. This behavior is implemented in GCC (https://godbolt.org/z/Khrzdq1Wx), but not in LLVM. Fixes https://github.com/llvm/llvm-project/issues/49297.	2025-04-24 19:44:18 +02:00
Paul Walker	d7f3c31293	Reapply "[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. (#135817 )" This reverts commit 427b6448a3af009e57c0142d6d8af83318b45093. Original patch has been updated to include a fix to ensure AArch64InstructionSelector::emitConstantVector supports all the cases where isBuildVectorAllOnes returns true.	2025-04-24 12:44:41 +00:00
Luke Lau	f218cd28d4	[IA] Remove unused argument. NFC	2025-04-24 19:08:07 +08:00
Paul Walker	427b6448a3	Revert "[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. (#135817 )" This reverts commit 15d8b3cae9debc2bd7d27ca92ff599ba9fb30da5.	2025-04-24 09:48:54 +00:00
Craig Topper	dbb0605f87	[SelectionDAG] Add NewSDValueDbgMsg to getAtomic.	2025-04-23 22:56:52 -07:00
Peter Collingbourne	dbb8434ff7	SelectionDAG: Add missing AddNodeIDCustom case for MDNodeSDNode. Without this we ended up never deduplicating MDNodeSDNodes. Reviewers: arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/136805	2025-04-23 11:00:48 -07:00
Simon Pilgrim	79151244d6	[DAG] narrowExtractedVectorLoad - reuse existing SDLoc. NFC (#136870 )	2025-04-23 16:50:06 +01:00
Nicholas Guy	a1f369e630	[AArch64][SVE] Add dot product lowering for PARTIAL_REDUCE_MLA node (#130933 ) Add lowering in tablegen for PARTIAL_REDUCE_U/SMLA ISD nodes. Only happens when the combine has been performed on the ISD node. Also adds in check to only do the DAG combine when the node can then eventually be lowered, so changes neon tests too. --------- Co-authored-by: James Chesterman <james.chesterman@arm.com>	2025-04-23 13:19:41 +01:00

37646 Commits