llvm-project

Author	SHA1	Message	Date
Fangrui Song	78bac7f0a6	[MC] Remove unneeded getMemtagAttr()	2025-02-23 10:56:59 -08:00
Fangrui Song	34387fc63b	[AsmPrinter] Simplify $local after D131429. NFC setType is unneeded (and AsmPrinter tries not to modify symbols). AsmPrinter. MCSA_ELF_TypeFunction is available on all targets using getSymbolPreferLocal. Pull Request: https://github.com/llvm/llvm-project/pull/128138	2025-02-23 10:19:27 -08:00
Craig Topper	228dbd254a	[RegAllocGreedy] Use MCRegister instead of Register for functions that return a physical register. The callers of these functions return the value as an MCRegister so this removes some casts from Register to MCRegister.	2025-02-22 21:39:25 -08:00
Craig Topper	0bd66c4194	[RegAllocGreedy] Remove unnecessary conversion from MCRegister to Register. NFC	2025-02-22 16:20:19 -08:00
Craig Topper	6fe780ce63	[RegAllocGreedy] Use Register() instead of 0 for invalid Register. NFC	2025-02-22 16:20:19 -08:00
Kazu Hirata	34172bba11	[CodeGen] Avoid repeated hash lookups (NFC) (#128300 )	2025-02-22 02:24:35 -08:00
Craig Topper	444859634f	[MachineVerifier] Use Register instead of unsigned for DenseSet key. NFC (#128234 )	2025-02-21 21:25:21 -08:00
Yingwei Zheng	646e4f2eed	[DAGCombiner] visitFREEZE: Early exit when N is deleted (#128161 ) `N` may get merged with existing nodes inside the loop. Early exit when it is deleted to avoid the crash. Alternative solution: use `DAGNodeDeletedListener` to refresh the value of N. Closes https://github.com/llvm/llvm-project/issues/128143.	2025-02-22 12:06:34 +08:00
Matt Arsenault	1bb43068f1	PeepholeOpt: Allow introducing subregister uses on reg_sequence (#127052 ) This reverts d246cc618adc52fdbd69d44a2a375c8af97b6106. We now handle composing subregister extracts through reg_sequence.	2025-02-22 09:16:14 +07:00
Kazu Hirata	b044350701	[CodeGen] Avoid repeated hash lookups (NFC) (#128126 ) This patch eliminates repeated hash lookups at three levels: - RegToSlotIdx of DenseMap - Reloads of DenseMap - Reloads[MBB] of SmallSet	2025-02-21 11:07:07 -08:00
Matt Arsenault	0c50054820	Revert "RegAlloc: Fix verifier error after failed allocation (#119690 )" This reverts commit 34167f99668ce4d4d6a1fb88453a8d5b56d16ed5. Different set of verifier errors appears after other regalloc failure tests with EXPENSIVE_CHECKS.	2025-02-22 00:23:21 +07:00
Matt Arsenault	34167f9966	RegAlloc: Fix verifier error after failed allocation (#119690 ) In some cases after reporting an allocation failure, this would fail the verifier. It picks the first allocatable register and assigns it, but didn't update the liveness appropriately. When VirtRegRewriter relied on the liveness to set kill flags, it would incorrectly add kill flags if there was another overlapping kill of the virtual register. We can't properly assign the register to an overlapping range, so break the liveness of the failing register (and any other interfering registers) instead. Give the virtual register dummy liveness by effectively deleting all the uses by setting them to undef. The edge case not tested here which I'm worried about is if the read of the register is a def of a subregister. I've been unable to come up with a test where this occurs. https://reviews.llvm.org/D122616	2025-02-21 22:11:51 +07:00
Craig Topper	af64f0a6c2	[FrameLowering] Use MCRegister instead of Register in CalleeSavedInfo. NFC (#128095 ) Callee saved registers should always be physical registers. They are often passed directly to other functions that take MCRegister like getMinimalPhysRegClass or TargetRegisterClass::contains. Unfortunately, sometimes the MCRegister is compared to a Register which gave an ambiguous comparison error when the MCRegister is on the LHS. Adding a MCRegister==Register comparison operator created more ambiguous comparison errors elsewhere. These cases were usually comparing against a base or frame pointer register that is a physical register in a Register. For those I added an explicit conversion of Register to MCRegister to fix the error.	2025-02-20 23:44:05 -08:00
Christopher Di Bella	08c69b2ef6	Revert "[CodeGen] Remove static member function Register::isVirtualRegister. NFC (#127968 )" This reverts commit ff99af7ea03b3be46bec7203bd2b74048d29a52a.	2025-02-20 22:06:21 +00:00
Christopher Di Bella	309e3ca081	Revert "[CodeGen] Remove static member function Register::isPhysicalRegister. NFC" This reverts commit 5fadb3d680909ab30b37eb559f80046b5a17045e.	2025-02-20 22:06:21 +00:00
Craig Topper	5fadb3d680	[CodeGen] Remove static member function Register::isPhysicalRegister. NFC Prefer the nonstatic member by converting unsigned to Register instead.	2025-02-20 10:49:53 -08:00
Craig Topper	ff99af7ea0	[CodeGen] Remove static member function Register::isVirtualRegister. NFC (#127968 ) Use nonstatic member instead. This requires explicit conversions, but many will go away as we continue converting unsigned to Register. In a few places where it was simple, I changed unsigned to Register.	2025-02-20 08:35:50 -08:00
David Green	70ed381b16	[GlobalISel][AArch64] Fix fptoi.sat lowering. (#127901 ) The SDAG version uses fminnum/fmaxnum, in converting it to fcmp+select it appears the order of the operands was chosen badly. This switches the conditions used to keep the constant on the RHS.	2025-02-20 12:22:11 +00:00
Craig Topper	77183a46a5	[CodeGen] Remove static member function Register::virtReg2Index. NFC (#127962 ) Use the nonstatic member instead. I'm pretty sure the code in SPIRV is a layering violation. MC layer files are using a CodeGen header.	2025-02-19 23:34:55 -08:00
Piotr Fusik	8b58cb853a	[SelectionDAG][NFC] Refactor duplicate code into SDNode::bitcastToAPInt() (#127503 )	2025-02-20 13:23:00 +07:00
Craig Topper	92ddbbd89f	[CodeGen] Remove static member functions Register::stackSlot2Index/isStackSlot. NFC Migrate the few users to the nonstatic member functions.	2025-02-19 21:54:43 -08:00
Akshat Oke	557628dbe6	[CodeGen][NewPM] Port RegAllocPriorityAdvisor analysis to NPM (#118462 ) Similar to #117309. The advisor and logger are accessed through the provider, which is served by the new PM. Legacy PM forwards calls to the provider. New PM is a machine function analysis that lazily initializes the provider.	2025-02-20 09:35:49 +05:30
Matt Arsenault	37c341df28	Revert "AMDGPU: Don't canonicalize fminnum/fmaxnum if targets support IEEE fminimum(maximum)_num (#127711 )" This reverts commit 36eaf0daf5d6dd665d7c7a9ec38ea22f27709fed. This is not a sound approach to dealing with this instruction change. The new behavior is a different opcode pair, not a modifier on the existing opcode.	2025-02-20 10:19:14 +07:00
Benjamin Maxwell	f178e51747	[SDAG] Add missing ppc_fp128 ExpandFloatRes legalization for modf (#127895 ) Should fix: https://lab.llvm.org/buildbot/#/builders/72/builds/8380 (`test_modf_ppcf128` is the test case that needed the additional legalization)	2025-02-20 09:50:16 +07:00
Changpeng Fang	36eaf0daf5	AMDGPU: Don't canonicalize fminnum/fmaxnum if targets support IEEE fminimum(maximum)_num (#127711 ) For targets that support IEEE fminimum_num/fmaximum_num, the corresponding _min_num_fXY/_max_num_fXY instructions themselves already did the canonicalization for the inputs. As a result, we do not need to explicitly canonicalize the inputs for fminnum/fmaxnum.	2025-02-19 11:16:43 -08:00
Kazu Hirata	af922cf9f7	[CodeGen] Avoid repeated hash lookups (NFC) (#127745 )	2025-02-19 08:20:46 -08:00
Kazu Hirata	c23256ecbd	[AsmPrinter] Avoid repeated hash lookups (NFC) (#127744 )	2025-02-19 08:20:21 -08:00
zhijian lin	1ac0db44fd	[NFC] using isUndef() instead of getOpcode() == ISD::UNDEF (#127713 )	2025-02-19 08:42:38 -05:00
Kazu Hirata	4405451a22	[AsmPrinter] Avoid repeated map lookups (NFC) (#127576 )	2025-02-18 08:39:14 -08:00
Akshat Oke	519b53e65e	[CodeGen][NewPM] Port RegAllocEvictionAdvisor analysis to NPM (#117309 ) Legacy pass used to provide the advisor, so this extracts that logic into a provider class used by both analysis passes. All three (Default, Release, Development) legacy passes `AdvisorAnalysis` are basically renamed to `AdvisorProvider`, so the actual legacy wrapper passes are `*AdvisorAnalysisLegacy`. There is only one NPM analysis `RegAllocEvictionAnalysis` that switches between the three providers in the `::run` method, to be cached by the NPM. Also adds `RequireAnalysis<RegAllocEvictionAnalysis>` to the optimized target reg alloc codegen builder.	2025-02-18 18:55:06 +07:00
James Chesterman	d4a0848dc6	[SelectionDAG] Add PARTIAL_REDUCE_U/SMLA ISD Nodes (#125207 ) Add signed and unsigned PARTIAL_REDUCE_MLA ISD nodes. Add command line argument (aarch64-enable-partial-reduce-nodes) that indicates whether the intrinsic experimental_vector_partial_reduce_add will be transformed into the new ISD node. Lowering with the new ISD nodes will, for now, always be done as an expand.	2025-02-18 09:08:47 +00:00
Craig Topper	ef9f0b3c41	[DAGCombiner] Don't peek through truncates of shift amounts in takeInexpensiveLog2. (#126957 ) Shift amounts in SelectionDAG don't have to match the result type of the shift. SelectionDAGBuilder will aggressively truncate shift amounts to the target's preferred type. This may result in a zero extend that existed in IR being removed. If we look through a truncate here, we can't guarantee the upper bits of the truncate input are zero. There may have been a zext that was removed. Unfortunately, this regresses tests where no truncate was involved. The only way I can think to fix this is to add an assertzext when SelectionDAGBuilder truncates a shift amount or remove the early truncation of shift amounts from SelectionDAGBuilder all together. Fixes #126889.	2025-02-17 20:26:05 -08:00
Matt Arsenault	ed38d6702f	PeepholeOpt: Handle subregister compose when looking through reg_sequence (#127051 ) Previously this would give up on folding subregister copies through a reg_sequence if the input operand already had a subregister index. d246cc618adc52fdbd69d44a2a375c8af97b6106 stopped introducing these subregister uses, and this is the first step to lifting that restriction. I was expecting to be able to implement this purely with compose / reverse compose, but I wasn't able to make it work, so this relies on testing the lanemasks for whether the copy reads a subset of the input.	2025-02-18 08:07:29 +07:00
Kazu Hirata	153dd19e30	[SelectionDAG] Remove lowerCallToExternalSymbol (#127408 ) The last use was removed in: commit 05e6bb40ebfd285cc87f7ce326b7ba76c3c7f870 Author: Roger Ferrer Ibáñez <rofirrim@gmail.com> Date: Thu May 30 14:55:32 2024 +0200	2025-02-17 00:06:48 -08:00
Kazu Hirata	0323554055	[GlobalISel] Avoid repeated hash lookups (NFC) (#127372 )	2025-02-16 08:15:36 -08:00
Csanád Hajdú	a190f15d2b	[AArch64] Add support for SHF_AARCH64_PURECODE ELF section flag (1/3) (#125687 ) Add support for the new SHF_AARCH64_PURECODE ELF section flag: https://github.com/ARM-software/abi-aa/pull/304 The general implementation follows the existing one for ARM targets. Generating object files with the `SHF_AARCH64_PURECODE` flag set is enabled by the `+execute-only` target feature. Related PRs: * Clang: https://github.com/llvm/llvm-project/pull/125688 * LLD: https://github.com/llvm/llvm-project/pull/125689	2025-02-14 08:56:07 +00:00
Michael Buch	41f96f91cd	[llvm][DebugInfo] Emit DW_AT_const_value for float non-type template parameters (#127045 ) In C++20, non-type template parameters can be float/double. Clang didn't emit those constants in DWARF. This patch emits floating point constants the same way we do other integral template value parameters.	2025-02-13 23:08:44 +00:00
Kazu Hirata	e7bf6a4e04	[CodeGen] Avoid repeated map lookups (NFC) (#127025 )	2025-02-13 09:11:17 -08:00
Matt Arsenault	43780f4f92	RegAllocGreedy: Use Register type	2025-02-13 20:49:27 +07:00
Cullen Rhodes	9b2fc66830	[SDAG] Harden assumption in getMemsetStringVal (#126207 ) In 5235973ee03aca4148ecabe5eff64da2af1e034e, an ICE was fixed in getMemsetStringVal where f128 wasn't handled. It was noted at the time [1] that the code below this also looks suspect, since it assumes the element type of VT is either an f32 or f64. This part of getMemsetStringVal relates to memcpy operations where the source is a copy from a zero constant. The VT in question is determined by TargetLowering::findOptimalMemOpLowering, which in turn calls a further TLI hook getOptimalMemOpType. For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32 or Other. For Other, TargetLowering::findOptimalMemOpLowering will then pick an integer VT. So on AArch64 at least, I don't believe the suspect code can be reached. For other targets, ARM and x86 are the only ones that return a FP vector type from getOptimalMemOpType. For both targets, the only such type is v2f64, but given f64 is already handled it should also be fine. To defend this, I considered adding an assert as mentioned in [1], but given getConstantFP handles vector types, I figured using this to fully handle the FP types makes the code simpler and more robust. For test coverage I added unreachables to both of the branches handling FP types in this code, but found neither fired with check-llvm across all targets. Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in 5235973ee03aca4148ecabe5eff64da2af1e034e to defend ICE on f128, but at some point it stopped hitting this code. AArch64TargetLowering::getOptimalMemOpType was updated in 29200611055f49a0d37243caa5f8bba1df9d57a6, so I suspect this is when it happened, although I haven't verified this. Although I did find by updating the test to disable NEON, getOptimalMemOpType returns an f128 and the branch is once again hit. For the final branch noted as suspect in [1], as far as I can tell this has never had any test coverage, so I've added a test to the ARM backend for this. Fixes: https://github.com/llvm/llvm-project/issues/20521 [1]	2025-02-13 08:48:06 +00:00
Cullen Rhodes	df62441336	[MISched][NFC] Remove unused heuristic NextDefUse from enum (#125879 ) Heuristic was removed in 46533e614b78 due to being ineffective.	2025-02-13 08:46:51 +00:00
Shubham Sandeep Rastogi	92f916faba	Add a pass to collect dropped var statistics for MIR (#126686 ) This patch attempts to reland https://github.com/llvm/llvm-project/pull/120780 while addressing the issues that caused the patch to be reverted. Namely: 1. The patch had included code from the llvm/Passes directory in the llvm/CodeGen directory. 2. The patch increased the backend compile time by 2% due to adding a very expensive include in MachineFunctionPass.h The patch has been re-structured so that there is no dependency between the llvm/Passes and llvm/CodeGen directory, by moving the base class, `class DroppedVariableStats` to the llvm/IR directory. The expensive include in MachineFunctionPass.h has been changed to contain forward declarations instead of other header includes which was pulling a ton of code into MachineFunctionPass.h and should resolve any issues when it comes to compile time increase.	2025-02-12 14:08:18 -08:00
Akshat Oke	7b60e03d73	Reland "[CodeGen][NewPM] Port MachineScheduler to NPM. (#125703 )" (#126684 ) `RegisterClassInfo` was supposed to be kept alive between pass runs, which wasn't being done, leading to recomputations increasing the compile time. Now the Impl class is a member of the legacy and new passes so that it is not reconstructed on every pass run. Co-authored-by: Christudasan Devadasan <christudasan.devadasan@amd.com>	2025-02-12 18:54:39 +05:30
David Green	bf7af2d12e	[AArch64][DAG] Allow fptos/ui.sat to be scalarized. (#126799 ) We were previously running into problems with fp128 types and certain integer sizes. Fixes an issue reported on #124984	2025-02-12 11:04:08 +00:00
Craig Topper	7dd82805d5	[SelectionDAGBuilder] Remove NodeMap updates from getValueImpl. NFC (#126849 ) Both callers already put the result in NodeMap immediately after the call.	2025-02-12 00:07:07 -08:00
Haohai Wen	ec28e9b757	[MC] Replace MCContext::GenericSectionID with MCSection::NonUniqueID (#126202 ) They have same semantics. NonUniqueID is more friendly for isUnique implementation in MCSectionELF. History: 97837b7 added support for unique IDs in sections and added GenericSectionID. Later, 1dc16c7 added NonUniqueID.	2025-02-12 14:28:37 +08:00
Jim Lin	31bfae35d2	[DAGCombiner] Add hasOneUse checks for folding (not (add X, -1)) to (neg X) (#126667 ) To get better codegen for AArch64 with bic, x86 with andn and riscv with andn.	2025-02-12 12:24:29 +08:00
Daniel Hoekwater	3a22cf9bd8	[CFIFixup] Fixup CFI for split functions with synchronous uwtables (#125299 ) - Precommit tests for synchronous uwtable CFI fixup - [CFIFixup] Fixup CFI for split functions with synchronous uwtables Commit `6e54fccede` disables CFI fixup for functions with synchronous tables, breaking CFI for split functions. Instead, we can disable block-level CFI fixup for functions with synchronous tables. Unwind tables can be: - N/A (not present) - Asynchronous - Synchronous Functions without unwind tables don't need CFI fixup (since they don't care about CFI). Functions with asynchronous unwind tables must be accurate for each basic block, so full CFI fixup is necessary. Functions with synchronous unwind tables only need to be accurate for each function (specifically, the portion of a function in a given section). Disabling CFI fixup entirely for functions with synchronous uwtables may break CFI for a function split between two sections. The portion in the first section may have valid CFI, while the portion in the second section is missing a call frame. Ex: ``` (.text.hot) Foo (BB1): <Call frame information> ... BB2: ... (.text.split) BB3: ... BB4: <epilogue> ``` Even if `Foo` has a synchronous unwind table, we still need to insert call frame information into `BB3` so that unwinding the call stack from `BB3` or `BB4` works properly.	2025-02-11 18:25:08 -05:00
Philip Reames	e4016bf5c3	[DAG] Use ArrayRef to simplify ShuffleVectorSDNode::isSplatMask	2025-02-11 12:47:10 -08:00
Benjamin Maxwell	19556eccf6	[RTLIB] Rename getFSINCOS() to getSINCOS (NFC) (#126705 ) This makes the name more consistent with the other helpers.	2025-02-11 11:51:35 +00:00

37268 Commits