llvm-project

Author	SHA1	Message	Date
Kazu Hirata	9fecb4f907	[CodeGen] Fix a warning This patch fixes: llvm/lib/CodeGen/MachineSink.cpp:1667:22: error: unused variable 'Preheader' [-Werror,-Wunused-variable]	2025-01-23 19:37:28 -08:00
Matt Arsenault	0ef39a882b	MachineCSE: Remove check for subreg on a def operand (#124095) There are no subregister defs in SSA.	2025-01-24 09:35:30 +07:00
Jeffrey Byrnes	acb7859f07	[MachineSink] Extend loop sinking capability (#117247) The current MIR cycle sinking capabilities are rather limited. They only support sinking copies into a single successor block while obeying limits. This opt-in feature adds a more aggressive option that is not limited to the above concerns. The feature will try to "sink" by duplicating any top-level preheader instruction (that we are sure is safe to sink) into any user block, then does some dead code cleanup. In particular, this is useful for high register-pressure (RP) situations when loop bodies have control flow.	2025-01-23 17:08:23 -08:00
Min-Yih Hsu	bc74a1edbe	[IA] Generalize the support for power-of-two (de)interleave intrinsics (#123863) Previously, AArch64 used pattern matching to support llvm.vector.(de)interleave of 2 and 4; RISC-V only supported (de)interleave of 2. This patch consolidates the logic in these two targets by factoring out the common factor calculations into the InterleavedAccess pass.	2025-01-23 15:27:51 -08:00
Jeffrey Byrnes	f2942b9077	[CodeGen] NFC: Move isDead to MachineInstr (#123531) Provide isDead interface for access to ad-hoc isDead queries. LivePhysRegs is optional: if not provided, pessimistically check deadness of a single MI without doing the LivePhysReg walk; if provided, it is assumed to be at the position of MI.	2025-01-23 12:54:29 -08:00
Craig Topper	e30a4fc3e2	[TargetLowering] Improve one signature of forceExpandWideMUL. (#123991) We have two forceExpandWideMUL functions. One takes the low and high half of 2 inputs and calculates the low and high half of their product. This does not calculate the full 2x width product. The other signature takes 2 inputs and calculates the low and high half of their full 2x width product. Previously it did this by sign/zero extending the inputs to create the high bits and then calling the other function. We can instead copy the algorithm from the other function and use the Signed flag to determine whether we should do SRA or SRL. This avoids the need to multiply the high part of the inputs and add them to the high half of the result. This improves the generated code for signed multiplication. This should improve the performance of #123262. I don't know yet how close we will get to gcc.	2025-01-23 12:49:35 -08:00
Florian Hahn	0d0190815d	[TailDup] Allow large number of predecessors/successors without phis. (#116072) This adjusts the threshold logic added in #78582 to only trigger for cases where there are actually phis to duplicate in either TailBB or in one of the successors. In cases where there are no phis, we only have to pay the cost of extra edges, but have no explosion in PHI related instructions. This improves performance of Python on some inputs by 2-3% on Apple Silicon CPUs. PR: https://github.com/llvm/llvm-project/pull/116072	2025-01-23 18:24:20 +00:00
Kazu Hirata	bb019dd165	[CodeGen] Avoid repeated hash lookups (NFC) (#124078)	2025-01-23 08:46:19 -08:00
Michael Maitland	7db4ba3916	[GlobalMerge][NFC] Fix inaccurate comments (#124136) I was studying the code here and realized that the comments were talking about grouping by basic blocks when the code was grouping by Function. Fix the comments so they reflect what the code is actually doing.	2025-01-23 11:36:53 -05:00
Matt Arsenault	fb3fa41aee	MachineRegisterInfo: Use variable for TRI	2025-01-23 20:29:25 +07:00
Jeremy Morse	cb714e74cc	[DebugInfo][InstrRef] Avoid producing broken DW_OP_deref_sizes (#123967) We use variable locations such as DBG_VALUE $xmm0 as shorthand to refer to "the low lane of $xmm0", and this is reflected in how DWARF is interpreted too. However, InstrRefBasedLDV tries to be smart and interprets such a DBG_VALUE as a 128-bit reference. We then issue a DW_OP_deref_size of 128 bits to the stack, which isn't permitted by DWARF (it's larger than a pointer). Solve this for now by not using DW_OP_deref_size if it would be illegal. Instead we'll use DW_OP_deref, and the consumer will load the variable type from the stack, which should be correct. There's still a risk of imprecision when LLVM decides to use smaller or larger value types than the source-variable type, which manifests as too-little or too-much memory being read from the stack. However, we can't solve that without putting more type information in debug-info. Fixes #64093	2025-01-23 10:47:15 +00:00
Mats Jun Larsen	d7c14c8f97	[IR] Replace of PointerType::getUnqual(Type) with opaque version (NFC) (#123909) Follow up to https://github.com/llvm/llvm-project/issues/123569	2025-01-23 18:23:05 +09:00
Benjamin Maxwell	778138114e	[SDAG] Use BatchAAResults for querying alias analysis (AA) results (#123934) Once we get to SelectionDAG the IR should not be changing anymore, so we can use BatchAAResults rather than AAResults to cache AA queries. This should be an NFC change for targets that enable AA during codegen (such as AArch64), but should also give a nice compile-time improvement in some cases. See: https://github.com/llvm/llvm-project/pull/123787#issuecomment-2606797041 Note: This follows Nikita's suggestion on #123787.	2025-01-23 09:16:09 +00:00
Alan Li	220004d2f8	[GISel] Add more FP opcodes to CSE (#123949) Resubmit; the previous PR had compilation issues.	2025-01-22 23:00:08 -08:00
Mingming Liu	de209fa11b	[CodeGen] Introduce Static Data Splitter pass (#122183) https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744 proposes to partition static data sections. This patch introduces a codegen pass. This patch produces jump table hotness in the in-memory states (machine jump table info and entries). Target-lowering and asm-printer consume the states and produce `.hot` section suffix. The follow-up PR https://github.com/llvm/llvm-project/pull/122215 implements such changes. --------- Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>	2025-01-22 21:06:46 -08:00
Matt Arsenault	15c2d4baf1	PeepholeOpt: Remove check for subreg index on a def operand (#123943) This is looking at operand 0 of a REG_SEQUENCE, which can never have a subregister index.	2025-01-23 09:06:26 +07:00
Matt Arsenault	2646e2d487	PeepholeOpt: Stop allocating tiny helper classes (NFC) (#123936) This was allocating tiny helper classes for every instruction visited. We can just dispatch over the cases in the visitor function instead.	2025-01-23 09:00:08 +07:00
Matt Arsenault	6f69adeed6	PeepholeOpt: Remove null TargetRegisterInfo check (#123933) This cannot happen. Also simplify the LaneBitmask check from !none to any.	2025-01-23 08:57:04 +07:00
Matt Arsenault	23d2a1862a	PeepholeOpt: Remove unnecessary check for null TargetInstrInfo (#123929) This can never happen.	2025-01-23 08:46:59 +07:00
Hua Tian	a9d2834508	[llvm][CodeGen] Fix the issue caused by live interval checking in window scheduler (#123184) In some corner cases, the cloned MI still retains an old slot index, which leads to the compiler crashing. This patch updates the slot index map before deleting the recycled MI. https://github.com/llvm/llvm-project/issues/123165	2025-01-23 09:39:03 +08:00
Ellis Hoag	b1943f40e7	[BranchFolding] Remove getBranchDebugLoc() (#114613)	2025-01-22 09:50:49 -08:00
Craig Topper	9e6494c0fb	[CodeGen] Rename RegisterMaskPair to VRegMaskOrUnit. NFC (#123799) This holds a physical register unit or virtual register and mask. While I was here I've used emplace_back and removed an unneeded use of a template.	2025-01-22 09:11:22 -08:00
Danial Klimkin	c938436f71	Revert "[GISel] Add more FP opcodes to CSE (#123624)" (#123954) This reverts commit 43177b524ee06dfc09cbc357ff277d4f53f5dc15.	2025-01-22 16:21:05 +01:00
lialan	43177b524e	[GISel] Add more FP opcodes to CSE (#123624) This fixes #122724	2025-01-22 06:20:42 -08:00
Sander de Smalen	6b1db79887	Revert "Reland "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" (#123632)" There's a regression with one of the bootstrap builds for x86. I'll revert this while I investigate. This reverts commit 4df6d3df24ae9cff07c70c96a1663cbba6e1dca5.	2025-01-22 10:11:32 +00:00
Sander de Smalen	4df6d3df24	Reland "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" (#123632) This PR aims to reland work done by @arsenm which was previously reverted due to some tangentially related scheduler issues as discussed on #76416. This PR cherry-picks the original commit (0e46b49de433), and adds another patch on top with the following changes: * The code in `updateRegDefsUses` now updates subranges when subreg-liveness-tracking is enabled. * When adding an implicit-def operand for the super-register, the code in `reMaterializeTrivialDef` which tries to remove undefined subranges should now take into account that the lanes from the super-reg are no longer undefined. Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>	2025-01-22 09:07:46 +00:00
Elizaveta Noskova	3088c31699	[llvm] Add NCD search on Array of basic blocks (NFC) (#119355) Shrink-Wrap points split Part 2. RFC: https://discourse.llvm.org/t/shrink-wrap-save-restore-points-splitting/83581 Part 1: https://github.com/llvm/llvm-project/pull/117862 Part 3: https://github.com/llvm/llvm-project/pull/119357 Part 4: https://github.com/llvm/llvm-project/pull/119358 Part 5: https://github.com/llvm/llvm-project/pull/119359	2025-01-22 11:55:02 +03:00
Kazu Hirata	19a7fe03b4	[CodeGen] Avoid repeated hash lookups (NFC) (#123894)	2025-01-22 00:17:55 -08:00
Eli Friedman	d540ebf6cb	[ARM64EC] Avoid emitting unnecessary symbol references with /guard:cf. (#123235) .gfids$y contains a list of indirect calls for Control Flow Guard. This wasn't working properly for ARM64EC: direct calls were being treated as indirect calls. Make sure we correctly filter out direct calls. This improves the protection from Control Flow Guard, and also fixes a link error when using certain functions from oldnames.lib.	2025-01-21 16:29:23 -08:00
Jason Eckhardt	7cf8addc2d	[TLOF][NFC] Make emitLinkerDirectives virtual and public. (#123773) Today, emitLinkerDirectives is private to TLOFCOFF; it isolates parsing and processing of the linker options. Similar processing is also done by other TLOFs inline within emitModuleMetadata. This patch promotes emitLinkerDirectives to a virtual (public) method so that this handling is similarly isolated in the other TLOFs. This also enables downstream targets to override just this handling instead of the whole of emitModuleMetadata.	2025-01-21 18:24:33 -06:00
Vinicius Tadeu Zein	6ab9dafec8	[clang] Implement #pragma clang section on COFF targets (#112714) This patch implements the directive #pragma clang section on COFF targets with the exact same features available on ELF and Mach-O.	2025-01-21 16:12:58 -08:00
Craig Topper	cdd321462a	[TargetLowering] Use getShiftAmountConstant. NFC (#123802) Previously we always used the pointer size which might need to be legalized on some targets.	2025-01-21 12:05:52 -08:00
Matt Arsenault	5e79ae60a6	DAG: Fix vector_shuffle -> splat fold defining undef lanes (#123596) For shuffle vector splats with undef lanes in the mask, this was introducing real values. Filter out build_vector results based on the undef elements in the mask. This avoids AMDGPU test regressions in a future change. test/CodeGen/X86/urem-seteq-illegal-types.ll looks worse but I didn't investigate.	2025-01-21 23:55:50 +07:00
Craig Topper	f5f32cef61	[CodeGen] Use MCRegister instead of MCPhysReg in RegisterMaskPair. NFC (#123688) Update some other places to avoid implicit conversions this introduces, but I probably missed some.	2025-01-21 07:04:35 -08:00
Craig Topper	c3d820553f	[RegAllocFast] Don't convert MCRegUnit to MCRegister. NFC (#123705)	2025-01-21 07:03:23 -08:00
lialan	5d9c717597	[GISel] Fold shifts to constant result. (#123510) This resolves #123212	2025-01-21 05:10:45 -08:00
David Sherwood	50bfa85d79	[DAGCombiner] Fix scalarizeExtractedBinOp for some SETCC cases (#123071) PR https://github.com/llvm/llvm-project/pull/118823 added a DAG combine for extracting elements of a vector returned from SETCC; however, it doesn't correctly deal with the case where the vector element type is not i1. In this case we have to take account of the boolean contents, which are represented differently between vectors and scalars. The code now explicitly performs an inreg sign extend in order to get the same result. Fixes https://github.com/llvm/llvm-project/issues/121372	2025-01-21 10:31:56 +00:00
Kazu Hirata	a588e20280	[SelectionDAG] Avoid repeated hash lookups (NFC) (#123697)	2025-01-21 16:24:49 +08:00
Mikhail Gudim	5cde6d2fdf	[ReachingDefAnalysis][NFC] Replace MCRegister with Register (#123626) This is preparation for extending ReachingDefAnalysis to stack slots. We should use `Register`, not `MCRegister` for something that can be a physical register or a stack slot.	2025-01-21 01:04:18 -05:00
Craig Topper	1434313bd8	[LiveRegMatrix] Use MCRegUnit instead of MCRegister for register unit. NFC MCRegister should be used for registers, not register units.	2025-01-20 10:57:34 -08:00
Kazu Hirata	efae9f3c21	[MIRParser] Avoid repeated map lookups (NFC) (#123561)	2025-01-20 10:15:27 -08:00
Kazu Hirata	bc1e699d9f	[CodeGen] Avoid repeated hash lookups (NFC) (#123557)	2025-01-20 10:13:08 -08:00
Alex MacLean	3606876b67	[SDAG] Fix CSE for ADDRSPACECAST nodes (#122912) Correct CSE in SelectionDAG can make DAG combining more effective and reduce the size of the DAG, and thus should improve compile time.	2025-01-20 09:09:22 -08:00
Mats Jun Larsen	416f1c465d	[IR] Replace of PointerType::get(Type) with opaque version (NFC) (#123617) In accordance with https://github.com/llvm/llvm-project/issues/123569. In order to keep the patch at a reasonable size, this PR only covers the llvm subproject, unittests excluded.	2025-01-21 00:32:56 +09:00
Graham Hunter	d9f165ddea	[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810) Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder. The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.	2025-01-20 12:57:05 +00:00
Akshat Oke	3ace18d5c0	[CodeGen] MachineFunctionSplitter: Add missing initializer (#123564) This registers the pass with PassRegistry so we can use -start-before and other options for machine-function-splitter.	2025-01-20 16:56:46 +05:30
yingopq	754ed95b66	[Mips] Fix compiler crash when returning fp128 after calling a function returning { i8, i128 } (#117525) Fixes https://github.com/llvm/llvm-project/issues/96432.	2025-01-20 16:47:40 +08:00
Hervé Poussineau	be68f35bf5	[MC][CodeGen][Mips] Add CodeView mapping (#120877) Also add support for new relocation types required by debug information. Constants have been taken from CodeView Symbolic Debug Information Specification.	2025-01-20 15:00:24 +08:00
Craig Topper	b7eee2c3fe	[CodeGen] Remove some implicit conversions of MCRegister to unsigned by using(). NFC Many of these are indexing BitVectors or something where we can't use MCRegister and need the register number.	2025-01-19 13:18:04 -08:00
Kazu Hirata	3d15bfb40c	[CodeGen] Avoid repeated hash lookups (NFC) (#123500)	2025-01-19 10:57:25 -08:00

37098 Commits