llvm-project

Author	SHA1	Message	Date
Matt Arsenault	253ed52436	DAG: Use poison for some vector result widening (#168290 )	2025-11-19 16:49:43 -05:00
Matt Arsenault	a757c4e74e	CodeGen: Add subtarget to TargetLoweringBase constructor (#168620 ) Currently LibcallLoweringInfo is defined inside of TargetLowering, which is owned by the subtarget. Pass in the subtarget so we can construct LibcallLoweringInfo with the subtarget. This is a temporary step that should be revertable in the future, after LibcallLoweringInfo is moved out of TargetLowering.	2025-11-19 19:18:13 +00:00
Matt Arsenault	0b921f52cc	DAG: Use poison when splitting vector_shuffle results (#168176 )	2025-11-19 12:27:08 -05:00
Ryan Cowan	58e6d02aa2	[AArch64][GlobalISel] Check unmergeSrc is a vector in matchCombineBuildUnmerge (#168692 ) This aims to fix the crash in #168495: my combine rule was missing a check that the source was in fact a vector. This caused the legality check to fail in this example, as the concat was trying to concatenate a non-vector. I have also gated the bitcast of the concat to only work on non-scalable vectors, as the mutation calls `getNumElements`, which crashes when called on a scalable vector. Fixes #168495	2025-11-19 12:30:51 +00:00
陈子昂	e38529ddbb	[DAG] Update canCreateUndefOrPoison to handle ISD::VECTOR_COMPRESS (#168010 ) Fixes #167710	2025-11-19 10:21:05 +00:00
Tom Tromey	1262acf4ec	Introduce DwarfUnit::addBlock helper method (#168446 ) This patch is just a small cleanup that unifies the various spots that add a DWARF expression to the output.	2025-11-18 22:59:36 +00:00
Craig Topper	1157a22134	[GISel] Use getScalarSizeInBits in LegalizerHelper::lowerBitCount (#168584 ) For vectors, CTLZ, CTTZ, CTPOP all operate on individual elements. The lowering should be based on the element width. I noticed this by inspection. No tests in tree are currently affected, but I thought it would be good to fix so someone doesn't have to debug it in the future.	2025-11-18 12:26:47 -08:00
Craig Topper	96e58b83a3	[RISCV] Legalize misaligned unmasked vp.load/vp.store to vle8/vse8. (#167745 ) If vector-unaligned-mem support is not enabled, we should not generate loads/stores that are not aligned to their element size. We already do this for non-VP vector loads/stores. This code has been in our downstream for about a year and a half after finding the vectorizer generating misaligned loads/stores. I don't think that is unique to our downstream. Doing this for masked vp.load/store requires widening the mask as well which is harder to do. NOTE: Because we have to scale the VL, this will introduce additional vsetvli and the VL optimizer will not be effective at optimizing any arithmetic that is consumed by the store.	2025-11-18 11:13:54 -08:00
Hongyu Chen	523bd2df6d	[GISel][RISCV] Compute CTPOP of small odd-sized integer correctly (#168559 ) Fixes the assertion in #168523. This patch lifts the small, odd-sized integer to 8 bits, ensuring that the following lowering code behaves correctly.	2025-11-18 18:49:13 +00:00
Nathan Corbyn	93a8ca8fc7	[AArch64][GISel] Don't crash in known-bits when copying from vectors to non-vectors (#168081 ) Updates the demanded elements before recursing through copies in case the type of the source register changes from a non-vector register to a vector register. Fixes #167842.	2025-11-18 16:42:58 +00:00
Hassnaa Hamdi	3d5d32c605	[CGP]: Optimize mul.overflow. (#148343 ) - Detect cases where LHS & RHS values will not cause overflow (when the high halves are zero).	2025-11-18 13:15:47 +00:00
David Green	4ecfaa602f	[AArch64][GlobalISel] Add better basic legalization for llround. (#168427 ) This adds handling for f16 and f128 lround/llround under LP64 targets, promoting the f16 where needed and using a libcall for f128. This codegen is now identical to the selection dag version.	2025-11-18 12:05:02 +00:00
Sander de Smalen	f369a53d82	[DAGCombiner] Fold select into partial.reduce.add operands. (#167857 ) This generates more optimal codegen when using partial reductions with predication. ``` partial_reduce_mla(acc, sel(p, mul(ext(a), ext(b)), splat(0)), splat(1)) -> partial_reduce_mla(acc, sel(p, a, splat(0)), b) partial.reduce.mla(acc, sel(p, ext(op), splat(0)), splat(1)) -> partial.reduce.*mla(acc, sel(p, op, splat(0)), splat(trunc(1))) ```	2025-11-18 09:49:42 +00:00
Aiden Grossman	472e4ab0b0	[MLGO] Fully Remove MLRegalloc Experimental Features (#168252 ) 20a22a45e96bc94c3a8295cccc9031bd87552725 was supposed to fully remove these, but left around the functionality to actually compute them and a unittest that ensured they worked. These are not development features in the sense of features used in development mode, but experimental features that have been superseded by MIR2Vec.	2025-11-17 10:07:48 -08:00
Ryan Cowan	d65be16ab6	[AArch64][GlobalISel] Add combine for build_vector(unmerge, unmerge, undef, undef) (#165539 ) This PR adds a new combine to the `post-legalizer-combiner` pass. The new combine checks for vectors being unmerged and subsequently padded with `G_IMPLICIT_DEF` values by building a new vector. If such a case is found, the vector being unmerged is instead just concatenated with a `G_IMPLICIT_DEF` that is as wide as the vector being unmerged. This removes unnecessary `mov` instructions in a few places.	2025-11-17 15:55:40 +00:00
David Green	22968f5b4a	[DAG] Add strictfp implicit def reg after metadata. (#168282 ) This prevents a machine verifier error, where it "Expected implicit register after groups". Fixes #158661	2025-11-17 10:57:21 +00:00
Abinaya Saravanan	c946418330	[MachinePipeliner] Detect a cycle in PHI dependencies early on (#167095 ) - This patch detects cycles formed by PHIs and bails out if one is found. - This prevents violating DAG restrictions. Pipelining is aborted in the case below: %1 = phi i32 [ %a, %entry ], [ %3, %loop ] %2 = phi i32 [ %a, %entry ], [ %1, %loop ] %3 = phi i32 [ %b, %entry ], [ %2, %loop ] --------- Co-authored-by: Ryotaro Kasuga <kasuga.ryotaro@fujitsu.com>	2025-11-17 15:28:30 +05:30
pvanhout	853ed3b3b7	[InlineAsmLowering] unsigned -> TypeSize for getTypeStoreSize result	2025-11-17 10:21:43 +01:00
hstk30-hw	51c8180515	[GlobalMerge]Prefer use global-merge-max-offset instead of the target-specific constant offset. (#165591 ) In the Dhrystone benchmark, I found that some adjacent globals were not merged, whereas GCC's anchor optimization works. Using global-merge-max-offset to set the max offset can yield similar results (still slightly different, but at least we can control the offset).	2025-11-17 15:37:51 +08:00
ronlieb	6d5f87fc42	Revert "DAG: Allow select ptr combine for non-0 address spaces" (#168292 ) Reverts llvm/llvm-project#167909	2025-11-16 18:35:51 -05:00
Kazu Hirata	98d49d51c0	[CodeGen] Remove a redundant declaration (NFC) (#168285 ) EnableFSDiscriminator is declared in DebugInfoMetadata.h. Identified with readability-redundant-declaration.	2025-11-16 14:06:18 -08:00
Matt Arsenault	dd9bd3e8f0	DAG: Preserve poison in combineConcatVectorOfScalars (#168220 )	2025-11-16 11:16:34 -08:00
Sergei Barannikov	97a60aa37a	[CodeGen] Turn MCRegUnit into an enum class (NFC) (#167943 ) This changes `MCRegUnit` type from `unsigned` to `enum class : unsigned` and inserts necessary casts. The added `MCRegUnitToIndex` functor is used with `SparseSet`, `SparseMultiSet` and `IndexedMap` in a few places. `MCRegUnit` is opaque to users, so it didn't seem worth making it a full-fledged class like `Register`. Static type checking has detected one issue in `PrologueEpilogueInserter.cpp`, where `BitVector` created for `MCRegister` is indexed by both `MCRegister` and `MCRegUnit`. The number of casts could be reduced by using `IndexedMap` in more places and/or adding a `BitVector` adaptor, but the number of casts per file is still small and `IndexedMap` has limitations, so it didn't seem worth the effort. Pull Request: https://github.com/llvm/llvm-project/pull/167943	2025-11-16 20:46:44 +03:00
Sergei Barannikov	e413343ca7	[SelectionDAG] Verify SDTCisVT and SDTCVecEltisVT constraints (#150125 ) Teach `SDNodeInfoEmitter` TableGen backend to process `SDTypeConstraint` records and emit tables for them. The tables are used by `SDNodeInfo::verifyNode()` to validate a node being created. This PR only adds validation code for `SDTCisVT` and `SDTCVecEltisVT` constraints to keep it smaller. Pull Request: https://github.com/llvm/llvm-project/pull/150125	2025-11-16 18:26:03 +03:00
AZero13	d831f8df52	[SelectionDAG] Fix AArch64 machine verifier bug when expanding LOOP_DEPENDENCE_MASK (#168221 ) TargetConstant nodes don't match TableGen ImmLeaf patterns during instruction selection. When this zero constant flows into the AArch64 CCMP formation code, the machine verifier hits an assertion in expensive checks. Fixes: #168227	2025-11-15 21:12:11 +00:00
Austin	700aa5e376	[revert][CodeGen] add a command to force global merge (#168230 ) sorry, this was my mistake	2025-11-16 03:40:07 +08:00
Austin	3705921f60	[CodeGen] add a command to force global merge I found that in some performance scenarios, such as under O2, this PR can help with a series of global variable loads.	2025-11-16 03:20:27 +08:00
Matt Arsenault	70349c17d3	DAG: Use poison in SplitVecRes_VP_LOAD_FF (#167753 )	2025-11-15 08:48:36 -08:00
Matt Arsenault	33a7bb1f1a	DAG: Use poison when legalizing scalar_to_vector results (#167751 )	2025-11-15 08:47:08 -08:00
Ryan Cowan	f8d65fd874	[AArch64][GlobalISel] Improve lowering of vector fp16 fpext (#165554 ) This PR improves the lowering of vectors of fp16 when using fpext. Previously vectors of fp16 were scalarized leading to lots of extra instructions. Now, vectors of fp16 will be lowered when extended to fp64 via the preexisting lowering logic for extends. To make use of the existing logic, we need to add elements until we reach the next power of 2.	2025-11-14 20:52:51 -08:00
Mikołaj Piróg	e7b41df10e	[SelectionDAGBuilder] Propagate fast-math flags to fpext (#167574 ) As in title. Without this, fpext behaves in selectionDAG as always having no fast-math flags.	2025-11-14 20:50:59 -08:00
Craig Topper	5442aa1853	[RDF] Rename RegisterId field in RegisterRef Reg->Id. NFC (#168154 ) Not all RegisterId values are registers, so Id is a more appropriate name. Use asMCReg() in some places that assumed it was a register.	2025-11-14 18:33:50 -08:00
Sergei Barannikov	4eea157301	[GlobalISel] Return byte offsets from computeValueLLTs (NFC) (#166747 ) To avoid scaling offsets back and forth. This is also what the SelectionDAG equivalent (ComputeValueVTs) does, and will allow reusing ComputeValueTypes with less effort.	2025-11-15 00:23:26 +00:00
Matt Arsenault	862d34666f	opt: Fix bad merge of #167996 (#168110 ) After the base branch was moved to main, this somehow ended up adding a second definition of RTLCI, instead of modifying the existing one. Also fix other build error with gcc bots.	2025-11-14 12:03:26 -08:00
Matt Arsenault	590ab43e8a	RuntimeLibcalls: Move VectorLibrary handling into TargetOptions (#167996 ) This fixes the -fveclib flag getting lost on its way to the backend. Previously this was its own cl::opt with a random boolean. Move the flag handling into CommandFlags with other backend ABI-ish options, and have clang directly set it, rather than forcing it to go through command line parsing. Prior to de68181d7f, codegen used TargetLibraryInfo to find the vector function. Clang has special handling for TargetLibraryInfo, where it would directly construct one with the vector library in the pass pipeline. RuntimeLibcallsInfo currently is not used as an analysis in codegen, and needs to know the vector library when constructed. RuntimeLibraryAnalysis could follow the same trick that TargetLibraryInfo is using in the future, but a lot more boilerplate changes are needed to thread that analysis through codegen. Ideally this would come from an IR module flag, and nothing would be in TargetOptions. For now, it's better for all of these sorts of controls to be consistent.	2025-11-14 11:19:21 -08:00
Craig Topper	7108b12f6b	[RDF] RegisterRef/RegisterId improvements. NFC (#168030 ) RegisterId can represent a physical register, an MCRegUnit, or an index into a side structure that stores register masks. These 3 types were encoded by using the physical reg, stack slot, and virtual register encoding partitions from the Register class. This encoding scheme wasn't well contained, so Register::index2StackSlot and Register::stackSlotIndex appeared in multiple places. This patch gives RegisterRef its own encoding defines and separates it from Register. I've removed the generic idx() method in favor of getAsMCReg(), getAsMCRegUnit(), and getMaskIdx() for some degree of type safety. Some places used the RegisterId field of RegisterRef directly as a register. Those have been updated to use getAsMCReg. Some special cases for RegisterId 0 have been removed as it can be treated like an MCRegister by existing code. I think I want to rename the Reg field of RegisterRef to Id, but I'll do that in another patch. Additionally, callers of the RegisterRef constructor need to be audited for implicit conversions from Register/MCRegister to unsigned.	2025-11-14 10:30:25 -08:00
Pierre van Houtryve	31b7f1fa0b	[GlobalISel] Add support for value/constants as inline asm memory operand (#161501 ) InlineAsmLowering rejected inline assembly with memory reference inputs if the values passed to the inline asm weren't pointers. The DAG lowering however handled them just fine. This patch updates InlineAsmLowering to store such values on the stack, and then use the stack pointer as the "indirect" version of the operand.	2025-11-14 10:34:38 +01:00
Craig Topper	388ef61250	[RegAllocGreedy] Use MCRegister instead of MCPhysReg. NFC (#167974 )	2025-11-13 23:26:35 +00:00
Sergei Barannikov	12edc56f2b	[RegAllocFast] Add helper methods for getting/setting regunit state(NFC) (#167931 ) The methods will help reduce the number of static_casts after changing MCRegUnit to a strong typedef.	2025-11-13 19:34:37 +00:00
Sergei Barannikov	0b5f38894a	[CodeGen] Use VirtRegOrUnit/MCRegUnit in MachineTraceMetrics (NFC) (#167859 )	2025-11-13 19:10:41 +00:00
Matt Arsenault	e5f499f48f	DAG: Allow select ptr combine for non-0 address spaces (#167909 )	2025-11-13 18:58:08 +00:00
Sergei Barannikov	d1cc1376a0	[CodeGen] Add TRI::regunits() iterating over all register units (NFC) (#167901 )	2025-11-13 17:27:35 +00:00
Craig Topper	8d6a1def4d	[SelectionDAGISel] Don't merge input chains if it would put a token factor in the way of a glue. (#167805 ) In the new test, we're trying to fold a load and a X86ISD::CALL. The call has a CopyToReg glued to it. The load and the call have different input chains so they need to be merged. This results in a TokenFactor that gets put between the CopyToReg and the final CALLm instruction. The DAG scheduler can't handle that. The load here was created by legalization of the extract_element using a stack temporary store and load. A normal IR load would be chained into call sequence by SelectionDAGBuilder. This would usually have the load chained in before the CopyToReg. The store/load created by legalization don't get chained into the rest of the DAG. Fixes #63790	2025-11-13 09:25:53 -08:00
Sergei Barannikov	98f9b54376	[CodeGen] Hide SparseSet<LiveRegUnit> behind a typedef (NFC) (#167898 ) So that changing the type of the container (planned in a future patch) is less intrusive.	2025-11-13 20:14:23 +03:00
Craig Topper	0acdbd5d81	[InstrRef] Consistently use MLocTracker::getLocID() before calling lookupOrTrackRegister (#167841 ) The LocID for registers is just the register ID. The getLocID function is supposed to hide this detail, but it wasn't being used consistently. This avoids a bunch of implicit casts from Register or MCRegister to unsigned.	2025-11-13 08:38:35 -08:00
Tomer Shafir	35ffe10349	[opt] Add --save-stats option (#167304 ) This patch adds a Clang-compatible --save-stats option to opt, to provide an easy-to-use way to save LLVM statistics files when working with opt on the middle end. This is a follow-up to the addition to `llc`: https://github.com/llvm/llvm-project/pull/163967 As with Clang, one can specify --save-stats, --save-stats=cwd, and --save-stats=obj with the same semantics and JSON format. The pre-existing --stats option is not affected. The implementation extracts the flag and its methods into the common `CodeGen/CommandFlags` as `LLVM_ABI`, using a new registration class to conservatively enable opt-in rather than let all tools take it. It's only needed for llc and opt for now. Then it refactors llc and adds support for opt.	2025-11-13 16:03:28 +02:00
Simon Pilgrim	a5342d5fe5	Revert "[DAG] Fold (umin (sub a b) a) -> (usubo a b); (select usubo.1 a usubo.0)" (#167854 ) Reverts llvm/llvm-project#161651 due to downstream bad codegen reports	2025-11-13 10:46:38 +00:00
Sergei Barannikov	ef9a02ce02	[CodeGen] Use VirtRegOrUnit where appropriate (NFCI) (#167730 ) Use it in `printVRegOrUnit()`, `getPressureSets()`/`PSetIterator`, and in functions/classes dealing with register pressure. Static type checking revealed several bugs, mainly in MachinePipeliner. I'm not very familiar with this pass, so I left a bunch of FIXMEs. There is one bug in `findUseBetween()` in RegisterPressure.cpp, also annotated with a FIXME.	2025-11-13 10:26:58 +00:00
Craig Topper	99a726ea51	[SelectionDAGISel] Const correct ChainNodesMatched argument to HandleMergeInputChains. NFC (#167807 )	2025-11-12 22:56:57 -08:00
Matt Arsenault	2e489f77ba	CodeGen: Fix CodeView crashes with empty llvm.dbg.cu (#163286 )	2025-11-12 14:59:42 -08:00

38715 Commits