This opcode represents the addition of a pointer value (first operand)
and an integer offset (second operand). PTRADD nodes are only generated
if the TargetMachine opts in by overriding
TargetMachine::shouldPreservePtrArith().
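For illustration, a minimal sketch of a target opting in; the hook name is taken from the description above, but the parameter list is an assumption (so `override` is deliberately omitted) and the real declaration may differ:

```cpp
#include "llvm/CodeGen/ValueTypes.h"
#include "llvm/IR/Function.h"
#include "llvm/Target/TargetMachine.h"

// Hypothetical target opting in to PTRADD generation. Only the hook name
// comes from the text above; the parameter list shown here is assumed.
class MyTargetMachine : public llvm::TargetMachine {
public:
  bool shouldPreservePtrArith(const llvm::Function &F, llvm::EVT PtrVT) const {
    // Keep pointer + offset as ISD::PTRADD instead of a plain ISD::ADD so the
    // pointer operand stays identifiable through SelectionDAG.
    return true;
  }
};
```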
The PTRADD node and respective visitPTRADD() function were adapted by
@rgwott from the CHERI/Morello LLVM tree.
Original authors: @davidchisnall, @jrtc27, @arichardson.
The changes in this PR were extracted from PR #105669.
---------
Co-authored-by: David Chisnall <github@theravensnest.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Co-authored-by: Alexander Richardson <alexrichardson@google.com>
Co-authored-by: Rodolfo Wottrich <rodolfo.wottrich@arm.com>
Adds support for operand promotion and splitting/widening the result
of the ISD::GET_ACTIVE_LANE_MASK node.
For AArch64, shouldExpandGetActiveLaneMask now returns false for more
types which we know can be legalised.
This adds [de]interleave intrinsics for factors 4, 6 and 8, so that every
interleaved memory operation supported by the in-tree targets can be
represented by a single intrinsic.
For context, [de]interleaves of fixed-length vectors are represented by
a series of shufflevectors. The intrinsics are needed for scalable
vectors, and we don't currently scalably vectorize all possible factors
of interleave groups supported by RISC-V/AArch64.
The underlying reason for this is that higher factors are currently
represented by interleaving multiple interleaves themselves, which made
sense at the time of the discussion in
https://github.com/llvm/llvm-project/pull/89018.
But after trying to integrate these for higher factors on RISC-V I think
we should revisit this design choice:
- Matching these in InterleavedAccessPass is non-trivial: We currently
only support factors that are a power of 2, and detecting this requires
a good chunk of code
- The shufflevector masks used for [de]interleaves of fixed-length
vectors are much easier to pattern match, as they are strided patterns,
but the intrinsics are much harder to match because the structure is a
tree (see the sketch at the end of this note).
- Unlike shufflevectors, there's no optimisation that happens on
[de]interleave2 intrinsics
- For non-power-of-2 factors, e.g. 6, there are multiple possible ways a
[de]interleave could be represented; see the discussion in #139373
- We already have intrinsics for factors 2, 3, 5 and 7, so by avoiding
4, 6 and 8 we're not really saving much
By representing these higher factors as interleaves of interleaves, we can
in theory support arbitrarily high interleave factors. However, I'm not
sure this is actually needed in practice: SVE only has instructions
for factors 2, 3 and 4, whilst RVV only supports up to factor 8.
This patch would make it much easier to support scalable interleaved
accesses in the loop vectorizer for RISC-V for factors 3, 5, 6 and 7, as
the loop vectorizer and InterleavedAccessPass wouldn't need to
construct and match trees of interleaves.
For interleave factors above 8, for which there are no hardware memory
operations to match in the InterleavedAccessPass, we can still keep the
wide load + recursive interleaving in the loop vectorizer.
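For illustration, here is a sketch (not in-tree code) of roughly how a factor-4 deinterleave has to be built today as a tree of llvm.vector.deinterleave2 calls via IRBuilder; the helper and exact leaf ordering shown are assumptions:

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"

using namespace llvm;

// Sketch only: split a wide vector into its 4 strided sub-vectors with a tree
// of llvm.vector.deinterleave2 calls, i.e. the shape a single factor-4
// intrinsic would replace.
static void deinterleave4AsTree(IRBuilder<> &B, Value *Wide,
                                SmallVectorImpl<Value *> &Parts) {
  // First level: lanes 0,2,4,... and 1,3,5,...
  Value *L0 = B.CreateIntrinsic(Intrinsic::vector_deinterleave2,
                                {Wide->getType()}, {Wide});
  Value *Even = B.CreateExtractValue(L0, 0);
  Value *Odd = B.CreateExtractValue(L0, 1);
  // Second level: stride-4 groups within each half.
  for (Value *Half : {Even, Odd}) {
    Value *L1 = B.CreateIntrinsic(Intrinsic::vector_deinterleave2,
                                  {Half->getType()}, {Half});
    Parts.push_back(B.CreateExtractValue(L1, 0));
    Parts.push_back(B.CreateExtractValue(L1, 1));
  }
  // Parts now holds the stride groups in {0, 2, 1, 3} order; recognising (and
  // re-ordering) this tree in InterleavedAccessPass is part of what makes
  // higher factors non-trivial to match today.
}
```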
These are identified by misc-include-cleaner. I've filtered out those
that break builds. I'm also staying away from llvm-config.h, config.h,
and Compiler.h, since removing them is likely to cause platform- or
compiler-specific build failures.
Widening the mask and padding with zeros doesn't work for scalable
vectors. Using VL produces less code for fixed vectors.
Something similar was recently done for MLOAD.
We can use the offset from the original store instead of creating
a new undef offset.
We didn't check whether the offset was already undef, so we really
shouldn't drop it if it isn't.
Padding the mask using 0 elements doesn't work for scalable vectors. Use
VP_LOAD and change the VL instead.
This fixes a crash for Zve32x. The test file was split since i64 isn't a valid
element type for Zve32x.
Fixes #140198.
Prevent LowerFunnelShift from creating an invalid ISD::FSHR when
lowering "ISD::FSHL X, Y, 0". Such inputs are rare because it's a NOP
that DAGCombiner will optimise away. However, we should not rely on this
and so this PR mirrors the same optimisation.
Ensure LowerFunnelShift normalises constant shift amounts because isel
rules expect them to be in the range [0, src bit length).
NOTE: To simplify testing, this PR also adds a command line option to
disable the DAG combiner (-combiner-disabled).
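A rough sketch of the kind of guard described, assuming a small helper on the FSHL lowering path (illustrative only, not the in-tree code):

```cpp
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// Sketch: normalise a constant FSHL amount so isel never sees an
// out-of-range FSHR amount.
static SDValue lowerFSHLWithConstAmt(SDValue X, SDValue Y, SDValue Amt, EVT VT,
                                     const SDLoc &DL, SelectionDAG &DAG) {
  unsigned BW = VT.getScalarSizeInBits();
  if (auto *C = dyn_cast<ConstantSDNode>(Amt)) {
    uint64_t ShAmt = C->getZExtValue() % BW;
    // FSHL X, Y, 0 is just X; emitting "FSHR X, Y, BW" here would be invalid.
    if (ShAmt == 0)
      return X;
    // Otherwise FSHL X, Y, C is equivalent to FSHR X, Y, BW - C, and BW - C
    // is already in the range [1, BW) that the isel rules expect.
    return DAG.getNode(ISD::FSHR, DL, VT, X, Y,
                       DAG.getConstant(BW - ShAmt, DL, Amt.getValueType()));
  }
  return SDValue(); // non-constant amounts are handled elsewhere
}
```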
The load not being included in the chain meant that it could materialize
after a `@llvm.lifetime.end` annotation on the pointer. This could
result in miscompiles if the stack slot is reused for another value.
Fixes https://github.com/llvm/llvm-project/issues/140491
Of the 128 bits of a buffer descriptor, only 48 are address bits, so
following the discussion on https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54,
the logical conclusion is to set the index width to 48 bits instead of
the current value of 128.
Most of the test changes are mechanical datalayout updates, but there
is one actual change: the ptrmask test now uses .i48 instead of .i128
and I had to update SelectionDAGBuilder to correctly extend the mask.
Reviewed By: krzysz00
Pull Request: https://github.com/llvm/llvm-project/pull/139419
Added APInt::clearBits(unsigned loBit, unsigned hiBit), which clears the bits within a given range.
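A small usage sketch; the half-open [loBit, hiBit) convention is assumed here by analogy with APInt::setBits:

```cpp
#include "llvm/ADT/APInt.h"

using namespace llvm;

// Assumed semantics: clearBits(loBit, hiBit) clears the half-open range
// [loBit, hiBit), mirroring APInt::setBits.
void clearBitsExample() {
  APInt V = APInt::getAllOnes(32); // 0xFFFFFFFF
  V.clearBits(8, 16);              // clear bits 8..15 -> 0xFFFF00FF
}
```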
Fixes #136550
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
Fold (srl (lop x, (shl (zext y), c1)), c1) -> (lop (srl x, c1), (zext y)) where c1 <= leadingzeros(zext(y)).
This is equivalent to the existing fold chain (srl (shl (zext y), c1), c1) -> (and (zext y), mask) -> (zext y), but the logical op in the middle prevents it from combining.
Profit: it reduces the number of instructions.
Original commit: #138290 / bbc5221
Previously reverted due to a conflict in a LIT test: mainline changed the
default form of the load instruction to the untyped version in #137698.
The updated test uses `ld.param.b64` instead of `ld.param.u64`.
For now, expansion still happens in SelectionDAGBuilder when
GET_ACTIVE_LANE_MASK is not legal on the target.
This patch also includes changes in AArch64ISelLowering to replace the
handling of the get.active.lane.mask intrinsic with the ISD node.
Tablegen patterns are added that match whilelo for scalable types.
A follow-up change will add support for more types to be lowered to
GET_ACTIVE_LANE_MASK by allowing splitting of the node.
It is used to mark a value that we are sure is not of some fcType.
Examples include:
* an argument of a function marked with nofpclass
* the output value of an intrinsic that is known not to be of some type
This lets the following operations make some assumptions.
Fold `(srl (lop x, (shl (zext y), c1)), c1) -> (lop (srl x, c1), (zext y))` where c1 <= leadingzeros(zext(y)).
This is equivalent to the existing fold chain `(srl (shl (zext y), c1), c1) -> (and (zext y), mask) -> (zext y)`, but the logical op in the middle prevents it from combining.
Profit: it reduces the number of instructions.
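A worked instance of the fold in plain C++ (illustrative only: lop = or, c1 = 16, and y zero-extended from i8 so the leading-zeros condition holds):

```cpp
#include <cstdint>

// Both functions compute the same value for every input, which is what makes
// the fold valid when c1 <= leadingzeros(zext(y)).
uint32_t shiftFoldBefore(uint32_t x, uint8_t y) {
  return (x | (uint32_t(y) << 16)) >> 16; // (srl (or x, (shl (zext y), 16)), 16)
}
uint32_t shiftFoldAfter(uint32_t x, uint8_t y) {
  return (x >> 16) | uint32_t(y);         // (or (srl x, 16), (zext y))
}
```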
---------
Signed-off-by: Alexander Peskov <apeskov@nvidia.com>
Preserving the original result element type when splitting vector setcc
operations removes redundant extensions that are awkward to optimise
after the fact.
Proposed by 2ed15984b4:
`fshl X, (or X, Y), C ==/!= 0 --> or (srl Y, BW-C), X ==/!= 0`
This transformation is valid when (C % Bitwidth) != 0, as verified by
[Alive2](https://alive2.llvm.org/ce/z/TQYM-m).
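A plain-C++ rendering of the two sides (illustrative, with BW = 32 and C = 8), showing why the predicates agree:

```cpp
#include <cstdint>

// fshl for 32-bit values with a shift amount in (0, 32).
static uint32_t fshl32(uint32_t X, uint32_t Z, unsigned C) {
  return (X << C) | (Z >> (32 - C));
}

// Instance of the fold with C = 8: the two predicates agree for all X and Y,
// matching the Alive2-verified equivalence above.
bool fshlCmpBefore(uint32_t X, uint32_t Y) {
  return fshl32(X, X | Y, 8) == 0; // fshl X, (or X, Y), 8 == 0
}
bool fshlCmpAfter(uint32_t X, uint32_t Y) {
  return ((Y >> 24) | X) == 0;     // or (srl Y, 32 - 8), X == 0
}
```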
Fixes #136746
As with the recently added subvector variants, provide the unsigned
index operand to simplify a bunch of code.
---------
Co-authored-by: Luke Lau <luke_lau@icloud.com>
Follow-up to 6e654caab: use the new routines in more places. Note that
I've excluded from this patch any case which uses a getConstant index
instead of a getVectorIdxConstant index just to minimize room for
error. I'll get those in a separate follow-up.
Mechanical change to introduce the new wrappers, and add enough users to
make the usage pattern clear. Once this lands, I'm going to do a further
pass to adjust more callsites as separate changes.
---------
Co-authored-by: Luke Lau <luke_lau@icloud.com>
Migrate their usage to the `AnyMem*Inst` family, and add an isAtomic()
query on the base class for that hierarchy. This matches the idioms we
use for e.g. isAtomic on load, store, etc. instructions, the existing
isVolatile idioms on mem* routines, and allows us to more easily share
code between atomic and non-atomic variants.
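A sketch of the intended usage pattern; the isAtomic() name is taken from the description above and should be treated as assumed rather than the final API:

```cpp
#include "llvm/IR/IntrinsicInst.h"

using namespace llvm;

// One code path for both plain memcpy and element-wise atomic memcpy,
// distinguished by the new base-class query (name assumed from the text).
static bool isAtomicMemTransfer(const Instruction &I) {
  if (const auto *MC = dyn_cast<AnyMemCpyInst>(&I))
    return MC->isAtomic();
  return false;
}
```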
As with #138568, the goal here is to simplify the class hierarchy and
make it easier to reason about. I'm moving from easiest to hardest, and
will stop at some point when I hit "good enough". Longer term, I'd sorta
like to merge or reverse the naming on the plain Mem*Inst and the
AnyMem*Inst, but that's a much larger and more risky change. Not sure
I'm going to actually do that.
Generic DAG combine for ISD::PARTIAL_REDUCE_U/SMLA to convert:
PARTIAL_REDUCE_*MLA(Acc, ZEXT(UnextOp1), Splat(1)) into
PARTIAL_REDUCE_UMLA(Acc, UnextOp1, TRUNC(Splat(1))) and
PARTIAL_REDUCE_*MLA(Acc, SEXT(UnextOp1), Splat(1)) into
PARTIAL_REDUCE_SMLA(Acc, UnextOp1, TRUNC(Splat(1))).
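A rough sketch of this combine (opcode and operand names follow the text above; the actual in-tree combine will have additional legality and type checks):

```cpp
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// Sketch: PARTIAL_REDUCE_*MLA(Acc, EXT(Op1), Splat(1))
//           -> PARTIAL_REDUCE_[US]MLA(Acc, Op1, Splat(1) of the narrow type)
static SDValue combinePartialReduceMLASketch(SDNode *N, SelectionDAG &DAG) {
  SDLoc DL(N);
  SDValue Acc = N->getOperand(0);
  SDValue Op1 = N->getOperand(1);
  SDValue Op2 = N->getOperand(2);
  // Only handle the multiplicand-is-splat-of-1 case described above.
  APInt C;
  if (!ISD::isConstantSplatVector(Op2.getNode(), C) || !C.isOne())
    return SDValue();
  unsigned ExtOpc = Op1.getOpcode();
  if (ExtOpc != ISD::ZERO_EXTEND && ExtOpc != ISD::SIGN_EXTEND)
    return SDValue();
  SDValue Unext = Op1.getOperand(0);
  unsigned NewOpc = ExtOpc == ISD::ZERO_EXTEND ? ISD::PARTIAL_REDUCE_UMLA
                                               : ISD::PARTIAL_REDUCE_SMLA;
  // Splat(1) re-materialised at the unextended element type (the TRUNC above).
  SDValue NarrowOne = DAG.getConstant(1, DL, Unext.getValueType());
  return DAG.getNode(NewOpc, DL, N->getValueType(0), Acc, Unext, NarrowOne);
}
```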
---------
Co-authored-by: James Chesterman <james.chesterman@arm.com>
This is a follow-up to c0a264e, but note that there is a functional
difference here: the root changes for the memcpy.inline case. This
difference appears to have been accidental, but I kept this back to
facilitate separate review in case there's something I'm missing here.
This is a reland of #138434 except that:
- the bits for llvm/lib/CodeGen/RenameIndependentSubregs.cpp
have been dropped because they caused a test failure under asan, and
- the bits for llvm/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp have
been improved with structured bindings.
This reverts commit a9699a334bc9666570418a3bed9520bcdc21518b.
Breaks CodeGen/AMDGPU/collapse-endcf.ll in several configs
(sanitizer builds; macOS; possibly more), see comments on
https://github.com/llvm/llvm-project/pull/138434