It is common to have ABI requirements for illegal types: for example,
two i64 argument parts that originally came from an fp128 argument may
have a different call ABI than ones that came from an i128 argument.
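As a rough illustration (the exact lowering is target-dependent), both of
the following declarations may be split into two i64 parts on a 64-bit
target, yet a calling convention may want to assign the parts differently:

```llvm
; Hypothetical 64-bit target: both arguments lower to two i64 parts,
; but the parts that came from the fp128 may need different registers
; or alignment than the parts that came from the i128.
declare void @takes_fp128(fp128)
declare void @takes_i128(i128)
```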
The current calling convention lowering does not provide access to this
information, so backends come up with various hacks to support it (like
additional pre-analysis cached in CCState, or bypassing the default
logic entirely).
This PR adds the original IR type to InputArg/OutputArg and passes it
down to CCAssignFn. It is not actually used anywhere yet; this patch
just makes the mechanical changes to thread the new argument through.
This introduces a new `ptrtoaddr` instruction, which is similar to
`ptrtoint` but has two differences:
1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance
2) `ptrtoaddr` only extracts (and then extends/truncates) the low
index-width bits of the pointer
For most architectures, difference 2) does not matter, since the index
(address) width and the pointer representation width are the same, but it
does make a difference for architectures whose pointers aren't just plain
integer addresses, such as AMDGPU fat pointers or CHERI capabilities.
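A sketch of the difference, assuming a hypothetical target with 128-bit
pointers but a 64-bit index (address) width:

```llvm
; Extracts only the low 64 address bits; does not capture provenance.
%addr = ptrtoaddr ptr %p to i64
; Converts the full 128-bit pointer representation; captures provenance.
%bits = ptrtoint ptr %p to i128
```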
This commit introduces textual and bitcode IR support as well as basic
code generation, but optimization passes do not handle the new instruction
yet, so it may result in worse code than using `ptrtoint`. Follow-up
changes will update capture tracking, etc. for the new instruction.
RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54
Reviewed By: nikic
Pull Request: https://github.com/llvm/llvm-project/pull/139357
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to mark only a prefix of an alloca alive/dead.
We never used that capability, so this removes the need to handle that
possibility everywhere (though many key places, including stack coloring,
did not actually respect it).
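A before/after sketch of the IR:

```llvm
%buf = alloca [16 x i8]
; Before: call void @llvm.lifetime.start.p0(i64 16, ptr %buf)
; After: the size is implied by the alloca:
call void @llvm.lifetime.start.p0(ptr %buf)
```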
The histogram DAG combine went into an infinite loop of creating the
same histogram node due to an incorrect use of the `refineUniformBase`
and `refineIndexType` APIs.
These APIs take SDValues by reference (SDValue&) and return `true` if
they were "refined" (i.e., set to new values).
Previously, this DAG combine would create the `Ops` array (used to
create the new histogram node) before calling the `refine*` APIs.
Building the array copies the SDValues, so the refined values were never
used to create the new histogram node.
Reproducer: https://godbolt.org/z/hsGWhTaqY (it will time out)
AIX has "millicode" routines, which are functions loaded at boot time
into fixed addresses in kernel memory. This allows them to be customized
for the processor. The __memcmp routine is a millicode implementation;
use it for memcmp instead of a library call to improve performance.
The information about whether a specific argument is vararg or fixed is
currently stored separately from all the other argument information in
ArgFlags. This means that it is not accessible from CCAssignFn, and
backends have developed all kinds of workarounds to access it anyway.
Move this information to ArgFlags to make it directly available in all
relevant places.
I've opted to invert the flag and store it as IsVarArg, as I think that
both makes the meaning more obvious and provides a better default
(IsVarArg=false).
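For illustration (a sketch with a hypothetical declaration): in a varargs
call, the flag distinguishes the fixed arguments from the variadic ones.

```llvm
declare void @printf_like(ptr, ...)
; %fmt matches a declared parameter, so IsVarArg=false for it;
; the trailing i32 and double are variadic, so IsVarArg=true.
call void (ptr, ...) @printf_like(ptr %fmt, i32 1, double 2.0)
```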
Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave
functions. It takes as its first argument the callee with the
amdgpu_gfx_whole_wave calling convention, followed by the call
parameters, which must match the signature of the callee except for the
first function argument (the i1 original EXEC mask, which doesn't need
to be passed in). Indirect calls are not allowed.
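A sketch of the intended usage, based on the description above (the names
and the exact call syntax here are illustrative, not taken from the patch):

```llvm
; Callee: the first parameter is the i1 original EXEC mask.
define amdgpu_gfx_whole_wave i32 @wwf(i1 %active, i32 %x) {
  ret i32 %x
}
; Caller: pass the callee plus every parameter except the i1 mask.
%r = call i32 (ptr, ...) @llvm.amdgcn.call.whole.wave(ptr @wwf, i32 %v)
```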
Make direct calls to amdgpu_gfx_whole_wave functions a verifier error.
Unspeakable horrors happen around calls from whole wave functions; the
plan is to improve the handling of caller/callee-saved registers in
a future patch.
Tail calls will also be handled in a future patch.
This change adds support for folding a SETCC when one or both of the
operands is a TRUNCATE with the appropriate no-wrap flags. This pattern
can occur when promoting i8 operations in NVPTX, and we currently have
some ISel rules to try to handle it.
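In IR terms, the pattern looks roughly like this; for equality, nuw on
both truncates is enough to do the compare at the wider width:

```llvm
; Both inputs are known to fit in i8 (trunc nuw), so the i8 compare
; is equivalent to comparing the original i32 values.
%a = trunc nuw i32 %x to i8
%b = trunc nuw i32 %y to i8
%cmp = icmp eq i8 %a, %b
; folds to:
%cmp.wide = icmp eq i32 %x, %y
```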
Now that #146490 has removed the assertion in visitFreeze that the node
is still isGuaranteedNotToBeUndefOrPoison, we no longer need this
reduced-depth hack (which had to account for the difference in depth
between freeze(op()) and op(freeze())).
Helps with some of the minor regressions in #150017.
Similar to InstCombinerImpl::freezeOtherUses, attempt to ensure that we
merge multiple frozen/unfrozen uses of an SDValue. This fixes a number of
hasOneUse() problems when trying to push FREEZE nodes through the DAG.
Remove SimplifyMultipleUseDemandedBits handling of FREEZE nodes as we
now want to keep the common node, and not bypass for some nodes just
because of DemandedElts.
Fixes #149799
If using srl does not produce a legal constant for the RHS of the
final compare, try to use sra instead.
Because the AND constant is negative, the sign bits participate in the
compare; using an arithmetic shift right duplicates the sign bit.
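A sketch of the idea: to test whether the top 8 bits of an i32 are all
ones, both of the following compares work, but the sra form needs only a
-1 on the right-hand side:

```llvm
; (x & 0xFF000000) == 0xFF000000, i.e. the top 8 bits are all ones.
%srl = lshr i32 %x, 24
%c1  = icmp eq i32 %srl, 255   ; 255 may not be a legal compare immediate
; Because the mask is negative, sra duplicates the sign bit:
%sra = ashr i32 %x, 24
%c2  = icmp eq i32 %sra, -1    ; -1 is often cheaper to materialize
```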
Add a new combine to replace
```
(store ch (vselect cond truevec (load ch ptr offset)) ptr offset)
```
with
```
(mstore ch truevec ptr offset cond)
```
This saves a blend operation on targets that support conditional stores.
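A sketch of the IR-level analogue of this transform (the alignment value
is illustrative):

```llvm
; Before: load the old contents, blend, then store unconditionally.
%old = load <4 x i32>, ptr %p
%sel = select <4 x i1> %cond, <4 x i32> %true, <4 x i32> %old
store <4 x i32> %sel, ptr %p
; After: a single masked store, no blend needed.
call void @llvm.masked.store.v4i32.p0(<4 x i32> %true, ptr %p, i32 4, <4 x i1> %cond)
```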
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.
It's worth noting that this does not require any conservative
assumptions: lifetimes with poison arguments can simply be skipped.
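For example, after a deleted pointer has been RAUW'd with poison, passes
can simply skip markers like this one:

```llvm
; A lifetime marker whose argument became poison; it can be ignored.
call void @llvm.lifetime.start.p0(ptr poison)
```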
Fixes https://github.com/llvm/llvm-project/issues/151119.
The optimization introduced by #125637 tried to avoid using the stack
when promoting a bitcast with a vector result type. However, it is not
correct when the input type is also a vector. This patch limits the
optimization to scalar-to-vector bitcasts only.
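Roughly, the distinction is (a sketch):

```llvm
; Scalar-to-vector: still eligible for the non-stack promotion path.
%a = bitcast i64 %s to <4 x i16>
; Vector-to-vector: no longer taken through that path.
%b = bitcast <2 x i32> %v to <4 x i16>
```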
This patch adds two DAG combines:
1. vector_interleave(splat, splat, ...) -> {splat,splat,...}
2. concat_vectors(splat, splat, ...) -> wide_splat
where all the input splats are identical. Together these enable us to fold
concat_vectors(vector_interleave(splat, splat, ...))
into a wide splat. Post-legalisation we must only do the
concat_vectors combine if the wider type and splat operation
are legal.
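For example (a sketch using the fixed-width interleave intrinsic; the
exact mangling may differ):

```llvm
; With %s a <4 x i32> splat of some value %x:
%wide = call <8 x i32> @llvm.vector.interleave2.v8i32(<4 x i32> %s, <4 x i32> %s)
; Interleaving two identical splats produces the same value in every
; lane, so %wide folds to an <8 x i32> splat of %x.
```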
For fixed-width vectors the DAG combine only occurs for
interleave factors of 3 or more; however, it's not currently
safe to test this for AArch64, since there isn't any lowering
support for fixed-width interleaves. I've only added
fixed-width tests for RISCV.
The "at construction" binop folds in SelectionDAG::getNode() has
different behaviour when compared to the equivalent LLVM IR. This PR
makes the behaviour consistent while also extending the coverage to
include signed/unsigned max/min operations.
Fold sequences where we extract a bunch of contiguous bits from a value,
merge them into the low bit, and then check whether the low bits are zero
or not.
Usually the `and` would be on the outside (at the leaves) of the
expression, but the DAG canonicalizes it to a single `and` at the root of
the expression.
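A small instance of the fold (a sketch): testing whether bit 1 or bit 2
of %x is set.

```llvm
; Extract bits 1 and 2, merge them into the low bit, then test it:
%a = lshr i32 %x, 1
%b = lshr i32 %x, 2
%o = or i32 %a, %b
%lo = and i32 %o, 1        ; the single `and` at the root
%c = icmp ne i32 %lo, 0
; folds to a direct mask test of the contiguous bits:
%m = and i32 %x, 6
%c2 = icmp ne i32 %m, 0
```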
The reason I put this in DAGCombiner instead of the target combiner is
that this is a generic, valid transform that's also fairly niche, so I
think there isn't much risk of a combine loop.
See #136727
Split out from https://github.com/llvm/llvm-project/pull/150248:
Specify that the size argument of lifetime.start/lifetime.end is ignored
and will be removed in the future.
Remove lifetime size handling from SDAG. The size was previously
discarded during isel, so it was always ignored for stack coloring anyway.
Where necessary, obtain the size of the full frame index instead.
This adds an LLVM intrinsic for WebAssembly to test the type of a
function. It is intended for a future clang builtin,
`__builtin_wasm_test_function_pointer_signature`, so we can test whether
calling a function pointer will fail with a function signature mismatch.
Since the type of a function pointer is just `ptr`, we can't figure out
the expected type from that.
The way I chose to encode the type is by passing zeros of the
appropriate types to the intrinsic: the first argument gives the
expected return type and the later values give the expected types of
the arguments. So
```llvm
@llvm.wasm.ref.test.func(ptr %func, float 0.000000e+00, double 0.000000e+00, i32 0)
```
tests if `%func` is of type `(double, i32) -> (float)`. It will lower to:
```wat
local.get $func
table.get $__indirect_function_table
ref.test (double, i32) -> (float)
```
To indicate the function should be void, I somewhat arbitrarily picked
`token poison`, so the following tests for `(i32) -> ()`:
```llvm
@llvm.wasm.ref.test.func(ptr %func, token poison, i32 0)
```
To lower this intrinsic, we need some place to put the type information.
With `encodeFunctionSignature()` we encode the signature information
into an `APInt`. We decode it in `lowerEncodedFunctionSignature` in
`WebAssemblyMCInstLower.cpp`.
getNode updates flags correctly for CSE, whereas calling setFlags after
getNode may set flags on a CSE'd node where they don't apply.
I've added a Flags argument to getSelectCC and to the getNode signature
that takes an ArrayRef of EVTs.
This is a partial revert of #145939 (I've kept the BUILD_VECTOR(FREEZE(UNDEF), FREEZE(UNDEF), elt2, ...) canonicalization) as we're getting reports of infinite loops (#148084).
The issue appears to be due to deep chains of nodes and how visitFREEZE replaces all instances of an operand with a common frozen version - other users of the original frozen node then get added back to the worklist but might no longer be able to confirm a node isn't poison due to recursion depth limits on isGuaranteedNotToBeUndefOrPoison.
The issue still exists with the old implementation but by only allowing a single frozen operand it helps prevent cases of interdependent frozen nodes.
I'm still working on supporting multiple operands, as it's critical for
topological DAG handling, but I need to get a fix in for trunk and 21.x.
Fixes #148084
After https://github.com/llvm/llvm-project/pull/149310 we are guaranteed
that the argument is an alloca, so we don't need to look at underlying
objects (which was not a correct thing to do anyway).
This also drops the offset argument for lifetime nodes in SDAG. The
offset is now fixed to zero. (Peculiarly, while SDAG pretended to have
an offset, it was just silently dropped during selection.)