llvm-project

Author	SHA1	Message	Date
AZero13	dcea5f1f38	[TargetLowering] Fold (a \| b) ==/!= b -> (a & ~b) ==/!= 0 when and-not exists (#145368 ) This is especially helpful for AArch64, which simplifies ands + cmp to tst. Alive2: https://alive2.llvm.org/ce/z/LLgcJJ --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-06-27 14:47:52 +01:00
Matt Arsenault	7255c3aee3	DAG: Check libcall function is supported before emission (#144314 )	2025-06-27 18:09:04 +09:00
Björn Pettersson	fd3cc204de	[SelectionDAG] Fold undemanded operand to UNDEF for VECTOR_SHUFFLE (#145524 ) Always let SimplifyDemandedVectorElts fold either side of a VECTOR_SHUFFLE to UNDEF if no elements are demanded from that side. For a single use this could be done by SimplifyDemandedVectorElts already, but in case the operand had multiple uses we did not eliminate the use.	2025-06-25 16:05:54 +02:00
Iris Shi	f2eb5d416e	[SelectionDAG] Handle `fneg`/`fabs`/`fcopysign` in `SimplifyDemandedBits` (#139239 )	2025-06-22 22:48:59 +08:00
Paul Walker	68732ce8e0	[LLVM][CodeGen][SVE] Add isel for bfloat unordered reductions. (#143540 ) The omissions are VECREDUCE_SEQ_* and MUL. The former goes down a different code path and the latter is unsupported across all element types.	2025-06-20 11:46:25 +01:00
Matt Arsenault	97bfb936af	DAG: Move soft float predicate management into RuntimeLibcalls (#142905 ) Work towards making RuntimeLibcalls the centralized location for all libcall information. This requires changing the encoding from tracking the ISD::CondCode to using CmpInst::Predicate.	2025-06-17 09:42:53 +09:00
Matt Arsenault	505c550e4c	DAG: Assert fcmp uno runtime calls are boolean values (#142898 ) This saves 2 instructions in the ARM soft float case for fcmp ueq. This code is written in a confusingly over-general way. The point of getCmpLibcallCC is to express that the compiler-rt implementations of the FP compares are different aliases around functions which may return -1 in some cases. This does not apply to the call for unordered, which returns a normal boolean. Also stop overriding the default value for the unordered compare for ARM. This was setting it to the same value as the default, which is now assumed.	2025-06-10 10:46:29 +09:00
Philip Reames	939666380f	[SDAG] Add partial_reduce_sumla node (#141267 ) We have recently added the partial_reduce_smla and partial_reduce_umla nodes to represent Acc += ext(a) * ext(b) where the two extends have to have the same source type, and have the same extend kind. For riscv64 w/zvqdotq, we have the vqdot and vqdotu instructions which correspond to the existing nodes, but we also have vqdotsu which represents the case where the two extends are sign and zero respectively (i.e. not the same type of extend). This patch adds a partial_reduce_sumla node which has sign extension for A, and zero extension for B. The addition is somewhat mechanical.	2025-06-09 07:17:45 -07:00
Nikita Popov	d74831efeb	Revert "[SDAG] Fix fmaximum legalization errors (#142170 )" This reverts commit 58cc1675ec7b4aa5bc2dab56180cb7af1b23ade5. I also made the incorrect assumption that we know both values are +/-0.0 here as well. Revert for now.	2025-06-04 14:35:30 +02:00
Nikita Popov	42605b8aa3	Revert "[SelectionDAG] Avoid one comparison when legalizing fmaximum (#142732 )" This reverts commit 54da543a14da6dd0e594875241494949cb659b08. I made a logic error here with the assumption that both values are known to be +/-0.0.	2025-06-04 14:22:19 +02:00
Nikita Popov	54da543a14	[SelectionDAG] Avoid one comparison when legalizing fmaximum (#142732 ) When ordering signed zero, only check the sign of one of the values. We already know at this point that both values must be +/-0.0, so it is sufficient to check one of them to correctly order them. For example, for fmaximum, if we know LHS is `+0.0` then we can always select LHS, value of RHS does not matter. If LHS is `-0.0` we can always select RHS, value of RHS doesn't matter.	2025-06-04 10:41:30 +02:00
YunQiang Su	bd831372b2	expandFMINIMUMNUM_FMAXIMUMNUM: Quiet is not needed for NaN vs NaN (#139237 ) New LangRef doesn't require quieting for NaN vs NaN, aka the result may be sNaN for sNaN vs NaN. See: https://github.com/llvm/llvm-project/pull/139228	2025-06-04 08:20:48 +08:00
Nikita Popov	58cc1675ec	[SDAG] Fix fmaximum legalization errors (#142170 ) FMAXIMUM is currently legalized via IS_FPCLASS for the signed zero handling. This is problematic, because it assumes the equivalent integer type is legal. Many targets have legal fp128, but illegal i128, so this results in legalization failures. Fix this by replacing IS_FPCLASS with checking the bitcast to integer instead. In that case it is sufficient to use any legal integer type, as we're just interested in the sign bit. This can be obtained via a stack temporary cast. There is existing FloatSignAsInt functionality used for legalization of FABS and similar we can use for this purpose. Fixes https://github.com/llvm/llvm-project/issues/139380. Fixes https://github.com/llvm/llvm-project/issues/139381. Fixes https://github.com/llvm/llvm-project/issues/140445.	2025-06-02 10:14:33 +02:00
Tim Gymnich	760bf4f116	[GISel] Add KnownFPClass Analysis to GISelValueTrackingPass (#134611 ) - add KnownFPClass analysis to GISelValueTrackingPass - add MI pattern for `m_GIsFPClass`	2025-05-23 14:38:51 +02:00
Craig Topper	ee4002da2b	[TargetLowering] Use getExtractSubvector/getExtractVectorElt. NFC	2025-05-21 12:06:54 -07:00
Liam Semeria	d067014f13	[APInt] Added APInt::clearBits() method (#137098 ) Added APInt::clearBits(unsigned loBit, unsigned hiBit) that clears bits within a certain range. Fixes #136550 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-05-19 12:41:04 +01:00
Craig Topper	dcd62f3674	[SelectionDAG] Rename MemSDNode::getOriginalAlign to getBaseAlign. NFC (#139930 ) This matches the underlying function in MachineMemOperand and how it is printed when BaseAlign differs from Align.	2025-05-16 09:37:02 -07:00
Kazu Hirata	18ecff4f65	[llvm] Use llvm::stable_sort (NFC) (#140067 )	2025-05-15 12:18:18 -07:00
Matt Arsenault	2f9323bc5b	DAG: Stop forcibly adding nsz to expanded minnum/maxnum (#139615 )	2025-05-13 07:37:21 +02:00
Rux124	ef40ae4f4e	[SelectionDAG] Fix incorrect fold condition in foldSetCCWithFunnelShift. (#137637 ) Proposed by 2ed15984b4: `fshl X, (or X, Y), C ==/!= 0 --> or (srl Y, BW-C), X ==/!= 0` This transformation is valid when (C%Bitwidth) != 0, as verified by Alive2: https://alive2.llvm.org/ce/z/TQYM-m Fixes #136746	2025-05-12 13:25:07 +08:00
Kazu Hirata	c51a3aa6ce	[llvm] Remove unused local variables (NFC) (#138467 )	2025-05-04 13:05:18 -07:00
Kazu Hirata	47f391fd0e	[CodeGen] Remove unused local variables (NFC) (#138441 )	2025-05-04 00:26:37 -07:00
Simon Pilgrim	a99e055030	[DAG] shouldReduceLoadWidth - add optional<unsigned> byte offset argument (#136723 ) Based off feedback for #129695 - we need to be able to determine the load offset of smaller loads when trying to determine whether a multiple use load should be split (in particular for AVX subvector extractions). This patch adds a std::optional<unsigned> ByteOffset argument to shouldReduceLoadWidth calls for where we know the constant offset to allow targets to make use of it in future patches.	2025-04-23 12:30:27 +01:00
Sergei Barannikov	11a3de7e98	[SDag][ARM][RISCV] Allow lowering CTPOP into a libcall (#101786 ) This is a reland of #99752 with the bug fixed (see test diff in the third commit in this PR). All `popcount` libcalls return `int`, but `ISD::CTPOP` returns the type of the argument, which can be wider than `int`. The fix is to make DAG legalizer pass the correct return type to `makeLibCall` and sign-extend the result afterwards. Original commit message: The main change is adding CTPOP to `RuntimeLibcalls.def` to allow targets to use LibCall action for CTPOP. DAG legalizers are changed accordingly. Pull Request: https://github.com/llvm/llvm-project/pull/101786	2025-04-23 12:43:05 +03:00
Simon Pilgrim	64ffecfc43	[DAG] isKnownNeverNaN - add DemandedElts element mask to isKnownNeverNaN calls (#135952 ) Matches what we've done for computeKnownBits etc. to improve vector handling	2025-04-18 09:24:02 +01:00
Reid Kleckner	2538c607e9	[CodeGen] Prune headers and move code out of line for build efficiency, NFC (#135622 ) I noticed these destructors taking time with -ftime-trace and moved some of them for minor build efficiency improvements. The main impact of moving destructors out of line is that container fields no longer require their element types to be complete, i.e. one can have uptr<T> or vector<T> as a field with an incomplete type T, and that means we can reduce transitive includes, as with LegalizerInfo.h. Move expensive getDebugOperandsForReg template out-of-line. The std::function instantiation shows up in time trace even if you don't use the function.	2025-04-14 22:23:18 -07:00
Jay Foad	344a491dad	[CodeGen] Simplify expandRoundInexactToOdd (#134988 ) FP_ROUND and FP_EXTEND the input value before FABSing it. This avoids some bit twiddling to copy the sign bit from the input to the result. It does introduce one extra FABS, but that is folded into another instruction for free on AMDGPU, which is the only target currently affected by this change.	2025-04-10 09:45:38 +01:00
David Green	6c27817294	[SelectionDAG] Use SimplifyDemandedBits from SimplifyDemandedVectorElts Bitcast. (#133717 ) This adds a call to SimplifyDemandedBits from bitcasts with scalar input types in SimplifyDemandedVectorElts, which can help simplify the input scalar.	2025-04-03 11:14:08 +01:00
Tim Gymnich	1d0005a69a	[GlobalISel][NFC] Rename GISelKnownBits to GISelValueTracking (#133466 ) - rename `GISelKnownBits` to `GISelValueTracking` to analyze more than just `KnownBits` in the future	2025-03-29 11:51:29 +01:00
Benjamin Maxwell	a5a162cd71	[SDAG] Pass pointer type to libcall expansion for SoftenFloatRes stack slots (#130647 ) Solution for: https://github.com/llvm/llvm-project/pull/129264#issuecomment-2710079843	2025-03-13 10:30:10 +00:00
Fangrui Song	0c5d709301	Move MIPS-specific GPRel32Directive and EK_GPRel32BlockAddress from generic code to Mips/ Follow-up to 60486292b79885b7800b082754153202bef5b1f0 gprel/gprel64 functions can now be moved from MCTargetStreamer to MipsTargetStreamer.	2025-03-02 15:37:55 -08:00
Matt Arsenault	37c341df28	Revert "AMDGPU: Don't canonicalize fminnum/fmaxnum if targets support IEEE fminimum(maximum)_num (#127711 )" This reverts commit 36eaf0daf5d6dd665d7c7a9ec38ea22f27709fed. This is not a sound approach to dealing with this instruction change. The new behavior is a different opcode pair, not a modifier on the existing opcode.	2025-02-20 10:19:14 +07:00
Changpeng Fang	36eaf0daf5	AMDGPU: Don't canonicalize fminnum/fmaxnum if targets support IEEE fminimum(maximum)_num (#127711 ) For targets that support IEEE fminimum_num/fmaximum_num, the corresponding _min_num_fXY/_max_num_fXY instructions themselves already did the canonicalization for the inputs. As a result, we do not need to explicitly canonicalize the inputs for fminnum/fmaxnum.	2025-02-19 11:16:43 -08:00
James Chesterman	d4a0848dc6	[SelectionDAG] Add PARTIAL_REDUCE_U/SMLA ISD Nodes (#125207 ) Add signed and unsigned PARTIAL_REDUCE_MLA ISD nodes. Add command line argument (aarch64-enable-partial-reduce-nodes) that indicates whether the intrinsic experimental_vector_partial_reduce_add will be transformed into the new ISD node. Lowering with the new ISD nodes will, for now, always be done as an expand.	2025-02-18 09:08:47 +00:00
Matt Arsenault	c55a7659b3	DAG: Move scalarizeExtractedVectorLoad to TargetLowering (#122670 ) SimplifyDemandedVectorElts should be able to use this on loads	2025-02-04 17:37:12 +07:00
David Green	cae0d67cba	[AArch64][SDAG] Detect non-zeroes in truncating buildvectors in fshl lowering (#123597 ) A BUILD_VECTOR can implicitly shrink the bits of the operands if the operand types are not legal. For example a v8i16 constant BUILD_VECTOR might be represented as v8i16 BUILDVECTOR(i32 1, i32 2, ...). Unfortunately this means that the constants are not accepted by matchUnaryPredicateImpl, preventing in this case funnel shifts detecting that all the operands are non-zero. Add a flag to help it match.	2025-02-03 10:47:45 +00:00
Craig Topper	d839e765f0	[TargetLowering] Inline the only caller of one of the forceExpandWideMUL functions. NFC This caller does not need the libcall portion so it can directly call forceExpandMultiply.	2025-01-27 17:10:37 -08:00
Craig Topper	4bcd8184a0	[TargetLowering] Pull similar code out of the forceExpandWideMUL into a helper. NFC (#124371 ) These functions have similar code. One of them calculates the 2x width full product from 2 sources. The other calculates the product from 2 sources that have low and high halves. This patch introduces a new function that takes HiLHS and HiRHS as optional values. If they are not null, they will be used in the calculation of the Hi half. The Signed flag can only be set when HiLHS/HiRHS are null.	2025-01-25 10:53:01 -08:00
Craig Topper	e30a4fc3e2	[TargetLowering] Improve one signature of forceExpandWideMUL. (#123991 ) We have two forceExpandWideMUL functions. One takes the low and high half of 2 inputs and calculates the low and high half of their product. This does not calculate the full 2x width product. The other signature takes 2 inputs and calculates the low and high half of their full 2x width product. Previously it did this by sign/zero extending the inputs to create the high bits and then calling the other function. We can instead copy the algorithm from the other function and use the Signed flag to determine whether we should do SRA or SRL. This avoids the need to multiply the high part of the inputs and add them to the high half of the result. This improves the generated code for signed multiplication. This should improve the performance of #123262. I don't know yet how close we will get to gcc.	2025-01-23 12:49:35 -08:00
Craig Topper	cdd321462a	[TargetLowering] Use getShiftAmountConstant. NFC (#123802 ) Previously we always used the pointer size which might need to be legalized on some targets.	2025-01-21 12:05:52 -08:00
Graham Hunter	d9f165ddea	[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810 ) Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder. The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.	2025-01-20 12:57:05 +00:00
Craig Topper	e2449f1bce	[SelectionDAG] Use SDNode::op_iterator instead of SDNodeIterator. NFC (#122147 ) I think SDNodeIterator primarily exists because GraphTraits requires an iterator that dereferences to SDNode. op_iterator dereferences to SDUse which is implicitly convertible to SDValue. This piece of code can use SDValue instead of SDNode* so we should prefer to use the more common op_iterator.	2025-01-09 09:09:55 -08:00
Simon Pilgrim	112793a90e	[DAG] expandUINT_TO_FP - use getShiftAmountConstant helper. NFC. Don't bother with separate getShiftAmountTy/getConstant calls.	2025-01-06 18:49:50 +00:00
Alex MacLean	6018820c48	[NVPTX] Fix lowering of i1 SETCC (#115035 ) Add DAG legalization support for expanding i1 SETCC nodes using appropriate logical operations to simulate integer comparisons. Use these expansions to handle i1 SETCC in NVPTX. fixes #58428 and #57405	2024-12-05 12:54:24 -08:00
Simon Pilgrim	b1a48af56a	[DAG] SimplifyDemandedVectorElts - add handling for INT<->FP conversions (#117884 )	2024-12-04 07:37:01 +00:00
Craig Topper	b076fbb844	[TargetLowering] Use Type* instead of EVT in shouldSignExtendTypeInLibCall. (#118587 ) I want to use this function for GISel too so Type * is a better common interface. All of the callers already convert EVT to Type * as needed by calling lowering anyway.	2024-12-03 22:06:55 -08:00
Craig Topper	caa8aa551b	[SelectionDAG] Rename CallOptions::IsSExt to IsSigned. NFC (#118574 ) This is eventually passed to shouldSignExtendTypeInLibCall which calls it IsSigned.	2024-12-03 18:25:44 -08:00
Félix-Antoine Constantin	7a56dc7245	[Clang] Attribute NoFPClass should not prevent tail call optimization. (#116741 ) Fixes #111950	2024-11-22 17:28:45 -08:00
Simon Pilgrim	51809e4a26	[DAG] SimplifyDemandedVectorElts - add SimplifyMultipleUse handling to SEXT/ZEXT/TRUNC nodes (#116227 ) Allows us to bypass multiple uses of a SEXT/ZEXT/TRUNC node operand	2024-11-16 12:40:42 +00:00
Sam Elliott	862f42eedf	[TargetLowering] Use Correct VT for Multi-out Asm (#116024 ) This was overlooked in 7d940432c46be83b8fcb5dbefee439585fa820cd - when inline assembly has multiple outputs, they are returned as members of a struct, and the `getAsmOperandType` needs to be called for each member of struct. The difference between this and the single-output case is that in the latter, there isn't a struct wrapping the outputs. I noticed this when trying to use the same mechanism in the RISC-V backend. Committing two tests: - One that shows a crash before this change, which is fixed by this change. - One (commented out) that shows a different crash with tied inputs/outputs. This is commented as it is not fixed by this change and needs more work in target-independent inline asm handling code.	2024-11-14 12:31:31 +00:00
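
Two of the integer folds listed above lend themselves to a brute-force check. The sketch below is not taken from the LLVM sources: it exhaustively tests 8-bit values for the `(a | b) == b` fold from dcea5f1f38 and for the funnel-shift fold from ef40ae4f4e. The helper names (`fshl8`, `orAndNotFoldHolds`, `fshlFoldHolds`) are hypothetical, and 8 bits is assumed as a stand-in for an arbitrary bit width; the Alive2 links in the commit messages remain the actual proofs.

```cpp
#include <cstdint>

// Hypothetical 8-bit model of a funnel shift left: concatenate hi:lo,
// shift left by (c % 8), and keep the high byte.
static uint8_t fshl8(uint8_t hi, uint8_t lo, unsigned c) {
  c %= 8;
  if (c == 0)
    return hi; // Avoids the out-of-range shift lo >> 8.
  return static_cast<uint8_t>((hi << c) | (lo >> (8 - c)));
}

// (a | b) == b  <=>  (a & ~b) == 0, checked for every pair of 8-bit values.
static bool orAndNotFoldHolds() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b) {
      bool before = ((a | b) == b);
      bool after = ((a & ~b) == 0);
      if (before != after)
        return false;
    }
  return true;
}

// fshl(x, x | y, c) == 0  <=>  ((y >> (8 - c)) | x) == 0, checked only for
// shift amounts with c % 8 != 0 (c in 1..7 here).
static bool fshlFoldHolds() {
  for (unsigned c = 1; c < 8; ++c)
    for (unsigned x = 0; x < 256; ++x)
      for (unsigned y = 0; y < 256; ++y) {
        uint8_t lhs = fshl8(static_cast<uint8_t>(x),
                            static_cast<uint8_t>(x | y), c);
        uint8_t rhs = static_cast<uint8_t>((y >> (8 - c)) | x);
        if ((lhs == 0) != (rhs == 0))
          return false;
      }
  return true;
}
```

Both functions are expected to return true. With c == 0 the rewritten form would need a shift by the full bit width, which is consistent with the commit restricting the fold to (C%Bitwidth) != 0.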


1586 Commits