1573 Commits

Author SHA1 Message Date
Tim Gymnich
760bf4f116
[GISel] Add KnownFPClass Analysis to GISelValueTrackingPass (#134611)
- add KnownFPClass analysis to GISelValueTrackingPass
- add MI pattern for `m_GIsFPClass`
2025-05-23 14:38:51 +02:00
Craig Topper
ee4002da2b [TargetLowering] Use getExtractSubvector/getExtractVectorElt. NFC 2025-05-21 12:06:54 -07:00
Liam Semeria
d067014f13
[APInt] Added APInt::clearBits() method (#137098)
Added APInt::clearBits(unsigned loBit, unsigned hiBit) that clears bits within a certain range.

Fixes #136550

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-05-19 12:41:04 +01:00
Craig Topper
dcd62f3674
[SelectionDAG] Rename MemSDNode::getOriginalAlign to getBaseAlign. NFC (#139930)
This matches the underlying function in MachineMemOperand and how it is
printed when BaseAlign differs from Align.
2025-05-16 09:37:02 -07:00
Kazu Hirata
18ecff4f65
[llvm] Use llvm::stable_sort (NFC) (#140067) 2025-05-15 12:18:18 -07:00
Matt Arsenault
2f9323bc5b
DAG: Stop forcibly adding nsz to expanded minnum/maxnum (#139615) 2025-05-13 07:37:21 +02:00
Rux124
ef40ae4f4e
[SelectionDAG] Fix incorrect fold condition in foldSetCCWithFunnelShift. (#137637)
Proposed by
[2ed1598](2ed15984b4):

`fshl X, (or X, Y), C ==/!= 0 --> or (srl Y, BW-C), X ==/!= 0`

This transformation is valid when (C%Bitwidth) != 0 , as verified by
[Alive2](https://alive2.llvm.org/ce/z/TQYM-m).

Fixes #136746
2025-05-12 13:25:07 +08:00
Kazu Hirata
c51a3aa6ce
[llvm] Remove unused local variables (NFC) (#138467) 2025-05-04 13:05:18 -07:00
Kazu Hirata
47f391fd0e
[CodeGen] Remove unused local variables (NFC) (#138441) 2025-05-04 00:26:37 -07:00
Simon Pilgrim
a99e055030
[DAG] shouldReduceLoadWidth - add optional<unsigned> byte offset argument (#136723)
Based off feedback for #129695 - we need to be able to determine the
load offset of smaller loads when trying to determine whether a multiple
use load should be split (in particular for AVX subvector extractions).

This patch adds a std::optional<unsigned> ByteOffset argument to
shouldReduceLoadWidth calls for where we know the constant offset to
allow targets to make use of it in future patches.
2025-04-23 12:30:27 +01:00
Sergei Barannikov
11a3de7e98
[SDag][ARM][RISCV] Allow lowering CTPOP into a libcall (#101786)
This is a reland of #99752 with the bug fixed (see test diff in the
third commit in this PR).
All `popcount` libcalls return `int`, but `ISD::CTPOP` returns the type
of the argument, which can be wider than `int`. The fix is to make DAG
legalizer pass the correct return type to `makeLibCall` and sign-extend
the result afterwards.

Original commit message:
The main change is adding CTPOP to `RuntimeLibcalls.def` to allow
targets to use LibCall action for CTPOP. DAG legalizers are changed
accordingly.

Pull Request: https://github.com/llvm/llvm-project/pull/101786
2025-04-23 12:43:05 +03:00
Simon Pilgrim
64ffecfc43
[DAG] isKnownNeverNaN - add DemandedElts element mask to isKnownNeverNaN calls (#135952)
Matches what we've done for computeKnownBits etc. to improve vector handling
2025-04-18 09:24:02 +01:00
Reid Kleckner
2538c607e9
[CodeGen] Prune headers and move code out of line for build efficiency, NFC (#135622)
I noticed these destructors taking time with -ftime-trace and moved some
of them for minor build efficiency improvements.

The main impact of moving destructors out of line is that it avoids
requiring container fields containing other types from being complete,
i.e. one can have uptr<T> or vector<T> as a field with an incomplete
type T, and that means we can reduce transitive includes, as with
LegalizerInfo.h.

Move expensive getDebugOperandsForReg template out-of-line. The
std::function instantiation shows up in time trace even if you don't use
the function.
2025-04-14 22:23:18 -07:00
Jay Foad
344a491dad
[CodeGen] Simplify expandRoundInexactToOdd (#134988)
FP_ROUND and FP_EXTEND the input value before FABSing it. This avoids
some bit twiddling to copy the sign bit from the input to the result. It
does introduce one extra FABS, but that is folded into another
instruction for free on AMDGPU, which is the only target currently
affected by this change.
2025-04-10 09:45:38 +01:00
David Green
6c27817294
[SelectionDAG] Use SimplifyDemandedBits from SimplifyDemandedVectorElts Bitcast. (#133717)
This adds a call to SimplifyDemandedBits from bitcasts with scalar input
types in SimplifyDemandedVectorElts, which can help simplify the input
scalar.
2025-04-03 11:14:08 +01:00
Tim Gymnich
1d0005a69a
[GlobalISel][NFC] Rename GISelKnownBits to GISelValueTracking (#133466)
- rename `GISelKnownBits` to `GISelValueTracking` to analyze more than
just `KnownBits` in the future
2025-03-29 11:51:29 +01:00
Benjamin Maxwell
a5a162cd71
[SDAG] Pass pointer type to libcall expansion for SoftenFloatRes stack slots (#130647)
Solution for:
https://github.com/llvm/llvm-project/pull/129264#issuecomment-2710079843
2025-03-13 10:30:10 +00:00
Fangrui Song
0c5d709301 Move MIPS-specific GPRel32Directive and EK_GPRel32BlockAddress from generic code to Mips/
Follow-up to 60486292b79885b7800b082754153202bef5b1f0
gprel/gprel64 functions can now be moved from MCTargetStreamer
to MipsTargetStreamer.
2025-03-02 15:37:55 -08:00
Matt Arsenault
37c341df28 Revert "AMDGPU: Don't canonicalize fminnum/fmaxnum if targets support IEEE fminimum(maximum)_num (#127711)"
This reverts commit 36eaf0daf5d6dd665d7c7a9ec38ea22f27709fed.

This is not a sound approach to dealing with this instruction change.
The new behavior is a different opcode pair, not a modifier on the
existing opcode.
2025-02-20 10:19:14 +07:00
Changpeng Fang
36eaf0daf5
AMDGPU: Don't canonicalize fminnum/fmaxnum if targets support IEEE fminimum(maximum)_num (#127711)
For targets that support IEEE fminimum_num/fmaximum_num, the
corresponding *_min_num_fXY/*_max_num_fXY instructions themselves
already did the canonicalization for the inputs. As a result, we do not
need to explicitly canonicalize the inputs for fminnum/fmaxnum.
2025-02-19 11:16:43 -08:00
James Chesterman
d4a0848dc6
[SelectionDAG] Add PARTIAL_REDUCE_U/SMLA ISD Nodes (#125207)
Add signed and unsigned PARTIAL_REDUCE_MLA ISD nodes. Add command line
argument (aarch64-enable-partial-reduce-nodes) that indicates whether the
intrinsic experimental_vector_partial_ reduce_add will be transformed
into the new ISD node. Lowering with the new ISD nodes will, for now,
always be done as an expand.
2025-02-18 09:08:47 +00:00
Matt Arsenault
c55a7659b3
DAG: Move scalarizeExtractedVectorLoad to TargetLowering (#122670)
SimplifyDemandedVectorElts should be able to use this on loads
2025-02-04 17:37:12 +07:00
David Green
cae0d67cba
[AArch64][SDAG] Detect non-zeroes in truncating buildvectors in fshl lowering (#123597)
A BUILD_VECTOR can implicity shrink the bits of the operands if the
operand types are not legal. For example a v8i16 constant BUILD_VECTOR
might be represented as v8i16 BUILDVECTOR(i32 1, i32 2, ...).
Unfortunately this means that the constants are not accepted by
matchUnaryPredicateImpl, preventing in this case funnel shifts detecting
that all the operands are non-zero. Add a flag to help it match.
2025-02-03 10:47:45 +00:00
Craig Topper
d839e765f0 [TargetLowering] Inline the only caller of one of the forceExpandWideMUL functions. NFC
This caller does not need the libcall portion so it can directly
call forceExpandMultiply.
2025-01-27 17:10:37 -08:00
Craig Topper
4bcd8184a0
[TargetLowering] Pull similar code out of the forceExpandWideMUL into a helper. NFC (#124371)
These functions have similar code. One of them calculates the 2x width
full product from 2 sources. The other calculates the product from 2
sources that have low and high halves.

This patch introduces a new function that takes HiLHS and HiRHS as
optional values. If they are not null, they will be used in the
calculation of the Hi half. The Signed flag can only be set when
HiLHS/HiRHS are null.
2025-01-25 10:53:01 -08:00
Craig Topper
e30a4fc3e2
[TargetLowering] Improve one signature of forceExpandWideMUL. (#123991)
We have two forceExpandWideMUL functions. One takes the low and high
half of 2 inputs and calculates the low and high half of their product.
This does not calculate the full 2x width product.

The other signature takes 2 inputs and calculates the low and high half
of their full 2x width product. Previously it did this by sign/zero
extending the inputs to create the high bits and then calling the other
function.

We can instead copy the algorithm from the other function and use the
Signed flag to determine whether we should do SRA or SRL. This avoids
the need to multiply the high part of the inputs and add them to the
high half of the result. This improves the generated code for signed
multiplication.

This should improve the performance of #123262. I don't know yet how
close we will get to gcc.
2025-01-23 12:49:35 -08:00
Craig Topper
cdd321462a
[TargetLowering] Use getShiftAmountConstant. NFC (#123802)
Previously we always used the pointer size which might need to be
legalized on some targets.
2025-01-21 12:05:52 -08:00
Graham Hunter
d9f165ddea
[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810)
Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder.

The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.
2025-01-20 12:57:05 +00:00
Craig Topper
e2449f1bce
[SelectionDAG] Use SDNode::op_iterator instead of SDNodeIterator. NFC (#122147)
I think SDNodeIterator primarily exists because GraphTraits requires an
iterator that dereferences to SDNode*. op_iterator dereferences to
SDUse* which is implicitly convertible to SDValue.

This piece of code can use SDValue instead of SDNode* so we should
prefer to use the the more common op_iterator.
2025-01-09 09:09:55 -08:00
Simon Pilgrim
112793a90e [DAG] expandUINT_TO_FP - use getShiftAmountConstant helper. NFC.
Don't bother with separate getShiftAmountTy/getConstant calls.
2025-01-06 18:49:50 +00:00
Alex MacLean
6018820c48
[NVPTX] Fix lowering of i1 SETCC (#115035)
Add DAG legalization support for expanding i1 SETCC nodes using
appropriate logical operations to simulate integer comparisons. Use
these expansions to handle i1 SETCC in NVPTX.

fixes #58428 and #57405
2024-12-05 12:54:24 -08:00
Simon Pilgrim
b1a48af56a
[DAG] SimplifyDemandedVectorElts - add handling for INT<->FP conversions (#117884) 2024-12-04 07:37:01 +00:00
Craig Topper
b076fbb844
[TargetLowering] Use Type* instead of EVT in shouldSignExtendTypeInLibCall. (#118587)
I want to use this function for GISel too so Type * is a better common
interface. All of the callers already convert EVT to Type * as needed
by calling lowering anyway.
2024-12-03 22:06:55 -08:00
Craig Topper
caa8aa551b
[SelectionDAG] Rename CallOptions::IsSExt to IsSigned. NFC (#118574)
This is eventually passed to shouldSignExtendTypeInLibCall which calls
it IsSigned.
2024-12-03 18:25:44 -08:00
Félix-Antoine Constantin
7a56dc7245
[Clang] Attribute NoFPClass should not prevent tail call optimization. (#116741)
Fixes #111950
2024-11-22 17:28:45 -08:00
Simon Pilgrim
51809e4a26
[DAG] SimplifyDemandedVectorElts - add SimplifyMultipleUse handling to SEXT/ZEXT/TRUNC nodes (#116227)
Allows us to bypass multiple uses of a SEXT/ZEXT/TRUNC node operand
2024-11-16 12:40:42 +00:00
Sam Elliott
862f42eedf
[TargetLowering] Use Correct VT for Multi-out Asm (#116024)
This was overlooked in 7d940432c46be83b8fcb5dbefee439585fa820cd - when
inline assembly has multiple outputs, they are returned as members of a
struct, and the `getAsmOperandType` needs to be called for each member
of struct. The difference between this and the single-output case is
that in the latter, there isn't a struct wrapping the outputs.

I noticed this when trying to use the same mechanism in the RISC-V
backend.

Committing two tests:
- One that shows a crash before this change, which is fixed by this
change.
- One (commented out) that shows a different crash with tied
inputs/outputs. This is commented as it is not fixed by this change and
needs more work in target-independent inline asm handling code.
2024-11-14 12:31:31 +00:00
Alex MacLean
7a8fe0f83c
[SelectionDAG] Fixup type usage of CondCodeAction table (#116082)
Ensure that all uses of CondCodeAction table are checking the compared
types, not the produced type. This is a prerequisite to landing #115035
2024-11-13 13:20:16 -08:00
Yingwei Zheng
f74aed7938
[DAGCombiner] Add basic support for trunc nsw/nuw (#113808)
This patch adds basic support for `trunc nsw/nuw` in SDAG. It will allow
DAGCombiner to further eliminate in-reg `zext/sext` instructions.
2024-11-07 00:23:53 +08:00
Simon Pilgrim
9540a7ae82
[DAG] SimplifyMultipleUseDemandedBits - bypass ADD nodes if either operand is zero (#112588)
The dpbusd_const.ll test change is due to us losing the expanded add reduction pattern as one of the elements is known to be zero (removing one of the adds from the reduction pyramid). I don't think its of concern.

Noticed while working on #107423
2024-11-05 17:20:41 +00:00
Simon Pilgrim
0ac2e42227
[DAG] SimplifyDemandedBits - ignore SRL node if we're just demanding known sign bits (#114805)
Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node.

We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits.

Same fold as #114389 which added this for SimplifyMultipleUseDemandedBits
2024-11-04 17:27:45 +00:00
Sander de Smalen
3098200fcc
[ISel] Propagate disjoint flag in ShrinkDemandedOp (#114560)
When trying to evaluate an expression in a narrower type, the
DAGCombine should propagate the disjoint flag, as it's equally
valid on the narrower expression.

This helps improve better use of addressing modes for some
Arm SME instructions, for example.
2024-11-03 19:42:04 +00:00
Kazu Hirata
6927a434ba
[SelectionDAG] Remove unused includes (NFC) (#114697)
Identified with misc-include-cleaner.
2024-11-03 07:03:33 -08:00
Simon Pilgrim
9fb4bc5bf4
[DAG] SimplifyMultipleUseDemandedBits - ignore SRL node if we're just demanding known sign bits (#114389)
Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node.

We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits.
2024-10-31 16:40:29 +00:00
Kazu Hirata
f582cd3dc7 [SelectionDAG] Fix a warning
This patch fixes:

  llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:1489:17: error:
  unused variable 'Flags' [-Werror,-Wunused-variable]
2024-10-30 17:49:51 -07:00
Yingwei Zheng
cf9d1c1486
[SDAG] Simplify SDNodeFlags with bitwise logic (#114061)
This patch allows using enumeration values directly and simplifies the
implementation with bitwise logic. It addresses the comment in
https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.
2024-10-31 08:10:07 +08:00
Matt Arsenault
067e8b8dc5
DAG: Lower fcNormal is.fpclass to compare with inf (#100389) 2024-10-17 15:49:13 +04:00
Nikita Popov
255a99c29f
[APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309)
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.

The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.

Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
2024-10-17 08:48:08 +02:00
Michael Marjieh
b5600c6f85
[TargetLowering][SelectionDAG] Exploit nneg Flag in UINT_TO_FP (#108931)
1. Propagate the nneg flag in WidenVecRes
2. Use SINT_TO_FP in expandUINT_TO_FP when possible.
2024-10-14 20:55:48 +04:00
YunQiang Su
d52c8408ff
SelectionDAG/expandFMINNUM_FMAXNUM: skips vector if SETCC/VSELECT is not legal (#109570)
If SETCC or VSELECT is not legal for vector, we should not expand it,
instead we can split the vectors.

So that, some simple scale instructions can be emitted instead of
some pairs of comparation+selection.
2024-10-10 08:39:25 +08:00