1623 Commits

Author SHA1 Message Date
Matt Arsenault
a757c4e74e
CodeGen: Add subtarget to TargetLoweringBase constructor (#168620)
Currently LibcallLoweringInfo is defined inside of TargetLowering,
which is owned by the subtarget. Pass in the subtarget so we can
construct LibcallLoweringInfo with the subtarget. This is a temporary
step that should be revertable in the future, after LibcallLoweringInfo
is moved out of TargetLowering.
2025-11-19 19:18:13 +00:00
Matt Arsenault
c5aace4236
DAG: Move expandMultipleResultFPLibCall to TargetLowering (NFC) (#166988)
This kind of helper is higher level and not general enough to go
directly in SelectionDAG. Most similar utilities are in TargetLowering.
2025-11-12 03:50:33 +00:00
Damian Heaton
70f4b596cf
Add llvm.vector.partial.reduce.fadd intrinsic (#159776)
With this intrinsic, and supporting SelectionDAG nodes, we can better
make use of instructions such as AArch64's `FDOT`.
2025-11-07 15:36:54 +00:00
Fabian Ritter
8ea447b4c4
[SDAG] Set InBounds when when computing offsets into memory objects (#165425)
When a load or store accesses N bytes starting from a pointer P, and we want to
compute an offset pointer within these N bytes after P, we know that the
arithmetic to add the offset must be inbounds. This is for example relevant
when legalizing too-wide memory accesses, when lowering memcpy&Co., or when
optimizing "vector-load -> extractelement" into an offset load.

For SWDEV-516125.
2025-10-31 11:27:55 +01:00
AZero13
5d0f1591f8
[DAGCombine] Improve bswap lowering for machines that support bit rotates (#164848)
Source: Hacker's delight.
2025-10-25 10:17:15 -07:00
Sam Parker
1820102167
Wasm fmuladd relaxed (#163177)
Reland #161355, after fixing up the cross-projects-tests for the wasm
simd intrinsics.

Original commit message:
Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions.
If we have FP16, then lower v8f16 fmuladds to FMA.

I've introduced an ISD node for fmuladd to maintain the rounding
ambiguity through legalization / combine / isel.
2025-10-13 16:50:53 +01:00
Sam Parker
30d3441cf0
Revert "[WebAssembly] Lower fmuladd to madd and nmadd" (#163171)
Reverts llvm/llvm-project#161355

Looks like I've broken some intrinsic code generation.
2025-10-13 11:53:40 +01:00
Sam Parker
a4eb7ea225
[WebAssembly] Lower fmuladd to madd and nmadd (#161355)
Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions.
If we have FP16, then lower v8f16 fmuladds to FMA.

I've introduced an ISD node for fmuladd to maintain the rounding
ambiguity through legalization / combine / isel.
2025-10-13 10:36:08 +01:00
paperchalice
b0a755b2bf
[TargetLowering] Remove NoSignedZerosFPMath uses (#160975)
Remove NoSignedZerosFPMath in TargetLowering part, users should always
use instruction level fast math flags.
2025-09-29 14:33:56 +08:00
Lewis Crawford
a27baf9c96
[SelectionDAG] Improve v2f16 maximumnum expansion (#160723)
On targets where f32 maximumnum is legal, but maximumnum on vectors of
smaller types is not legal (e.g. v2f16), try unrolling the vector first
as part of the expansion.

Only fall back to expanding the full maximumnum computation into
compares + selects if maximumnum on the scalar element type cannot be
supported.
2025-09-26 11:37:29 +01:00
AZero13
151a80bbce
[TargetLowering][ExpandABD] Prefer selects over usubo if we do the same for ucmp (#159889)
Same deal we use for determining ucmp vs scmp.

Using selects on platforms that like selects is better than using usubo.

Rename function to be more general fitting this new description.
2025-09-25 10:33:05 +09:00
Craig Topper
ef1372af43
[KnownBits] Add setAllConflict to set all bits in Zero and One. NFC (#159815)
This is a common pattern to initialize Knownbits that occurs before
loops that call intersectWith.
2025-09-19 13:15:54 -07:00
Fabian Ritter
a2dcc88f39
[AMDGPU][SDAG] Handle ISD::PTRADD in various special cases (#145330)
There are more places in SIISelLowering.cpp and AMDGPUISelDAGToDAG.cpp
that check for ISD::ADD in a pointer context, but as far as I can tell
those are only relevant for 32-bit pointer arithmetic (like frame
indices/scratch addresses and LDS), for which we don't enable PTRADD
generation yet.

For SWDEV-516125.
2025-09-19 10:19:38 +02:00
Björn Pettersson
1c4c7bd808
[SelectionDAG] Deal with POISON for INSERT_VECTOR_ELT/INSERT_SUBVECTOR (#143102)
As reported in https://github.com/llvm/llvm-project/issues/141034
SelectionDAG::getNode had some unexpected
behaviors when trying to create vectors with UNDEF elements. Since
we treat both UNDEF and POISON as undefined (when using isUndef())
we can't just fold away INSERT_VECTOR_ELT/INSERT_SUBVECTOR based on
isUndef(), as that could make the resulting vector more poisonous.

Same kind of bug existed in DAGCombiner::visitINSERT_SUBVECTOR.

Here are some examples:

This fold was done even if vec[idx] was POISON:
  INSERT_VECTOR_ELT vec, UNDEF, idx -> vec

This fold was done even if any of vec[idx..idx+size] was POISON:
  INSERT_SUBVECTOR vec, UNDEF, idx -> vec

This fold was done even if the elements not extracted from vec could
be POISON:
  sub = EXTRACT_SUBVECTOR vec, idx
  INSERT_SUBVECTOR UNDEF, sub, idx -> vec

With this patch we avoid such folds unless we can prove that the
result isn't more poisonous when eliminating the insert.

Fixes https://github.com/llvm/llvm-project/issues/141034
2025-09-17 21:04:00 +00:00
Björn Pettersson
593f24cac6
[SelectionDAG] Clean up SCALAR_TO_VECTOR handling in SimplifyDemandedVectorElts (#157027)
This patch reverts changes from commit 585e65d3307f5f0
(https://reviews.llvm.org/D104250), as it doesn't seem to be needed
nowadays.

The removed code was doing a recursive call to
SimplifyDemandedVectorElts trying to simplify the vector %vec when
finding things like
  (SCALAR_TO_VECTOR (EXTRACT_VECTOR_ELT %vec, 0))

I figure that (EXTRACT_VECTOR_ELT %vec, 0) would be simplified based on
only demanding element zero regardless of being used in a
SCALAR_TO_VECTOR operation or not.

It had been different if the code tried to simplify the whole expression
as %vec. That could also have motivate why to make element zero a
special case. But it only simplified %vec without folding away the
SCALAR_TO_VECTOR.
2025-09-05 15:08:49 +02:00
Craig Topper
c65d6cb0a1
[SelectionDAG] Return std::optional<unsigned> from getValidShiftAmount and friends. NFC (#156224)
Instead of std::optional<uint64_t>. Shift amounts must be less than or
equal to our maximum supported bit widths which fit in unsigned. Most of
the callers already assumed it fit in unsigned.
2025-08-31 11:29:07 -07:00
AZero13
2b4fff6521
[TargetLowering] Only freeze LHS and RHS if they are used multiple times in expandABD (#156193)
Not all paths in expandABD are using LHS and RHS twice.
2025-08-31 10:30:29 +00:00
Craig Topper
cec0d5b217
[ValueTracking][SelectionDAG] Use KnownBits::reverseBits/byteSwap. NFC (#155847) 2025-08-28 13:26:32 -07:00
Craig Topper
9472225fa6
[KnownBits] Add operator<<=(unsigned) and operator>>=(unsigned). NFC (#155751)
Add operators to shift left or right and insert unknown bits.
2025-08-28 10:08:28 -07:00
Nikita Popov
238c3dcd0d
[CodeGen][Mips] Remove fp128 libcall list (#153798)
Mips requires fp128 args/returns to be passed differently than i128. It
handles this by inspecting the pre-legalization type. However, for soft
float libcalls, the original type is currently not provided (it will
look like a i128 call). To work around that, MIPS maintains a list of
libcalls working on fp128.

This patch removes that list by providing the original, pre-softening
type to calling convention lowering. This is done by carrying additional
information in CallLoweringInfo, as we unfortunately do need both types
(we want the un-softened type for OrigTy, but we need the softened type
for the actual register assignment etc.)

This is in preparation for completely removing all the custom
pre-analysis code in the Mips backend and replacing it with use of
OrigTy.
2025-08-18 09:22:41 +02:00
Nikita Popov
01bc742185
[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817)
This ensures that the required fields are set, and also makes the
construction more convenient.
2025-08-15 18:06:07 +02:00
Nikita Popov
d1952baa5d [CodeGen] Remove unnecessary setTypeListBeforeSoften() parameter (NFC)
It does not make sense to set the softening type list without
setting IsSoften=true.
2025-08-14 10:04:56 +02:00
Yingwei Zheng
62735d26b1
[DAGCombine] Correctly extend the constant RHS in TargetLowering::SimplifySetCC (#152862)
In https://github.com/llvm/llvm-project/pull/150270, when the predicate
is eq/ne and the trunc has only an nsw flag, the RHS is incorrectly
zero-extended.

Closes https://github.com/llvm/llvm-project/issues/152630.
2025-08-10 01:24:37 +08:00
Alex MacLean
d27802a217
[DAGCombiner] Fold setcc of trunc, generalizing some NVPTX isel logic (#150270)
That change adds support for folding a SETCC when one or both of the
operands is a TRUNCATE with the appropriate no-wrap flags. This pattern
can occur when promoting i8 operations in NVPTX, and we currently have
some ISel rules to try to handle it.
2025-08-05 19:20:17 -07:00
Simon Pilgrim
d561259a08
[DAG] visitFREEZE - replace multiple frozen/unfrozen uses of an SDValue with just the frozen node (#150017)
Similar to InstCombinerImpl::freezeOtherUses, attempt to ensure that we
merge multiple frozen/unfrozen uses of a SDValue. This fixes a number of
hasOneUse() problems when trying to push FREEZE nodes through the DAG.

Remove SimplifyMultipleUseDemandedBits handling of FREEZE nodes as we
now want to keep the common node, and not bypass for some nodes just
because of DemandedElts.

Fixes #149799
2025-08-05 09:24:09 +01:00
Craig Topper
a3a8e1c064
[TargetLowering][RISCV] Use sra for (X & -256) == 256 -> (X >> 8) == 1 if it yields a better icmp constant. (#151762)
If using srl does not produce a legal constant for the RHS of the
final compare, try to use sra instead.
    
Because the AND constant is negative, the sign bits participate in the
compare. Using an arithmetic shift right duplicates that bit.
2025-08-04 09:00:41 -07:00
Craig Topper
f952a84f2f [TargetLowering] Use getShiftAmountConstant in buildSDIVPow2WithCMov. 2025-08-02 10:50:46 -07:00
Craig Topper
eddd34227e [TargetLowering] Use getShiftAmountConstant in CTTZTableLookup. NFC 2025-07-29 22:43:42 -07:00
Simon Pilgrim
c710d460a5
[DAG] expandVECTOR_COMPRESS - remove superfluous getFreeze. NFC. (#150062)
freeze(freeze(extract_vector_elt(x,i))) -> freeze(extract_vector_elt(x,i))
2025-07-22 18:37:12 +01:00
Craig Topper
8d549cf036
[SelectionDAG] Pass SDNodeFlags through getNode instead of setFlags. (#149852)
getNode updates flags correctly for CSE. Calling setFlags after getNode
may set the flags where they don't apply.

I've added a Flags argument to getSelectCC and the signature of getNode that takes
an ArrayRef of EVTs.
2025-07-22 08:06:30 -07:00
Simon Pilgrim
4b0625f051 [DAG] isNonZeroModBitWidthOrUndef - fix bugprone-argument-comment analyzer warning. NFC.
matchUnaryPredicate argument is AllowUndefs not AllowUndef
2025-07-22 10:36:59 +01:00
Simon Pilgrim
17c7c2ebe8
[DAG] Add missing Depth argument to isGuaranteedNotToBeUndefOrPoison calls inside SimplifyDemanded methods (#149550)
Ensure we don't exceed the maximum recursion depth
2025-07-20 13:06:55 +01:00
Fraser Cormack
a516c60ec3
[NFC] Correct typo: invertion -> inversion (#147995) 2025-07-11 07:37:25 +01:00
Boyao Wang
697beb3f17
[TargetLowering] Change getOptimalMemOpType and findOptimalMemOpLowering to take LLVM Context (#147664)
Add LLVM Context to getOptimalMemOpType and findOptimalMemOpLowering. So
that we can use EVT::getVectorVT to generate EVT type in
getOptimalMemOpType.

Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
2025-07-10 11:11:09 +08:00
Matt Arsenault
dc69b00b0a
RuntimeLibcalls: Remove table of soft float compare cond codes (#146082)
Previously we had a table of entries for every Libcall for
the comparison to use against an integer 0 if it was a soft
float compare function. This was only relevant to a handful of
opcodes, so it was wasteful. Now that we can distinguish the
abstract libcall for the compare with the concrete implementation,
we can just directly hardcode the comparison against the libcall
impl without this configuration system.
2025-07-09 17:13:58 +09:00
Matt Arsenault
d8ef156379
DAG: Remove verifyReturnAddressArgumentIsConstant (#147240)
The intrinsic argument is already marked with immarg so non-constant
values are rejected by the IR verifier.
2025-07-07 16:28:47 +09:00
AZero13
91cc33f321
[TargetLowering] hasAndNotCompare should be checking for X, not Y (#146935)
Y is the one being bitwise-not, so it should not be passed, as the other
one should be passed instead.
2025-07-07 14:54:29 +09:00
AZero13
dcea5f1f38
[TargetLowering] Fold (a | b) ==/!= b -> (a & ~b) ==/!= 0 when and-not exists (#145368)
This is especially helpful for AArch64, which simplifies ands + cmp to tst.
Alive2: https://alive2.llvm.org/ce/z/LLgcJJ

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-06-27 14:47:52 +01:00
Matt Arsenault
7255c3aee3
DAG: Check libcall function is supported before emission (#144314) 2025-06-27 18:09:04 +09:00
Björn Pettersson
fd3cc204de
[SelectionDAG] Fold undemanded operand to UNDEF for VECTOR_SHUFFLE (#145524)
Always let SimplifyDemandedVectorElts fold either side of a
VECTOR_SHUFFLE to UNDEF if no elements are demanded from that side.

For a single use this could be done by SimplifyDemandedVectorElts
already, but in case the operand had multiple uses we did not eliminate
the use.
2025-06-25 16:05:54 +02:00
Iris Shi
f2eb5d416e
[SelectionDAG] Handle fneg/fabs/fcopysign in SimplifyDemandedBits (#139239) 2025-06-22 22:48:59 +08:00
Paul Walker
68732ce8e0
[LLVM][CodeGen][SVE] Add isel for bfloat unordered reductions. (#143540)
The omissions are VECREDUCE_SEQ_* and MUL. The former goes down a
different code path and the latter is unsupported across all element types.
2025-06-20 11:46:25 +01:00
Matt Arsenault
97bfb936af
DAG: Move soft float predicate management into RuntimeLibcalls (#142905)
Work towards making RuntimeLibcalls the centralized location for
all libcall information. This requires changing the encoding from
tracking the ISD::CondCode to using CmpInst::Predicate.
2025-06-17 09:42:53 +09:00
Matt Arsenault
505c550e4c
DAG: Assert fcmp uno runtime calls are boolean values (#142898)
This saves 2 instructions in the ARM soft float case for fcmp ueq.

This code is written in an confusingly overly general way. The point
of getCmpLibcallCC is to express that the compiler-rt implementations
of the FP compares are different aliases around functions which may
return -1 in some cases. This does not apply to the call for unordered,
which returns a normal boolean.

Also stop overriding the default value for the unordered compare for ARM.
This was setting it to the same value as the default, which is now assumed.
2025-06-10 10:46:29 +09:00
Philip Reames
939666380f
[SDAG] Add partial_reduce_sumla node (#141267)
We have recently added the partial_reduce_smla and partial_reduce_umla
nodes to represent Acc += ext(b) * ext(b) where the two extends have to
have the same source type, and have the same extend kind.

For riscv64 w/zvqdotq, we have the vqdot and vqdotu instructions which
correspond to the existing nodes, but we also have vqdotsu which
represents the case where the two extends are sign and zero respective
(i.e. not the same type of extend).

This patch adds a partial_reduce_sumla node which has sign extension for
A, and zero extension for B. The addition is somewhat mechanical.
2025-06-09 07:17:45 -07:00
Nikita Popov
d74831efeb Revert "[SDAG] Fix fmaximum legalization errors (#142170)"
This reverts commit 58cc1675ec7b4aa5bc2dab56180cb7af1b23ade5.

I also made the incorrect assumption that we know both values are
+/-0.0 here as well. Revert for now.
2025-06-04 14:35:30 +02:00
Nikita Popov
42605b8aa3 Revert "[SelectionDAG] Avoid one comparison when legalizing fmaximum (#142732)"
This reverts commit 54da543a14da6dd0e594875241494949cb659b08.

I made a logic error here with the assumption that both values
are known to be +/-0.0.
2025-06-04 14:22:19 +02:00
Nikita Popov
54da543a14
[SelectionDAG] Avoid one comparison when legalizing fmaximum (#142732)
When ordering signed zero, only check the sign of one of the values. We
already know at this point that both values must be +/-0.0, so it is
sufficient to check one of them to correctly order them.

For example, for fmaximum, if we know LHS is `+0.0` then we can always
select LHS, value of RHS does not matter. If LHS is `-0.0` we can always
select RHS, value of RHS doesn't matter.
2025-06-04 10:41:30 +02:00
YunQiang Su
bd831372b2
expandFMINIMUMNUM_FMAXIMUMNUM: Quiet is not needed for NaN vs NaN (#139237)
New LangRef doesn't requires quieting for NaN vs NaN, aka the result may
be sNaN for sNaN vs NaN.
See: https://github.com/llvm/llvm-project/pull/139228
2025-06-04 08:20:48 +08:00
Nikita Popov
58cc1675ec
[SDAG] Fix fmaximum legalization errors (#142170)
FMAXIMUM is currently legalized via IS_FPCLASS for the signed zero
handling. This is problematic, because it assumes the equivalent integer
type is legal. Many targets have legal fp128, but illegal i128, so this
results in legalization failures.

Fix this by replacing IS_FPCLASS with checking the bitcast to integer
instead. In that case it is sufficient to use any legal integer type, as
we're just interested in the sign bit. This can be obtained via a stack
temporary cast. There is existing FloatSignAsInt functionality used for
legalization of FABS and similar we can use for this purpose.

Fixes https://github.com/llvm/llvm-project/issues/139380.
Fixes https://github.com/llvm/llvm-project/issues/139381.
Fixes https://github.com/llvm/llvm-project/issues/140445.
2025-06-02 10:14:33 +02:00