1451 Commits

Author SHA1 Message Date
Jay Foad
1b761205f2
[APInt] Add a simpler overload of multiplicativeInverse (#87610)
The current APInt::multiplicativeInverse takes a modulus which can be
any value, but all in-tree callers use a power of two. Moreover, most
callers want to use two to the power of the width of an existing APInt,
which is awkward because 2^N is not representable as an N-bit APInt.

Add a new overload of multiplicativeInverse which implicitly uses
2^BitWidth as the modulus.
2024-04-04 16:11:06 +01:00
Simon Pilgrim
2d0087424f
[DAG] Remove extract_vector_elt(freeze(x)), idx -> freeze(extract_vector_elt(x), idx) fold (#87480)
Reverse the fold with handling inside canCreateUndefOrPoison for cases where we know that the extract index is in bounds.

This exposed a number or regressions, and required some initial freeze handling of SCALAR_TO_VECTOR, which will require us to properly improve demandedelts support to handle its undef upper elements.

There is still one outstanding regression to be addressed in the future - how do we want to handle folds involving frozen loads?

Fixes #86968
2024-04-04 11:10:55 +01:00
aniplcc
d650fcd6bf
[DAG] SimplifyDemandedVectorElts - add ISD::AVGCEILS/AVGCEILU/AVGFLOORS/AVGFLOORU nodes (#86284)
Fixes #84768
2024-04-03 15:00:50 +01:00
Wang Pengcheng
610b9e23c5
[SDAG] Use shifts if ISD::MUL is illegal when lowering ISD::CTPOP (#86505)
We can avoid libcalls.

Fixes #86205
2024-03-29 15:38:39 +08:00
Owen Anderson
7c9b5228da
Only check assertions that were meant to apply to the normal case of non-splat vector SREM expansion when we aren't hitting the special case. (#86238)
Fixes https://github.com/llvm/llvm-project/issues/84830
Introduced in https://github.com/llvm/llvm-project/pull/82706
2024-03-23 21:49:29 -05:00
Simon Pilgrim
e4fa2e3562
[DAG] isGuaranteedNotToBeUndefOrPoisonForTargetNode - add fallback implementation (#86125)
Allow targets to rely on TargetLowering::isGuaranteedNotToBeUndefOrPoisonForTargetNode to test nodes for canCreateUndefOrPoisonForTargetNode + all arguments are isGuaranteedNotToBeUndefOrPoison.

Targets can still perform this themselves for specific special case nodes (e.g. target shuffles).

Matches the fallback in SelectionDAG::isGuaranteedNotToBeUndefOrPoison
2024-03-21 15:11:59 +00:00
Benjamin Kramer
5f5a64134b Revert "[DAGCombiner] Simplifying {si|ui}tofp when only signbit is needed"
This reverts commit 353fbeb0a294d2c7cef6d88607fa0fd50ee81462. It crashes
when it encounters an UINT_TO_FP.

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1618 in SDValue llvm::SelectionDAG::getConstant(const ConstantInt &, const SDLoc &, EVT, bool, bool): VT.isInteger() && "Cannot create FP integer constant!"
2024-03-20 15:08:37 +01:00
Noah Goldstein
353fbeb0a2 [DAGCombiner] Simplifying {si|ui}tofp when only signbit is needed
If we only need the signbit `uitofp` simplified to 0, and `sitofp`
simplifies to `bitcast`.

Closes #85138
2024-03-19 17:17:35 -05:00
Craig Topper
23323e2837
[TargetLowering][RISCV] Propagate fastmath flags for the vector operations emitted in expandVecReduce. (#85164)
We used the fastmath flags for any scalar ops created, but not vector.
2024-03-14 08:39:32 -07:00
Arthur Eubanks
94c988bcfd [NFC] Remove unused parameter from shouldAssumeDSOLocal() 2024-03-11 19:48:17 +00:00
Noah Goldstein
61c06775c9 [KnownBits] Add API for nuw flag in computeForAddSub; NFC 2024-03-05 12:59:58 -06:00
Owen Anderson
2c5a68858b
Fix non-splat vector SREM expansion when one of the divisors is a power of two. (#82706)
The expansion previously used, derived from Hacker's Delight,
does not work correctly when the dividend is INT_MIN and the
divisor is a power of two. We now use an alternate derivation
of the A and Q constants specifically for the power-of-two divisor
case to avoid this problem. Credit to Fabian Giesen for the
new derivation.

Fixes https://github.com/llvm/llvm-project/issues/77169
2024-02-25 10:13:05 -05:00
David Majnemer
be36812fb7 [TargetLowering] Be more efficient in fp -> bf16 NaN conversions
We can avoid masking completely as it is OK (and probably preferable) to
bring over some of the existant NaN payload.
2024-02-21 22:47:27 +00:00
David Majnemer
9eff001d3d [TargetLowering] Correctly yield NaN from FP_TO_BF16
We didn't set the exponent field, resulting in tiny numbers instead of
NaNs.
2024-02-21 22:17:02 +00:00
David Majnemer
ddc0f1d8fe [TargetLowering] Actually add the adjustment to the significand
The logic was supposed to be choosing between {0, 1, -1} as an
adjustment to the FP bit pattern. However, the adjustment itself was
used as the bit pattern instead which result in garbage results.
2024-02-21 19:34:11 +00:00
David Majnemer
cc13f3ba45
Correctly round FP -> BF16 when SDAG expands such nodes (#82399)
We did something pretty naive:
- round FP64 -> BF16 by first rounding to FP32
- skip FP32 -> BF16 rounding entirely
- taking the top 16 bits of a FP32 which will turn some NaNs into
infinities

Let's do this in a more principled way by rounding types with more
precision than FP32 to FP32 using round-inexact-to-odd which will negate
double rounding issues.
2024-02-21 12:37:02 -05:00
Craig Topper
d485317357
[TargetLowering] Emit SIGN_EXTEND_INREG instead of shift pair from optimizeSetCCOfSignedTruncationCheck. (#81785)
sext_inreg is our canonical form of shift pair before op legalization so
DAG combiner will probably create it anyway. If it isn't legal
LegalizeDAG will expand to shifts later.
2024-02-15 09:24:02 -08:00
David Green
2e3de997ab [DAG] Generalize setcc(setcc) fold to use known bits.
If we have a `SETCC (SETCC), 0, NE` and ZeroOrOneBooleanContent, we can remove
the outer setcc as it will produce the same value as the inner. This can be
generalized to anything where the top bits are known to be 0, as the value will
remain as 1 or 0.
2024-02-06 12:39:48 +00:00
Craig Topper
f72da9f4fd
[SelectionDAG] Use getShiftAmountConstant to simplify code. NFC (#80561)
Replace calls to getShiftAmountTy+getConstant with
getShiftAmountContant.
2024-02-04 16:05:14 -08:00
Kazu Hirata
39fa304866 [llvm] Use StringRef::starts_with (NFC) 2024-01-31 23:54:07 -08:00
PiJoules
a356e6ccad
[SelectionDAG] Expand fixed point multiplication into libcall (#79352)
32-bit ARMv6 with thumb doesn't support MULHS/MUL_LOHI as legal/custom
nodes during expansion which will cause fixed point multiplication of
_Accum types to fail with fixed point arithmetic. Prior to this, we just
happen to use fixed point multiplication on platforms that happen to
support these MULHS/MUL_LOHI.

This patch attempts to check if the multiplication can be done via
libcalls, which are provided by the arm runtime. These libcall attempts
are made elsewhere, so this patch refactors that libcall logic into its
own functions and the fixed point expansion calls and reuses that logic.
2024-01-30 13:58:55 -08:00
Philip Reames
0fc5f4b524
[DAG] Set nneg flag when forming zext in demanded bits (#72281)
We do the same for the analogous transform in DAGCombine, but this case
was missed in the recent patch which added support for zext nneg.

Sorry for the lack of test coverage. Not sure how to exercise this piece
of logic. It appears to have only minimal impact on LIT tests (only
test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll),
and even then, the changes without it appear uninteresting. Maybe we
should remove this transform instead?
2024-01-18 07:34:08 -08:00
Alex Bradbury
2d54ec36f7
[SelectionDAG] Add and use SDNode::getAsAPIntVal() helper (#77455)
This is the logical equivalent for #76710 for APInt and uses the same
naming scheme.

Converted existing users through:
`git grep -l "cast<ConstantSDNode>\(.*\).*getAPIntValueValue" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getAPIntValue/\1->getAsAPIntVal/'`
2024-01-09 14:27:07 +00:00
Simon Pilgrim
d460c1de3b
[DAG] SimplifyDemandedBits - don't fold sext(x) -> aext(x) if we lose an 0/-1 allsignbits mask (#77296)
For targets that use 0/-1 boolean results, we want to keep this pattern through extensions/truncations as much as possible - so avoid simplifying to any_extend even if we don't demand the upper bits.

Noticed in triage for https://reviews.llvm.org/D152928
2024-01-08 18:01:41 +00:00
Simon Pilgrim
f45b75949d [DAG] SimplifyDemandedBits - call demanded elts variant directly for SELECT/SELECT_CC nodes.
Don't rebuild the demanded elts mask every time.
2024-01-04 10:53:45 +00:00
Simon Pilgrim
72db578d71 [DAG] Fix typo in VSELECT SimplifyDemandedVectorElts handling. NFC.
Rename UndefZero -> UndefSel (undefined elements from Sel operand).
2024-01-04 10:50:42 +00:00
David Green
771fd1ad2a
[DAG] Extend input types if needed in combineShiftToAVG. (#76791)
This atempts to fix #76734 which is a crash in invalid TRUNC nodes types
from unoptimized input code in combineShiftToAVG. The NVT can be VT if
the larger type was legal and the adds will not overflow, in which case
the inputs should be extended.

From what I can tell this appears to be valid (if not optimal for this
case): https://alive2.llvm.org/ce/z/fRieHR

The result has also been changed to getExtOrTrunc in case that VT==NVT,
which is not handled by SEXT/ZEXT.
2024-01-03 10:52:01 +00:00
Craig Topper
bbd57e1832
[SelectionDAG] Add initial plumbing for the disjoint flag. (#76751)
This copies the flag from IR to the SDNode in SelectionDAGBuilder, clears
the flag in SimplifyDemandedBits, and adds it to canCreateUndefOrPoison.

Uses of the flag will come in later patches.
2024-01-02 21:58:00 -08:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize,
and LHS and RHS are both objects of class TypeSize. So this is
evaluating
if the pointer to the function Scalable == the pointer to the function
Scalable,
which is always true because LHS and RHS have the same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`,
so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
2023-11-22 08:52:53 +00:00
Simon Pilgrim
98efa8f9aa [DAG] Fix ShrinkDemandedOp doxygen description to match behaviour. NFC.
ShrinkDemandedOp checks for both isTruncateFree AND isZExtFree but extends with ANY_EXTEND.
2023-11-18 22:44:08 +00:00
Tavian Barnes
75cf672b12
[SDAG] Simplify is-power-of-2 codegen (#72275)
When x is not known to be nonzero, ctpop(x) == 1 is expanded to

    x != 0 && (x & (x - 1)) == 0

resulting in codegen like

    leal    -1(%rdi), %eax
    testl   %eax, %edi
    sete    %cl
    testl   %edi, %edi
    setne   %al
    andb    %cl, %al

But another expression that works is

    (x ^ (x - 1)) > x - 1

which has nicer codegen:

    leal    -1(%rdi), %eax
    xorl    %eax, %edi
    cmpl    %eax, %edi
    seta    %al
2023-11-15 22:26:34 +09:00
Yingwei Zheng
650026897c
[RISCV][SDAG] Prefer ShortForwardBranch to lower sdiv by pow2 (#67364)
This patch lowers `sdiv x, +/-2**k` to `add + select + shift` when the
short forward branch optimization is enabled. The latter inst seq
performs faster than the seq generated by target-independent
DAGCombiner. This algorithm is described in ***Hacker's Delight***.

This patch also removes duplicate logic in the X86 and AArch64 backend.
But we cannot do this for the PowerPC backend since it generates a
special instruction `addze`.
2023-11-10 21:38:47 +08:00
Craig Topper
70b35ec0a8
[SelectionDAG] Add initial support for nneg flag on ISD::ZERO_EXTEND. (#70872)
This adds the nneg flag to SDNodeFlags and the node printing code.
SelectionDAGBuilder will add this flag to the node if the target doesn't
prefer sign extend.

A future RISC-V patch can remove the sign extend preference from
SelectionDAGBuilder.

I've also added the flag to the DAG combine that converts
ISD::SIGN_EXTEND to ISD::ZERO_EXTEND.
2023-11-03 11:15:08 -07:00
Qiu Chaofan
b46e768455
[DAGCombine] Fold setcc_eq infinity into is.fpclass (#67829) 2023-11-01 11:51:15 +09:00
Simon Pilgrim
8d2efd7427 [DAG] Avoid ComputeNumSignBits call when we know the result is unsigned
D146121 needs to set the NSW flag, but given the result is NUW then we know that the result has leading zeros, so we don't need to call ComputeNumSignBits - just reuse the existing KnownBits value instead.
2023-10-29 17:35:24 +00:00
Simon Pilgrim
d96529af3c [DAG] Attempt shl narrowing in SimplifyDemandedBits (REAPPLIED)
If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext.

Followup to D146121

Reapplied - moved after the ShrinkDemandedOp call; reuse the existing KnownBits result; ensure that we only attempt this if all the upper bits are demanded; 547dc461225ba should address the remaining regressions that were noticed in the previous commit.

Differential Revision: https://reviews.llvm.org/D155472
2023-10-29 15:38:46 +00:00
Simon Pilgrim
547dc46122 [DAG] SimplifyDemandedBits - ensure we drop NSW/NUW flags when we simplify a SHL node's input
We already do this for variable shifts, but we missed it for constant shifts

Fixes #69965
2023-10-26 10:34:58 +01:00
Simon Pilgrim
2a40ec2d3e [DAG] SimplifyDemandedBits - fix isOperationLegal typo in D146121
We need to check that the simplified ISD::SRL node is legal, not the old one

Noticed while trying to isolate the regressions in D155472
2023-10-17 17:50:12 +01:00
Kirill Stoimenov
0a776996af Revert "[DAG] Attempt shl narrowing in SimplifyDemandedBits"
This reverts commit 7a8c04ef84ecdab4390b451d4c2fe17bc45a7b63.
2023-10-04 22:15:41 +00:00
Simon Pilgrim
7a8c04ef84 [DAG] Attempt shl narrowing in SimplifyDemandedBits
If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext.

Followup to D146121

Differential Revision: https://reviews.llvm.org/D155472
2023-10-04 10:23:02 +01:00
Nick Desaulniers
e0a48c065b
[InlineAsm] add comments for NumOperands and ConstraintType (#67474)
Splitting up patches for #20571. I found these comments generally useful
to add and not predicated on those changes. Hopefully they help future
travelers.
2023-09-28 08:24:56 -07:00
Nick Desaulniers
35a364fa5c
[TargetLowering] fix index OOB (#67494)
I accidentally introduced this in

commit 330fa7d2a4e0 ("[TargetLowering] Deduplicate choosing InlineAsm
constraint between ISels (#67057)")

Fix forward.
2023-09-26 15:50:26 -07:00
Sam McCall
679c3a1791 [TargetLowering] use stable_sort to avoid nondeterminism
After 330fa7d2a4e0cfbb4b078 we were seeing nondeterministic failures of
llvm/test/CodeGen/ARM/thumb-big-stack.ll, with different code being
generated in different runs.

Switching sort -> stable_sort fixes this.
It looks like the old algorithm picked the first best option, and using
stable_sort restores that behavior.
2023-09-26 15:16:09 +02:00
Nick Desaulniers
330fa7d2a4
[TargetLowering] Deduplicate choosing InlineAsm constraint between ISels (#67057)
Given a list of constraints for InlineAsm (ex. "imr") I'm looking to
modify the order in which they are chosen. Before doing so, I noticed a
fair
amount of logic is duplicated between SelectionDAGISel and GlobalISel
for this.

That is because SelectionDAGISel is also trying to lower immediates
during selection. If we detangle these concerns into:
1. choose the preferred constraint
2. attempt to lower that constraint

Then we can slide down the list of constraints until we find one that
can be lowered. That allows the implementation to be shared between
instruction selection frameworks.

This makes it so that later I might only need to adjust the priority of
constraints in one place, and have both selectors behave the same.
2023-09-25 08:53:03 -07:00
Sirish Pande
e6f9483f77
[SelectionDAG] Flags are dropped when creating a new FMUL (#66701)
While simplifying some vector operators in DAG combine, we may need to
create new instructions for simplified vectors. At that time, we need to
make sure that all the flags of the new instruction are copied/modified
from the old instruction.

If "contract" is dropped from an instruction like FMUL, it may not
generate FMA instruction which would impact performance.

Here's an example where "contract" flag is dropped when FMUL is created.

Replacing.2 t42: v2f32 = fmul contract t41, t38
With: t48: v2f32 = fmul t38, t38

Co-authored-by: Sirish Pande <sirish.pande@amd.com>
2023-09-21 10:26:34 -05:00
Craig Topper
8f04d81ede [SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore.
If the SRL for Hi constant folds, but we don't remoe those bits from
the Lo, we can end up with strange constant folding through DAGCombine later.
I've only seen this with constants being lowered to constant pools
during lowering on RISC-V.
2023-09-18 09:10:19 -07:00
Yingwei Zheng
e042ff7eef
[SDAG][RISCV] Avoid expanding is-power-of-2 pattern on riscv32/64 with zbb
This patch adjusts the legality check for riscv to use `cpop/cpopw` since `isOperationLegal(ISD::CTPOP, MVT::i32)` returns false on rv64gc_zbb.
Clang vs gcc: https://godbolt.org/z/rc3s4hjPh

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D156390
2023-09-17 02:56:09 +08:00
Kazu Hirata
5fb990ac51 [SelectionDAG] Use isNullConstant (NFC) 2023-09-02 09:32:43 -07:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Simon Pilgrim
2a81396b1b [DAG] SimplifyDemandedBits - add SMIN/SMAX KnownBits comparison analysis
Followup to D158364

Also, final fix for Issue #59902 which noted that the snippet should just return 1
2023-09-01 12:42:30 +01:00