13873 Commits

Author SHA1 Message Date
Philip Reames
69edef1ab9 [DAG] Simplify control flow in SelectionDAGBuilder::visitShuffleVector [NFC]
If we've handled ==, and < above, the only case left can be >.  We don't
need to branch on this, and can instead assert and reduce indentation,
and simplify reasoning about the fallthrough path.
2024-11-01 08:59:15 -07:00
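The restructuring described in this commit can be sketched in isolation. The function and its return values below are hypothetical and are not taken from `visitShuffleVector`; the sketch only shows a final branch being replaced by an assertion once `==` and `<` have been handled.

```cpp
#include <cassert>
#include <string>

// Hypothetical three-way case split on two element counts.
std::string classifySizes(unsigned SrcNumElts, unsigned MaskNumElts) {
  if (SrcNumElts == MaskNumElts)
    return "same";
  if (SrcNumElts < MaskNumElts)
    return "pad";
  // Only '>' can remain, so assert it instead of branching on it.
  assert(SrcNumElts > MaskNumElts && "unexpected size relationship");
  return "extract";
}
```

The fallthrough path is no longer nested inside a conditional, which is the indentation saving the commit message refers to.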
Antonio Frighetto
19c8475871 [SelectionDAG] Add preliminary plumbing for samesign flag
Extend recently-added poison-generating IR flag to codegen as well.
2024-10-31 19:47:50 +01:00
Simon Pilgrim
9fb4bc5bf4
[DAG] SimplifyMultipleUseDemandedBits - ignore SRL node if we're just demanding known sign bits (#114389)
Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node.

We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits.
2024-10-31 16:40:29 +00:00
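The condition in this commit can be checked exhaustively on 8-bit values. This is a minimal sketch, not the code in `SimplifyMultipleUseDemandedBits`; the helper names `numSignBits8`, `canBypassSrl8`, and `srlBypassHolds` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading bits equal to the sign bit (always at least 1).
unsigned numSignBits8(uint8_t X) {
  unsigned Sign = X >> 7;
  unsigned N = 1;
  while (N < 8 && ((X >> (7 - N)) & 1u) == Sign)
    ++N;
  return N;
}

// True if the demanded bits of (X >> Shift) can be read directly from X:
// no shifted-in zero is demanded, and the lowest demanded bit of X is
// already a sign bit.
bool canBypassSrl8(uint8_t X, unsigned Shift, uint8_t Demanded) {
  if (Demanded == 0 || Shift == 0 || Shift >= 8)
    return false;
  uint8_t ShiftedInZeros = static_cast<uint8_t>(0xFFu << (8 - Shift));
  if (Demanded & ShiftedInZeros)
    return false;
  unsigned LowestDemanded = 0;
  while (!((Demanded >> LowestDemanded) & 1u))
    ++LowestDemanded;
  return LowestDemanded >= 8 - numSignBits8(X);
}

// Checks every 8-bit value, shift amount, and demanded mask.
bool srlBypassHolds() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned S = 1; S < 8; ++S)
      for (unsigned D = 1; D < 256; ++D)
        if (canBypassSrl8(X, S, D) && (((X >> S) & D) != (X & D)))
          return false;
  return true;
}
```

When both conditions hold, every demanded bit of the shifted value and of the source is a copy of the sign bit, so the shift can be skipped for that user.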
Benjamin Maxwell
89a8c71db6
[SDAG] Support expanding FSINCOS to vector library calls (#114039)
This shares most of its code with the scalar sincos expansion. It allows
expanding vector FSINCOS nodes to a library call from the specified
`-vector-library`. The upside of this is it will mean the vectorizer
only needs to handle the sincos intrinsic, which has no memory effects,
and this can handle lowering the intrinsic to a call that takes output
pointers.
2024-10-31 12:41:43 +00:00
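The two call shapes mentioned in this commit can be contrasted with a small sketch. `vecSinCos4` is a hypothetical stand-in for a vector library routine that writes through output pointers, and `sinCos4` is a hypothetical value-returning wrapper; neither is a real library entry point.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical vector-library-style routine: results go through
// output pointers instead of being returned.
void vecSinCos4(const float *In, float *SinOut, float *CosOut) {
  for (std::size_t I = 0; I < 4; ++I) {
    SinOut[I] = std::sin(In[I]);
    CosOut[I] = std::cos(In[I]);
  }
}

// Value-returning shape with no caller-visible memory effects.
struct SinCos4 {
  std::array<float, 4> Sin;
  std::array<float, 4> Cos;
};

// Adapts the pointer-based routine to the value-returning shape by
// providing local storage for the two outputs.
SinCos4 sinCos4(const std::array<float, 4> &X) {
  SinCos4 R{};
  vecSinCos4(X.data(), R.Sin.data(), R.Cos.data());
  return R;
}
```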
dnsampaio
28d0718033
[DAGCombiner] Add combine avg from shifts (#113909)
This teaches dagcombiner to fold:
`(asr (add nsw x, y), 1) -> (avgfloors x, y)`
`(lsr (add nuw x, y), 1) -> (avgflooru x, y)`

as well as combine them into a ceil variant:
`(avgfloors (add nsw x, y), 1) -> (avgceils x, y)` 
`(avgflooru (add nuw x, y), 1) -> (avgceilu x, y)`

iff valid for the target.

Removes some of the ARM MVE patterns that are now dead code.
It adds the avg opcodes to `IsQRMVEInstruction` so as to preserve the
immediate splatting as before.
2024-10-31 10:57:27 +01:00
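The unsigned folds above can be checked exhaustively on 8-bit values. This is a minimal sketch under the assumption that `nuw` is modelled by skipping pairs whose sum overflows; the helper names are illustrative and the signed variants are not covered here.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics, computed in a wider type so nothing overflows.
uint8_t avgFloorU(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((unsigned(X) + Y) >> 1);
}
uint8_t avgCeilU(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((unsigned(X) + Y + 1) >> 1);
}

// Checks both unsigned folds for every pair whose sum fits in 8 bits.
bool avgFoldsHold() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y) {
      if (X + Y > 255)
        continue; // models the `nuw` requirement on the add
      uint8_t Sum = static_cast<uint8_t>(X + Y);
      // (lsr (add nuw x, y), 1) == (avgflooru x, y)
      if ((Sum >> 1) != avgFloorU(X, Y))
        return false;
      // (avgflooru (add nuw x, y), 1) == (avgceilu x, y)
      if (avgFloorU(Sum, 1) != avgCeilU(X, Y))
        return false;
    }
  return true;
}
```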
Craig Topper
00cbb68fb7 [LegalizeDAG] Use getSignedConstant. NFC 2024-10-30 21:43:16 -07:00
Kazu Hirata
f582cd3dc7 [SelectionDAG] Fix a warning
This patch fixes:

  llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:1489:17: error:
  unused variable 'Flags' [-Werror,-Wunused-variable]
2024-10-30 17:49:51 -07:00
Yingwei Zheng
cf9d1c1486
[SDAG] Simplify SDNodeFlags with bitwise logic (#114061)
This patch allows using enumeration values directly and simplifies the
implementation with bitwise logic. It addresses the comment in
https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.
2024-10-31 08:10:07 +08:00
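The general technique named in this commit, a flag set stored as one integer and manipulated with bitwise operators, can be sketched as follows. `NodeFlags` and its enumerators are hypothetical and do not reproduce the actual `SDNodeFlags` definition.

```cpp
#include <cassert>

// Hypothetical flag set; names and values are illustrative only.
struct NodeFlags {
  enum : unsigned {
    None = 0,
    NoUnsignedWrap = 1u << 0,
    NoSignedWrap = 1u << 1,
    Exact = 1u << 2
  };

  unsigned Bits = None;

  NodeFlags() = default;
  NodeFlags(unsigned B) : Bits(B) {} // enum values can be passed directly

  bool has(unsigned F) const { return (Bits & F) == F; }
  void set(unsigned F, bool On) { Bits = On ? (Bits | F) : (Bits & ~F); }
  // Keep only the flags present in both sets.
  void intersectWith(NodeFlags O) { Bits &= O.Bits; }
};
```

A single mask replaces one boolean member and one setter per flag, so combining or intersecting flag sets becomes a single `|` or `&`.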
Simon Pilgrim
f7b5f0c805 [DAG] Fold (and X, (rot (not Y), Z)) -> (and X, (not (rot Y, Z)))
On ANDNOT-capable targets we can always do this profitably; without ANDNOT we only attempt this if we don't introduce an additional NOT.

Followup to #112547
2024-10-30 10:46:12 +00:00
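The fold relies on bitwise NOT commuting with rotation, which can be checked exhaustively on 8-bit values. This is a sketch with hypothetical helper names; it says nothing about when the fold is profitable on a given target.

```cpp
#include <cassert>
#include <cstdint>

// Rotate an 8-bit value left by Amt (taken modulo 8).
uint8_t rotl8(uint8_t V, unsigned Amt) {
  Amt &= 7;
  return static_cast<uint8_t>((V << Amt) | (V >> ((8 - Amt) & 7)));
}

// Checks (and X, (rot (not Y), Z)) == (and X, (not (rot Y, Z))).
bool rotNotFoldHolds() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      for (unsigned Z = 0; Z < 8; ++Z) {
        uint8_t NotY = static_cast<uint8_t>(~Y);
        uint8_t Before = static_cast<uint8_t>(X & rotl8(NotY, Z));
        uint8_t After =
            static_cast<uint8_t>(X & ~rotl8(static_cast<uint8_t>(Y), Z));
        if (Before != After)
          return false;
      }
  return true;
}
```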
Matt Arsenault
88e23eb2cf
DAG: Fix legalization of vector addrspacecasts (#113964) 2024-10-29 08:08:50 -05:00
Benjamin Maxwell
c3260c65e8
[IR] Add llvm.sincos intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.

The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).

```
declare { float, float }          @llvm.sincos.f32(float  %Val)
declare { double, double }        @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincos.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincos.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float>  %Val)
```

The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
2024-10-29 10:52:20 +00:00
Ellis Hoag
6ab26eab4f
Check hasOptSize() in shouldOptimizeForSize() (#112626) 2024-10-28 09:45:03 -07:00
Simon Pilgrim
056cf936a7
[DAG] Fold (and X, (bswap/bitreverse (not Y))) -> (and X, (not (bswap/bitreverse Y))) (#112547)
On ANDNOT-capable targets we can always do this profitably; without ANDNOT we only attempt this if we don't introduce an additional NOT.

Fixes #112425
2024-10-28 11:52:44 +00:00
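Both `bswap` and `bitreverse` only permute bit positions, so inverting every bit before or after the permutation gives the same result. The sketch below checks that identity on all 16-bit values; the helper names are hypothetical and the AND with `X` is omitted because it applies equally to both sides.

```cpp
#include <cassert>
#include <cstdint>

// Swap the two bytes of a 16-bit value.
uint16_t bswap16(uint16_t V) {
  return static_cast<uint16_t>((V << 8) | (V >> 8));
}

// Reverse the order of the 16 bits.
uint16_t bitreverse16(uint16_t V) {
  uint16_t R = 0;
  for (int I = 0; I < 16; ++I)
    if (V & (1u << I))
      R |= static_cast<uint16_t>(1u << (15 - I));
  return R;
}

// Checks f(not Y) == not f(Y) for both permutations.
bool notCommutesWithPermutations() {
  for (unsigned Y = 0; Y < 65536; ++Y) {
    uint16_t V = static_cast<uint16_t>(Y);
    uint16_t NotV = static_cast<uint16_t>(~Y);
    if (bswap16(NotV) != static_cast<uint16_t>(~bswap16(V)))
      return false;
    if (bitreverse16(NotV) != static_cast<uint16_t>(~bitreverse16(V)))
      return false;
  }
  return true;
}
```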
Dimitry Andric
4bce21480f
Ensure !NDEBUG with LLVM_ENABLE_ABI_BREAKING_CHECKS does not segfault (#113588)
In SelectionDAG, `TargetTransformInfo::hasBranchDivergence()` can be
called when both `NDEBUG` and `LLVM_ENABLE_ABI_BREAKING_CHECKS` are
enabled. In that case, the class member `TTI` is still initialized to
`nullptr`, causing a segfault.

Fix this by ensuring that all the calls to `hasBranchDivergence` and
`VerifyDAGDivergence` only occur when `NDEBUG` is disabled, and
`LLVM_ENABLE_ABI_BREAKING_CHECKS` is enabled.
2024-10-24 19:30:38 +02:00
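The guarding pattern described in this commit can be sketched with a stand-in macro. `SKETCH_ABI_BREAKING_CHECKS`, `Info`, and `Builder` are hypothetical; the sketch only shows that initialisation and use of the pointer sit behind the same condition, so no build configuration dereferences a null pointer.

```cpp
#include <cassert>

// Hypothetical stand-in for LLVM_ENABLE_ABI_BREAKING_CHECKS.
#ifndef SKETCH_ABI_BREAKING_CHECKS
#define SKETCH_ABI_BREAKING_CHECKS 1
#endif

struct Info {
  bool hasBranchDivergence() const { return true; }
};

struct Builder {
  const Info *TTI = nullptr; // only set in checked builds

  void init(const Info &I) {
#if !defined(NDEBUG) && SKETCH_ABI_BREAKING_CHECKS
    TTI = &I;
#else
    (void)I;
#endif
  }

  bool verify() const {
#if !defined(NDEBUG) && SKETCH_ABI_BREAKING_CHECKS
    return TTI->hasBranchDivergence(); // TTI was set under the same guard
#else
    return true; // never touches TTI
#endif
  }
};
```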
James Chesterman
11c818816d
[AArch64] Improve index selection for histograms (#111150)
Removes unnecessary extends on the indices passed into histogram instructions. It also removes the instruction when the mask is zero.
2024-10-22 11:14:00 +01:00
Simon Pilgrim
f0b3b6d15b [DAG] isConstantIntBuildVectorOrConstantInt - peek through bitcasts (#112710) (REAPPLIED)
Alter both isConstantIntBuildVectorOrConstantInt + isConstantFPBuildVectorOrConstantFP to return a bool instead of the underlying SDNode, and adjust usage to account for this.

Update isConstantIntBuildVectorOrConstantInt to peek through bitcasts when attempting to find a constant; in particular, this improves canonicalization of constants to the RHS on commutable instructions.

X86 is the beneficiary here as it often bitcasts rematerializable 0/-1 vector constants as vXi32 and bitcasts to the requested type

Minor cleanup that helps with #107423

Reapplied after regression fix ba1255def64a9c3c68d97ace051eec76f546eeb0
2024-10-20 14:23:21 +01:00
Simon Pilgrim
ba1255def6 [DAG] Use FoldConstantArithmetic to constant fold (and (ext (and V, c1)), c2) -> (and (ext V), (and c1, (ext c2)))
Noticed while triaging the regression from #112710 reported by @mstorsjo - don't rely on isConstantIntBuildVectorOrConstantInt+getNode to guarantee constant folding (if it fails to constant fold it will infinite loop), use FoldConstantArithmetic instead.
2024-10-20 13:05:23 +01:00
Martin Storsjö
b26df3e463 Revert "[DAG] isConstantIntBuildVectorOrConstantInt - peek through bitcasts (#112710)"
This reverts commit a630771b28f4b252e2754776b8f3ab416133951a.

This caused compilation to hang for Windows/ARM, see
https://github.com/llvm/llvm-project/pull/112710 for details.
2024-10-20 00:49:16 +03:00
Simon Pilgrim
93ec08d629 [DAG] Move SIGN_EXTEND_INREG constant folding inside FoldConstantArithmetic
Update visitSIGN_EXTEND_INREG to call FoldConstantArithmetic instead of getNode.
2024-10-19 20:57:07 +01:00
Simon Pilgrim
e1330d96a0 [DAG] visitFMA/FDIV - avoid SDLoc duplication. NFC. 2024-10-18 11:57:41 +01:00
Simon Pilgrim
5c37316b54 [DAG] visitFMA/FMAD - use FoldConstantArithmetic to add missing vector constant folding support 2024-10-18 11:12:06 +01:00
Simon Pilgrim
a630771b28
[DAG] isConstantIntBuildVectorOrConstantInt - peek through bitcasts (#112710)
Alter both isConstantIntBuildVectorOrConstantInt + isConstantFPBuildVectorOrConstantFP to return a bool instead of the underlying SDNode, and adjust usage to account for this.

Update isConstantIntBuildVectorOrConstantInt to peek through bitcasts when attempting to find a constant; in particular, this improves canonicalization of constants to the RHS on commutable instructions.

X86 is the beneficiary here as it often bitcasts rematerializable 0/-1 vector constants as vXi32 and bitcasts to the requested type

Minor cleanup that helps with #107423
2024-10-18 10:52:55 +01:00
Simon Pilgrim
3ec1b1a4dd [DAG] visitFP_EXTEND - use FoldConstantArithmetic to attempt to constant fold
Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.
2024-10-18 10:10:44 +01:00
Simon Pilgrim
3a1df05ca9 [DAG] visitFP_ROUND - use FoldConstantArithmetic to attempt to constant fold
Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.
2024-10-18 10:10:43 +01:00
Simon Pilgrim
7a43be1690 [DAG] visitXROUND - use FoldConstantArithmetic to attempt to constant fold
Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.
2024-10-18 10:10:43 +01:00
Simon Pilgrim
c72992bf89 [DAG] visitABS - use FoldConstantArithmetic to attempt to constant fold
Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.

Cleanup for #112682
2024-10-18 10:10:43 +01:00
Keith Packard
44b020a381
[PowerPC][ISelLowering] Support -mstack-protector-guard=tls (#110928)
Add support for using a thread-local variable with a specified offset
for holding the stack guard canary value. This supports both 32- and 64-
bit PowerPC targets.

This mirrors changes from #108942 but targeting PowerPC instead of
RISCV. Because both of these PRs modify the same driver functions, this
series is stacked on top of the RISC-V one.

---------

Signed-off-by: Keith Packard <keithp@keithp.com>
2024-10-17 19:06:47 -07:00
Simon Pilgrim
256bbdb3f6 [DAG] visitFCEIL/FTRUNC/FFLOOR/FNEG - use FoldConstantArithmetic to attempt to constant fold
Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.

Cleanup for #112682
2024-10-17 16:53:44 +01:00
Simon Pilgrim
cf046c8717 [DAG] visitSIGN_EXTEND_INREG - avoid SDLoc duplication. NFC. 2024-10-17 12:51:11 +01:00
Simon Pilgrim
5692a0c6f8 [DAG] visitFP_TO_SINT/FP_TO_UINT - use FoldConstantArithmetic to attempt to constant fold
Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.

Cleanup for #112682
2024-10-17 12:50:09 +01:00
Simon Pilgrim
784c15a282 [DAG] visitSINT_TO_FP/UINT_TO_FP - use FoldConstantArithmetic to attempt to constant fold
Don't rely on isConstantIntBuildVectorOrConstantInt followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.

Cleanup for #112682
2024-10-17 12:50:09 +01:00
Simon Pilgrim
8268bc48eb [DAG] Avoid SDLoc duplication in FP<->INT combines. NFC. 2024-10-17 12:50:09 +01:00
Matt Arsenault
067e8b8dc5
DAG: Lower fcNormal is.fpclass to compare with inf (#100389) 2024-10-17 15:49:13 +04:00
Nikita Popov
255a99c29f
[APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309)
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.

The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.

Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
2024-10-17 08:48:08 +02:00
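The property this assertion enforces, that a value is representable as an N-bit signed or unsigned integer, can be sketched as a standalone predicate. `fitsInBits` is a hypothetical helper and is not the check inside the `APInt` constructor.

```cpp
#include <cassert>
#include <cstdint>

// Does the 64-bit payload Val fit in an N-bit integer when interpreted as
// signed (IsSigned) or unsigned (!IsSigned)? Assumes 1 <= N <= 64.
bool fitsInBits(uint64_t Val, unsigned N, bool IsSigned) {
  if (N >= 64)
    return true;
  if (!IsSigned)
    return (Val >> N) == 0;
  int64_t S = static_cast<int64_t>(Val);
  int64_t Min = -(int64_t(1) << (N - 1));
  int64_t Max = (int64_t(1) << (N - 1)) - 1;
  return S >= Min && S <= Max;
}
```

For example, a payload of `-1` fits in 8 bits when treated as signed but not when treated as unsigned, which is the kind of mismatch the commit message attributes to a wrong `isSigned` value.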
Tex Riddell
875afa939d
[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

Based on example PR #96222 and fix PR #101268, with some differences due
to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp).

- Add llvm.experimental.constrained.atan2 - Intrinsics.td,
ConstrainedOps.def, LangRef.rst
- Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in
BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp
- Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp,
and LegalizeVectorTypes.cpp
- Update isKnownNeverNaN in SelectionDAG.cpp
- Update SelectionDAGDumper.cpp
- Update libcalls - RuntimeLibcalls.def, RuntimeLibcalls.cpp
- TargetLoweringBase.cpp - Expand for vectors, promote f16
- X86ISelLowering.cpp - Expand f80, promote f32 to f64 for MSVC

Part 4 for Implement the atan2 HLSL Function #70096.
2024-10-16 11:43:17 -07:00
Lewis Crawford
f5f00764ab
[DAGCombiner] Fix check for extending loads (#112182)
Fix a check for extending loads in DAGCombiner,
where if the result type has more bits than the
loaded type it should count as an extending load.

All backends apart from AArch64 ignore this
ExtTy argument to shouldReduceLoadWidth, so this
change currently only impacts AArch64.
2024-10-16 13:23:46 +01:00
Simon Pilgrim
25b702f263 [DAG] visitXOR - add missing comment for or/and constant setcc demorgan fold. NFC.
Noticed while triaging #112347 which is using this fold - we described the or->and fold, but not the equivalent and->or which is also handled.
2024-10-16 11:15:36 +01:00
Simon Pilgrim
30deb76d46 [DAG] visitXOR - add missing comment for or/and constant demorgan fold. NFC.
Noticed while triaging #112347 which is using this fold - we described the or->and fold, but not the equivalent and->or which is also handled.
2024-10-15 16:32:27 +01:00
c8ef
854ded9b24
Reapply "[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes." (#112203)
This patch adds icmp+select patterns for integer min/max matchers in
SDPatternMatch, similar to those in IR PatternMatch.

Reapply #111774.

Closes #108218.
2024-10-15 21:07:06 +08:00
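The equivalence behind matching compare+select as min/max can be shown on scalars. The functions below are hypothetical illustrations of the pattern shape and do not use the SDPatternMatch API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// select (icmp slt A, B), A, B  behaves like a signed minimum.
int32_t selectSMin(int32_t A, int32_t B) { return (A < B) ? A : B; }

// select (icmp sgt A, B), A, B  behaves like a signed maximum.
int32_t selectSMax(int32_t A, int32_t B) { return (A > B) ? A : B; }

// select (icmp ult A, B), A, B  behaves like an unsigned minimum.
uint32_t selectUMin(uint32_t A, uint32_t B) { return (A < B) ? A : B; }
```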
Paul Walker
d27394abf0
[LLVM][SelectionDAG] Ensure Constant[FP]SDnode only store references to scalar Constant{Int,FP}. (#111005)
This fixes a failure path when the use-constant-##-for-###-splat IR
options are enabled.
2024-10-15 10:56:41 +01:00
Michael Marjieh
b5600c6f85
[TargetLowering][SelectionDAG] Exploit nneg Flag in UINT_TO_FP (#108931)
1. Propagate the nneg flag in WidenVecRes
2. Use SINT_TO_FP in expandUINT_TO_FP when possible.
2024-10-14 20:55:48 +04:00
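The reasoning behind the second point is that signed and unsigned conversions agree whenever the sign bit of the input is clear. A minimal scalar sketch, with hypothetical helper names and `double` chosen so every 32-bit integer converts exactly:

```cpp
#include <cassert>
#include <cstdint>

// Unsigned and signed interpretations of the same 32-bit pattern.
double uintToFP(uint32_t V) { return static_cast<double>(V); }
double sintToFP(uint32_t V) {
  return static_cast<double>(static_cast<int32_t>(V));
}

// True if the signed conversion can stand in for the unsigned one.
bool conversionsAgree(uint32_t V) { return uintToFP(V) == sintToFP(V); }
```

With the sign bit set the two conversions differ, which is why the substitution needs the non-negative guarantee.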
c8ef
a3b0c31ebc
Revert "[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes." (#112200)
Reverts llvm/llvm-project#111774

This appears to be causing some tests to fail.
2024-10-14 21:43:49 +08:00
c8ef
11f625cb87
[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes. (#111774)
Closes #108218.

This patch adds icmp+select patterns for integer min/max matchers in
SDPatternMatch, similar to those in IR PatternMatch.
2024-10-14 21:19:34 +08:00
duk
464a7ee79e
[CodeGen] Generalize trap emission after SP check fail (#109744)
Generalize and improve some target-specific code that emits traps after
stack protector failure in SelectionDAG & GlobalIsel.
2024-10-12 20:01:22 -04:00
Kazu Hirata
a62768c427
[CodeGen] Simplify code with *Map::operator[] (NFC) (#112075) 2024-10-11 23:01:21 -07:00
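The kind of simplification named in this commit title can be illustrated with `std::map`, used here as a stand-in for LLVM's map types. Both functions are hypothetical; they show a lookup-then-insert sequence collapsing into a single `operator[]` call, which default-constructs a missing value.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Explicit find + insert.
std::map<std::string, unsigned>
countVerbose(const std::vector<std::string> &Words) {
  std::map<std::string, unsigned> Counts;
  for (const std::string &W : Words) {
    auto It = Counts.find(W);
    if (It == Counts.end())
      It = Counts.insert({W, 0}).first;
    ++It->second;
  }
  return Counts;
}

// Same behaviour: operator[] inserts a zero count when the key is new.
std::map<std::string, unsigned>
countConcise(const std::vector<std::string> &Words) {
  std::map<std::string, unsigned> Counts;
  for (const std::string &W : Words)
    ++Counts[W];
  return Counts;
}
```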
Oliver Stannard
1e49670b31
[DAGISel] Keep flags when converting FP load/store to integer (#111679)
This DAG combine replaces a floating-point load/store pair which has no
other uses with an integer one, but did not copy the memory operand
flags to the new instructions, resulting in it dropping the volatile
flag. This optimisation is still valid if one or both of the
instructions is volatile, so we can copy over the whole
MachineMemOperand to generate volatile integer loads and stores where
needed.
2024-10-10 09:17:50 +01:00
YunQiang Su
8d35ab80fc
AArch64: Add FMINNUM_IEEE and FMAXNUM_IEEE support (#107855)
FMINNM/FMAXNM instructions of AArch64 follow IEEE754-2008. We can use
them to canonicalize a floating point number. And
FMINNUM_IEEE/FMAXNUM_IEEE is used by something like expanding
FMINIMUMNUM/FMAXIMUMNUM, so let's define them.

Update combine_andor_with_cmps.ll.
Add fp-maximumnum-minimumnum.ll, with nnan testcases only.

V1F64 is not supported yet.
If we set v1f64 as legal, FMINNUM/FMAXNUM will have some problem:
   both of them use `if (isOperationLegalOrCustom(FMAXNUM_IEEE, VT))`.

AArch64 depends on `expandFMINNUM_FMAXNUM` returning `SDValue()`
for FMAXNUM and FMINNUM.

We should fix this problem, but that will be done in a future patch.
2024-10-10 15:09:47 +08:00
YunQiang Su
d52c8408ff
SelectionDAG/expandFMINNUM_FMAXNUM: skips vector if SETCC/VSELECT is not legal (#109570)
If SETCC or VSELECT is not legal for the vector type, we should not expand it;
instead we can split the vector.

That way, some simple scalar instructions can be emitted instead of
pairs of comparison+selection.
2024-10-10 08:39:25 +08:00
Matt Arsenault
ced15cd418
DAG: Preserve more flags when expanding gep (#110815)
This allows selecting the addressing mode for stack instructions
in cases where we need to prove the sign bit is zero.
2024-10-09 13:51:52 +04:00
Simon Pilgrim
1dcb6dc757 [DAG] foldVSelectToSignBitSplatMask - pull out repeated code and use getShiftAmountConstant helper.
We're assuming shift amount type matches the result type - which is true for vectors, but I'm hoping to generalize this fold in the future.
2024-10-08 17:36:34 +01:00