1709 Commits

Author SHA1 Message Date
Nikita Popov
e9e7b2adcf
[SDAG] Set IsPostTypeLegalization flag in LegalizeDAG (#122278)
This runs after type legalization and as such should set
IsPostTypeLegalization when creating libcalls. I don't think this makes
any observable difference right now, but I ran into this issue in an
upcoming patch.
2025-01-10 12:25:36 +01:00
Craig Topper
104ad9258a
[SelectionDAG] Rename SDNode::uses() to users(). (#120499)
This function is most often used in range based loops or algorithms
where the iterator is implicitly dereferenced. The dereference returns
an SDNode * of the user rather than SDUse * so users() is a better name.

I've long been annoyed that we can't write a range based loop over
SDUse when we need getOperandNo. I plan to rename use_iterator to
user_iterator and add a use_iterator that returns SDUse& on dereference.
This will make it more like IR.
2024-12-18 20:09:33 -08:00
Craig Topper
b076fbb844
[TargetLowering] Use Type* instead of EVT in shouldSignExtendTypeInLibCall. (#118587)
I want to use this function for GISel too so Type * is a better common
interface. All of the callers already convert EVT to Type * as needed
by calling lowering anyway.
2024-12-03 22:06:55 -08:00
Craig Topper
caa8aa551b
[SelectionDAG] Rename CallOptions::IsSExt to IsSigned. NFC (#118574)
This is eventually passed to shouldSignExtendTypeInLibCall which calls
it IsSigned.
2024-12-03 18:25:44 -08:00
Craig Topper
bc282605df
[SelectionDAG] Require last operand of (STRICT_)FP_ROUND to be a TargetConstant. (#117639)
Fix all the places I could find that didn't do this. We were already
mostly correct for FP_ROUND after
9a976f36615dbe15e76c12b22f711b2e597a8e51, but not STRICT_FP_ROUND.
2024-11-25 21:36:33 -08:00
Craig Topper
c2bb056482
[SelectionDAG][RISCV][AArch64] Allow f16 STRICT_FLDEXP to be promoted. Fix integer promotion of STRICT_FLDEXP in type legalizer. (#117633)
A special case in type legalization wasn't accounting for different
operand numbering between FLDEXP and STRICT_FLDEXP.

AArch64 already asked STRICT_FLDEXP to be promoted, but had no test for
it.
2024-11-25 16:12:45 -08:00
Nikita Popov
3317c9ceac
[AMDGPU] Use getSignedConstant() where necessary (#117328)
Create signed constant using getSignedConstant(), to avoid future
assertion failures when we disable implicit truncation in getConstant().

This also touches some generic legalization code, which apparently only
AMDGPU tests.
2024-11-25 09:49:34 +01:00
Benjamin Maxwell
ea6b8fa4b9
[SDAG] Merge multiple-result libcall expansion into DAG.expandMultipleResultFPLibCall() (#114792)
This merges the logic for expanding both FFREXP and FSINCOS into one
method `DAG.expandMultipleResultFPLibCall()`. This reduces duplication
and also allows FFREXP to benefit from the stack slot elimination
implemented for FSINCOS. This method will also be used in future to
implement more multiple-result intrinsics (such as modf and sincospi).
2024-11-06 11:06:06 +00:00
Benjamin Maxwell
89a8c71db6
[SDAG] Support expanding FSINCOS to vector library calls (#114039)
This shares most of its code with the scalar sincos expansion. It allows
expanding vector FSINCOS nodes to a library call from the specified
`-vector-library`. The upside of this is it will mean the vectorizer
only needs to handle the sincos intrinsic, which has no memory effects,
and this can handle lowering the intrinsic to a call that takes output
pointers.
2024-10-31 12:41:43 +00:00
Craig Topper
00cbb68fb7 [LegalizeDAG] Use getSignedConstant. NFC 2024-10-30 21:43:16 -07:00
Yingwei Zheng
cf9d1c1486
[SDAG] Simplify SDNodeFlags with bitwise logic (#114061)
This patch allows using enumeration values directly and simplifies the
implementation with bitwise logic. It addresses the comment in
https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.
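
As a rough illustration of the idea, here is a minimal sketch of a flags class built on plain bitwise logic. `NodeFlags` and its enumerators are hypothetical stand-ins chosen for this example; they do not mirror the real `SDNodeFlags` definition.

```cpp
#include <cassert>

// Hypothetical stand-in for a node-flags class; the names below are
// illustrative and do not mirror the real SDNodeFlags definition.
class NodeFlags {
public:
  enum : unsigned {
    None = 0,
    NoUnsignedWrap = 1u << 0,
    NoSignedWrap = 1u << 1,
    Exact = 1u << 2,
  };

  NodeFlags(unsigned Flags = None) : Bits(Flags) {}

  // True if every flag in Mask is set.
  bool has(unsigned Mask) const { return (Bits & Mask) == Mask; }

  // Set or clear all flags in Mask with a single bitwise operation.
  void set(unsigned Mask, bool On) {
    Bits = On ? (Bits | Mask) : (Bits & ~Mask);
  }

  // Keep only the flags present in both operands.
  void intersectWith(NodeFlags Other) { Bits &= Other.Bits; }

private:
  unsigned Bits;
};

// Usage: enumeration values combine directly with `|`.
void demoNodeFlags() {
  NodeFlags F(NodeFlags::NoUnsignedWrap | NodeFlags::Exact);
  assert(F.has(NodeFlags::Exact));
  F.intersectWith(NodeFlags(NodeFlags::Exact | NodeFlags::NoSignedWrap));
  assert(F.has(NodeFlags::Exact) && !F.has(NodeFlags::NoUnsignedWrap));
}
```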
2024-10-31 08:10:07 +08:00
Matt Arsenault
88e23eb2cf
DAG: Fix legalization of vector addrspacecasts (#113964) 2024-10-29 08:08:50 -05:00
Benjamin Maxwell
c3260c65e8
[IR] Add llvm.sincos intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.

The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).

```
declare { float, float }          @llvm.sincos.f32(float  %Val)
declare { double, double }        @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincos.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincos.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float>  %Val)
```

The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
2024-10-29 10:52:20 +00:00
Tex Riddell
875afa939d
[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

Based on example PR #96222 and fix PR #101268, with some differences due
to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp).

- Add llvm.experimental.constrained.atan2 - Intrinsics.td,
ConstrainedOps.def, LangRef.rst
- Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in
BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp
- Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp,
and LegalizeVectorTypes.cpp
- Update isKnownNeverNaN in SelectionDAG.cpp
- Update SelectionDAGDumper.cpp
- Update libcalls - RuntimeLibcalls.def, RuntimeLibcalls.cpp
- TargetLoweringBase.cpp - Expand for vectors, promote f16
- X86ISelLowering.cpp - Expand f80, promote f32 to f64 for MSVC

Part 4 for Implement the atan2 HLSL Function #70096.
2024-10-16 11:43:17 -07:00
Ralf Jung
29ec0716a8
Fix comment typo in ExpandFCOPYSIGN (#111489)
I noticed this while following
https://github.com/llvm/llvm-project/pull/111269. It makes little sense
that FCOPYSIGN would look at the sign of `x`, right? Surely this must be
`y`. Also fix the inconsistency where it's sometimes `x` and sometimes
`X`.
2024-10-08 12:47:56 +04:00
Luke Lau
487686b82e
[SDAG][RISCV] Don't promote VP_REDUCE_{FADD,FMUL} (#111000)
In https://reviews.llvm.org/D153848, promotion was added for a variety
of f16 ops with zvfhmin, including VP reductions.

However I don't believe it's correct to promote f16 fadd or fmul
reductions to f32 since we need to round the intermediate results.

Today if we lower @llvm.vp.reduce.fadd.nxv1f16 on RISC-V, we'll get two
different results depending on whether we compiled with +zvfh or
+zvfhmin, for example with a 3 element reduction:

	; v9 = [0.1563, 5.97e-8, 0.00006104]

	; zvfh
	vsetivli x0, 3, e16, m1, ta, ma
	vmv.v.i v8, 0
	vfredosum.vs v8, v9, v8
	vfmv.f.s fa0, v8
	; fa0 = 0.1563

	; zvfhmin
	vsetivli x0, 3, e16, m1, ta, ma
	vfwcvt.f.f.v v10, v9
	vsetivli x0, 3, e32, m1, ta, ma
	vmv.v.i v8, 0
	vfredosum.vs v8, v10, v8
	vfmv.f.s fa0, v8
	fcvt.h.s fa0, fa0
	; fa0 = 0.1564

This same thing happens with reassociative reductions e.g. vfredusum.vs,
and this also applies for bf16.
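
The effect can be reproduced without f16 hardware. The sketch below is an analogy only: it uses float as the narrow type and double as the promoted type, and the helper names `sumNarrow` and `sumPromoted` are made up for this example. With the inputs 1.0, 2^-24, 2^-24, the narrow sum rounds each small addend away, while the promoted sum keeps them and rounds once at the end.

```cpp
#include <cassert>
#include <cstddef>

// Ordered sum carried out entirely in float; the volatile accumulator forces
// every intermediate result to be rounded to float.
float sumNarrow(const float *V, std::size_t N) {
  assert(V != nullptr || N == 0);
  volatile float Acc = 0.0f;
  for (std::size_t I = 0; I < N; ++I)
    Acc = Acc + V[I];
  return Acc;
}

// "Promoted" ordered sum: accumulate in double, round to float once at the
// end. This models promoting the reduction to a wider type.
float sumPromoted(const float *V, std::size_t N) {
  assert(V != nullptr || N == 0);
  double Acc = 0.0;
  for (std::size_t I = 0; I < N; ++I)
    Acc += V[I];
  return static_cast<float>(Acc);
}
```

The two functions disagree on `{1.0f, 5.9604644775390625e-08f, 5.9604644775390625e-08f}`, which is the float/double counterpart of the f16/f32 mismatch shown above.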

I couldn't find anything in the LangRef for reductions that suggest the
excess precision is allowed. There may be something we can do in Clang
with -fexcess-precision=fast, but I haven't looked into this yet.

I presume the same precision issue occurs with fmul, but not with
fmin/fmax/fminimum/fmaximum.

I can't think of another way of lowering these other than scalarizing,
and we can't scalarize scalable vectors, so this just removes the
promotion and adjusts the cost model to return an invalid cost. (It
looks like we also don't currently cost fmul reductions, so presumably
they also have an invalid cost?)

I think this should be enough to stop the loop vectorizer or SLP from
emitting these intrinsics.
2024-10-04 00:17:45 +08:00
Timothy Pearson
90c1474863
[SDAG] Honor signed arguments in floating point libcalls (#109134)
In ExpandFPLibCall, an assumption is made that all floating point
libcalls that take integer arguments use unsigned integers. In the case
of ldexp and frexp, this assumption is incorrect, leading to
miscompilation and subsequent target-dependent incorrect operation.

Indicate that ldexp and frexp utilize signed arguments in
ExpandFPLibCall.
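
To see why the signedness matters, consider widening a 32-bit exponent argument to a 64-bit register. This is a simplified sketch; `zeroExtend32` and `signExtend32` are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Models widening a 32-bit libcall argument into a 64-bit register.
// Zero-extension is only correct for unsigned arguments.
int64_t zeroExtend32(int32_t Arg) {
  return static_cast<int64_t>(static_cast<uint32_t>(Arg));
}

// Sign-extension preserves the value of a signed argument.
int64_t signExtend32(int32_t Arg) { return static_cast<int64_t>(Arg); }

// Usage: the two extensions agree only for non-negative arguments.
void demoExtension() {
  assert(zeroExtend32(7) == signExtend32(7));
  assert(zeroExtend32(-1) != signExtend32(-1));
}
```

A negative exponent passed to ldexp with the wrong extension would therefore reach the callee as a large positive value on targets that read the full register.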

Fixes #108904

Signed-off-by: Timothy Pearson <tpearson@solidsilicon.com>
2024-09-25 11:09:50 +04:00
Benjamin Maxwell
3073c3c229
[SDAG] Avoid creating redundant stack slots when lowering FSINCOS (#108401)
When lowering `FSINCOS` to a library call (that takes output pointers)
we can avoid creating new stack allocations if the results of the
`FSINCOS` are being stored. Instead, we can take the destination
pointers from the stores and pass those to the library call.

---

Note: As a NFC this also adds (and uses) `RTLIB::getFSINCOS()`.
2024-09-24 13:36:21 +01:00
Max Beck-Jones
1693d8eb9a
[AArch64][SelectionDAG] Vector splitting and promotion for histogram intrinsic (#103037)
Adds support for wider-than-legal vector types for the histogram
intrinsic (llvm.experimental.vector.histogram.add) by splitting the
vector. Also adds integer promotion for the Inc operand.
2024-08-30 08:54:12 +01:00
Craig Topper
f7d94b783f [SelectionDAG] Use getAllOnesConstant. 2024-08-17 17:57:05 -07:00
Craig Topper
7afb51e035
[SelectionDAG][X86] Add SelectionDAG::getSignedConstant and use it in a few places. (#104555)
PR #80309 proposes to have users of APInt's uint64_t
constructor opt-in to implicit truncation. Currently, that patch
requires SelectionDAG::getConstant to opt-in.

This patch adds getSignedConstant so we can start fixing some of the
cases that require implicit truncation.
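
The difference between the two constructors can be sketched as a range check. The helpers `fitsUnsigned` and `fitsSigned` below are hypothetical and only model the kind of check that becomes possible once implicit truncation is disabled; they are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not LLVM APIs: they model the check an N-bit
// constant constructor could perform once implicit truncation is disabled.

// True if Val is representable as an unsigned Bits-wide integer.
bool fitsUnsigned(uint64_t Val, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "unsupported bit width");
  return Bits == 64 || (Val >> Bits) == 0;
}

// True if Val is representable as a signed (two's complement) Bits-wide
// integer.
bool fitsSigned(int64_t Val, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "unsupported bit width");
  if (Bits == 64)
    return true;
  int64_t Min = -(int64_t(1) << (Bits - 1));
  int64_t Max = (int64_t(1) << (Bits - 1)) - 1;
  return Val >= Min && Val <= Max;
}
```

A value such as -1 fails the unsigned check for a 32-bit constant (all 64 bits are set) but passes the signed check, which is the case a signed constructor is meant to cover.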
2024-08-16 09:21:11 -07:00
YunQiang Su
fb9e685fc4
Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649)
C23 introduced new functions fminimum_num and fmaximum_num, and they
follow the minimumNumber and maximumNumber of IEEE754-2019. Let's
introduce new intrinsics to support them.

This patch introduces support for scalar values only. The
support of
  vector (vp, vp.reduce, vector.reduce),
  experimental.constrained
will be added in future patches.

With this patch, MIPSr6 and LoongArch can work out of the box with
fcanonical and fmax/fmin.

AArch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, but
they have no fcanonical support yet.
I will add it in future patches.

The FMIN/FMAX of RISC-V instructions follows the
minimumNumber/maximumNumber of IEEE754-2019. We can just add it in a
future patch.

Background

https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735
Currently we have fminnum/fmaxnum, which behave differently on
different platforms for NUM vs sNaN:
   1) Fallback to fmin(3)/fmax(3): return qNaN.
   2) ARM64/ARM32+Neon: same as libc.
   3) MIPSr6/LoongArch/RISC-V: return NUM.

And the fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008
will be submitted as separate patches.
2024-08-15 14:09:36 +08:00
Craig Topper
51bad732dc [SelectionDAG] Replace EVTToAPFloatSemantics with MVT/EVT::getFltSemantics. (#103001) 2024-08-13 11:35:28 -07:00
Kazu Hirata
f4fb735840
[llvm] Construct SmallVector<SDValue> with ArrayRef (NFC) (#102578) 2024-08-09 09:15:42 -07:00
Manish Kausik H
259742a885
[SelectionDAG] Use unaligned store/load to move AVX registers onto stack for insertelement (#82130)
Prior to this patch, SelectionDAG generated aligned move onto stacks for
AVX registers when the function was marked as a no-realign-stack
function. This led to misalignment between the stack and the
instruction generated. This patch fixes the issue. There was a similar
issue reported for `extractelement` which was fixed in
a6614ec5b7c1dbfc4b847884c5de780cf75e8e9c

Co-authored-by: Manish Kausik H <hmamishkausik@gmail.com>
2024-08-09 15:39:54 +01:00
Sergei Barannikov
4527fba9ad
Revert "[SDag][ARM][RISCV] Allow lowering CTPOP into a libcall" (#101740)
Reverts the rest of llvm/llvm-project#99752
2024-08-03 01:51:26 +03:00
Sergei Barannikov
92e18ffd80
[SDag][ARM][RISCV] Allow lowering CTPOP into a libcall (#99752)
The main change is adding CTPOP to `RuntimeLibcalls.def` to allow
targets to use LibCall action for CTPOP. DAG legalizers are changed
accordingly.
2024-08-02 12:29:39 +03:00
Sumanth Gundapaneni
0ee32c4573
[AMDGPU] Implement llvm.lrint intrinsic lowering (#98931)
This patch enables the target-independent lowering of llvm.lrint via
GlobalISel.
For SelectionDAG, the intrinsic is custom lowered for AMDGPU.
2024-07-24 23:34:31 +04:00
Sumanth Gundapaneni
fc832d5349
[AMDGPU] Implement llvm.lround intrinsic lowering. (#98970)
This patch enables the target-independent lowering of llvm.lround via
GlobalISel. For SelectionDAG, the intrinsic is custom lowered for
AMDGPU. In order to support vector floating point input for llvm.lround,
this patch extends the target independent APIs and provides support for
scalarizing. pr98950 is needed to let the verifier allow vector floating
point types.
2024-07-23 20:34:34 +04:00
Joseph Huber
615b7eeaa9 Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.

I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
2024-07-20 09:29:31 -05:00
NAKAMURA Takumi
740161a9b9 Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69.
(llvmorg-19-init-17714-gc05126bdfc3b)
See #99610
2024-07-20 12:36:57 +09:00
Joseph Huber
c05126bdfc
[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevents internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts-out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.

This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step, but do not want the overhead of
uncalled functions. (This adds like a second to the link time trivially)
2024-07-16 06:22:09 -05:00
Kazu Hirata
66cd2e0f9a
[CodeGen] Use range-based for loops (NFC) (#98706) 2024-07-13 13:29:47 -07:00
Farzon Lotfi
0b58f34c98
[X86][CodeGen] Add base trig intrinsic lowerings (#96222)
This change is an implementation of
https://github.com/llvm/llvm-project/issues/87367's investigation on
supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds constrained intrinsics and some lowering cases for
`acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`.
The only x86 specific change was for f80.

https://github.com/llvm/llvm-project/issues/70079
https://github.com/llvm/llvm-project/issues/70080
https://github.com/llvm/llvm-project/issues/70081
https://github.com/llvm/llvm-project/issues/70083
https://github.com/llvm/llvm-project/issues/70084
https://github.com/llvm/llvm-project/issues/95966
    
The x86 lowering is going to be done in three pr changes with this being
the first.
A second PR will be put up for Loop Vectorizing and then SLPVectorizer.

The constrained intrinsics are also going to be in multiple parts, but
just 2.
This part covers just the llvm specific changes, part 2 will cover clang
specific changes and legalization for backends that have special
legalization requirements like aarch64 and wasm.
2024-07-11 15:58:43 -04:00
Manish Kausik H
69192e0193
[LegalizeDAG] Optimize CodeGen for ISD::CTLZ_ZERO_UNDEF (#83039)
Previously we had the same instructions being generated for `ISD::CTLZ` and `ISD::CTLZ_ZERO_UNDEF` which did not take advantage of the fact that zero is an invalid input for `ISD::CTLZ_ZERO_UNDEF`. This commit separates codegen for the two cases to allow for the optimization for the latter case.

The details of the optimization are outlined in #82075
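
As an illustration of how an undefined zero input can be exploited, consider an i8 count promoted to i32. This sketch is an assumption made for the example and is not taken from the patch; #82075 describes the actual optimization. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reference count-leading-zeros on a 32-bit value; returns 32 for zero.
unsigned ctlz32(uint32_t X) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (X & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// i8 CTLZ promoted to i32: zero-extend, count, then subtract the 24 extra
// bits. Well defined for X == 0 (returns 8).
unsigned ctlz8ViaSubtract(uint8_t X) {
  return ctlz32(static_cast<uint32_t>(X)) - 24;
}

// i8 CTLZ_ZERO_UNDEF promoted to i32: shift the value into the top bits so
// no correction is needed afterwards. Only valid when X != 0.
unsigned ctlz8ZeroUndefViaShift(uint8_t X) {
  assert(X != 0 && "zero input is undefined for CTLZ_ZERO_UNDEF");
  return ctlz32(static_cast<uint32_t>(X) << 24);
}
```

Both forms agree for every non-zero input; only the first has a meaningful answer for zero, which is why the second is reserved for the zero-undef opcode.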

Fixes #82075

Co-authored-by: Manish Kausik H <hmamishkausik@gmail.com>
2024-07-08 14:01:32 +01:00
Matt Arsenault
2df2373eb8
DAG/GlobalISel: Set disjoint for or in copysign lowering (#97057)
We masked out the sign bit from one value, and the non-sign bits
from the other so there should be no common bits set.
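
A minimal bit-level sketch of that lowering for 32-bit floats follows. `copySignBits` is a hypothetical name, and the assertion inside it states the property the disjoint flag records.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");

// Bitwise copysign for float: magnitude bits from Mag, sign bit from Sgn.
float copySignBits(float Mag, float Sgn) {
  uint32_t M, S;
  std::memcpy(&M, &Mag, sizeof M);
  std::memcpy(&S, &Sgn, sizeof S);
  const uint32_t SignMask = 0x80000000u;
  uint32_t MagBits = M & ~SignMask; // sign bit cleared
  uint32_t SignBit = S & SignMask;  // only the sign bit kept
  // The two operands share no set bits, so the `or` below is disjoint.
  assert((MagBits & SignBit) == 0 && "operands must be disjoint");
  uint32_t R = MagBits | SignBit;
  float Result;
  std::memcpy(&Result, &R, sizeof Result);
  return Result;
}
```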

No idea how to test this on the DAG path, other than scraping
the debug logs. A few targets hit this path with f16 values, but
the resulting i16 ors get anyext promoted and lose the disjoint
flag. In the fp128 case, PPC gets further and the or loses the flag
somewhere else later. Adding a haveNoCommonBits assert shows this
works though.
2024-06-28 23:03:39 +02:00
Nikita Popov
f2f18459d4 Revert "Intrinsic: introduce minimumnum and maximumnum (#93841)"
As far as I can tell, this pull request was not approved, and
did not go through an RFC on discourse.

This reverts commit 89881480030f48f83af668175b70a9798edca2fb.
This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.
2024-06-21 08:34:04 +02:00
YunQiang Su
8988148003
Intrinsic: introduce minimumnum and maximumnum (#93841)
Currently, on different platforms, the behavior of llvm.minnum is
different if one operand is sNaN:

When we compare sNaN vs NUM:

ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN.
RISC-V/Hexagon: follow the IEEE754-2019's minimumNumber: return NUM.
X86: returns NUM, but not the same as IEEE754-2019's minimumNumber, as
     +0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.

So, let's introduce llvm.minimumnum/llvm.maximumnum, which always follow
IEEE754-2019's minimumNumber/maximumNumber.
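
A rough scalar sketch of the intended semantics is shown below. It is a simplification: signaling behaviour and exception flags are not modelled, and the function name is chosen for this example only.

```cpp
#include <cassert>
#include <cmath>

// Sketch of minimumNumber-style semantics (signaling behaviour not
// modelled): a NaN operand is ignored in favour of a number, and -0.0
// orders below +0.0.
double minimumNumber(double A, double B) {
  if (std::isnan(A))
    return B; // B may itself be NaN; then the result is NaN
  if (std::isnan(B))
    return A;
  if (A == B)
    return std::signbit(A) ? A : B; // picks -0.0 over +0.0
  return A < B ? A : B;
}

// Usage: NaN vs NUM returns NUM, and the zeros are ordered by sign.
void demoMinimumNumber() {
  assert(minimumNumber(std::nan(""), 1.0) == 1.0);
  assert(std::signbit(minimumNumber(0.0, -0.0)));
}
```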

Half-fix: #93033
2024-06-21 11:53:08 +08:00
Poseydon42
995835fe6d
[SelectionDAG] Add support for the 3-way comparison intrinsics [US]CMP (#91871)
This PR adds initial support for the `scmp`/`ucmp` 3-way comparison
intrinsics in the SelectionDAG. Some of the expansions/lowerings
are not optimal yet.
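
For reference, the semantics of the two comparisons can be written as a branch-free expression. This is one common formulation, shown here as an illustration; it is not necessarily the expansion chosen by this PR.

```cpp
#include <cassert>
#include <cstdint>

// Signed three-way comparison: -1 if A < B, 0 if equal, 1 if A > B.
int scmp(int32_t A, int32_t B) { return (A > B) - (A < B); }

// Unsigned three-way comparison with the same result encoding.
int ucmp(uint32_t A, uint32_t B) { return (A > B) - (A < B); }

// Usage: the same bit pattern orders differently under the two predicates.
void demoThreeWayCompare() {
  assert(scmp(-1, 1) == -1);
  assert(ucmp(0xFFFFFFFFu, 1u) == 1);
}
```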
2024-06-17 11:16:52 +02:00
Simon Pilgrim
ea2ee5dc2f
[DAG] Add legalization handling for AVGCEIL/AVGFLOOR nodes (#92096)
Always match AVG patterns pre-legalization, and use TargetLowering::expandAVG to expand again during legalization.

I've removed the X86 custom AVGCEILU pattern detection and replaced with combines to try and convert other AVG nodes to AVGCEILU.
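
For readers unfamiliar with these nodes, the unsigned averages can be computed without widening by using well-known bit identities. The sketch below shows those identities on 8-bit values; whether `expandAVG` emits exactly these forms is an assumption, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned floor average without overflow: floor((A + B) / 2).
uint8_t avgFloorU(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>((A & B) + ((A ^ B) >> 1));
}

// Unsigned ceiling average without overflow: ceil((A + B) / 2).
uint8_t avgCeilU(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>((A | B) - ((A ^ B) >> 1));
}

// Exhaustively compares both identities against a widened reference.
void checkAverages() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t X = static_cast<uint8_t>(A), Y = static_cast<uint8_t>(B);
      assert(static_cast<unsigned>(avgFloorU(X, Y)) == (A + B) / 2);
      assert(static_cast<unsigned>(avgCeilU(X, Y)) == (A + B + 1) / 2);
    }
}
```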
2024-06-12 14:11:07 +01:00
Farzon Lotfi
1d87433593
[x86] Add tan intrinsic part 4 (#90503)
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294


Much of this change was following how G_FSIN and G_FCOS were used.

Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
-  `llvm/docs/LangRef.rst` - Document the tan intrinsic
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan
intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to
`ISD::FTAN`
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for
`FTAN` and `STRICT_FTAN`
-  `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall
mappings
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map
`G_FTAN` to `ftan`
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`,
`strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN`
and `STRICT_FTAN`
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a
vector intrinsic
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` Map the tan intrinsic
to `G_FTAN` Opcode
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to
the list of floating point math operations also associate `G_FTAN` with
the `TAN_F` runtime lib.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math
operation common behaviors.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp` - List the function
expansion operations for `FTAN` and `STRICT_FTAN`. Also define both
opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN`
and `STRICT_FTAN` handling in the legalizer
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define
`SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN`
as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define
`FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an
intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` Map
`LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map
`Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for
`Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan`
and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` - Define a Tan128 libcall and
`ISD::FTAN` as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for
tan intrinsic

resolves https://github.com/llvm/llvm-project/issues/70082
2024-06-05 15:01:33 -04:00
Roger Ferrer Ibáñez
05e6bb40eb
[SelectionDAG] Add an ISD::CLEAR_CACHE node to lower llvm.clear_cache (#93795)
The current way of lowering `llvm.clear_cache` is a bit unusual. As
suggested by Matt Arsenault we are better off using an ISD node.

This change introduces a new `ISD::CLEAR_CACHE`, registers a new libcall
by default named `__clear_cache` and the default legalisation is a
libcall.

This is preparatory work for a custom lowering of `ISD::CLEAR_CACHE`
needed by RISC-V on some platforms.
2024-05-30 14:55:32 +02:00
Matt Arsenault
f68fdb84e1
DAG: Fix losing flags on select when expanding select_cc (#93662)
This was only preserving the flags on the setcc, not the new select.
This was missing presumably due to getSelect not having a flags argument
until recently. Avoids regressions in a future commit.
2024-05-29 22:02:27 +02:00
Matt Arsenault
1eb7f055d9
CodeGen: Fix libcall names for exp10 on the various darwins (#92520)
It's really great that we have the same information duplicated in
TargetLibraryInfo and RuntimeLibcalls which both assume everything by
default.

Should fix issue reported after #92287
2024-05-20 21:02:48 +02:00
Jay Foad
ac092925c3
[SelectionDAG] Widen cttz to cttz_zero_undef (#92514)
Instead of widening e.g. i8 cttz(x) to i16 cttz(x | 0x100), use the more
optimizable form cttz_zero_undef(x | 0x100) since the widened operand is
definitely not zero.
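
The equivalence can be checked with a small model. The helper names below are hypothetical, and the assertion stands in for the undefined result of `cttz_zero_undef` on zero.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros of a 16-bit value; undefined for zero, modelled here
// with an assertion.
unsigned cttz16ZeroUndef(uint16_t X) {
  assert(X != 0 && "zero input is undefined for cttz_zero_undef");
  unsigned N = 0;
  while ((X & 1) == 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// i8 cttz widened to i16: setting bit 8 makes the operand non-zero and caps
// the result at 8, the defined result of i8 cttz(0).
unsigned cttz8Widened(uint8_t X) {
  return cttz16ZeroUndef(static_cast<uint16_t>(X | 0x100));
}
```

Because bit 8 is always set, the assertion can never fire, which is the property that lets the widened node use the zero-undef form.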
2024-05-17 12:39:40 +01:00
Min-Yih Hsu
f8063ffe73
[VP][RISCV] Add vp.reduce.fmaximum/fminimum and its RISC-V codegen (#91782)
`vp.reduce.fmaximum/fminimum` are the VP version of
`vector.reduce.fmaximum/fminimum`.
2024-05-10 16:01:47 -07:00
Matt Arsenault
82bb2534d4
AMDGPU: Don't bitcast float typed atomic store in IR (#90116)
Implement the promotion in the DAG.

Depends #90113
2024-05-07 21:43:22 +02:00
Craig Topper
267329d7e0 [LegalizeDAG] Simplify interface to PromoteReduction. NFC
Return an SDValue instead of pushing to the Results vector. Let
the caller do the push.
2024-04-30 09:48:41 -07:00
Min-Yih Hsu
539f626ecd
[VP][RISCV] Add vp.cttz.elts intrinsic and its RISC-V codegen (#90502)
This intrinsic is the VP version of `experimental.cttz.elts`.
2024-04-30 09:27:10 -07:00
Craig Topper
705636a113
[SelectionDAG][RISCV] Move VP_REDUCE* legalization to LegalizeDAG.cpp. (#90522)
LegalizeVectorType is responsible for legalizing nodes that perform an
operation on each element and may need to be scalarized.

This is not true for nodes like VP_REDUCE.*, BUILD_VECTOR,
SHUFFLE_VECTOR, EXTRACT_SUBVECTOR, etc.

This patch drops any nodes with a scalar result from LegalizeVectorOps
and handles them in LegalizeDAG instead.

This required moving the reduction promotion to LegalizeDAG. I have
removed the support for integer promotion as it was incorrect for integer
min/max reductions. Since it was untested, it was best to assert on it
until it was really needed.

There are a couple regressions that can be fixed with a small DAG
combine which I will do as a follow up.
2024-04-29 22:44:24 -07:00