1760 Commits

Author SHA1 Message Date
Alex Wang
b33a0e6101
[SelectionDAG] Add expansion for llvm.modf intrinsic (#179434)
Targets without a `modf` libcall lower the intrinsic directly, matching
the existing `llvm.frexp` expansion. Targets with an existing libcall
are unchanged.

Fixes #173021
2026-02-04 21:25:47 +01:00
Matt Arsenault
a521774217
DAG: Use poison for unused shuffle operands in legalizer (#177578) 2026-01-23 18:20:56 +01:00
Matt Arsenault
01e6245af4
DAG: Avoid querying libcall info from TargetLowering (#176268)
Libcall lowering decisions should come from the LibcallLoweringInfo
analysis. Query this through the DAG, so eventually the source
can be the analysis. For the moment this is just a wrapper around
the TargetLowering information.
2026-01-16 09:02:49 +00:00
moorabbit
a5fa246435
[Clang] Add __builtin_stack_address (#148281)
Add support for `__builtin_stack_address` builtin. The semantics match
those of GCC's builtin with the same name.

`__builtin_stack_address` returns the starting address of the stack
region that may be used by called functions. It may or may not include
the space used for on-stack arguments passed to a callee (See [GCC
Bug/121013](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121013)).

Fixes #82632.
2026-01-12 10:01:57 +01:00
Trevor Gross
c63d2953a0
[SelectionDAG,GISel] Add f16 soft promotion for lrint, lround, llrint, and llround (#152684)
On platforms that soft promote `half`, using `lrint` intrinsics crashes
with the following:

    SoftPromoteHalfOperand Op #0: t5: i32 = lrint t4

    LLVM ERROR: Do not know how to soft promote this operator's operand!
PLEASE submit a bug report to
https://github.com/llvm/llvm-project/issues/ and include the crash
backtrace.
    Stack dump:
0. Program arguments:
/Users/tmgross/Documents/projects/llvm/llvm-build/bin/llc
-mtriple=riscv32
    1.      Running pass 'Function Pass Manager' on module '<stdin>'.
2. Running pass 'RISC-V DAG->DAG Pattern Instruction Selection' on
function '@test_lrint_ixx_f16'

Resolve this by adding a soft promotion. GISel is included since tests
cover both.

Fixes crash tests added in
https://github.com/llvm/llvm-project/pull/152662 for targets that use
`softPromoteHalfType`.

Co-authored-by: Folkert de Vries <folkert@folkertdev.nl>
2026-01-08 12:40:10 +01:00
Luke Lau
ad4bfac732
[IR] Split vector.splice into vector.splice.left and vector.splice.right (#170796)
This PR implements the first change outlined in
https://discourse.llvm.org/t/rfc-allow-non-constant-offsets-in-llvm-vector-splice/88974?u=lukel

In order to allow non-immediate offsets in the llvm.vector.splice
intrinsic, we need to separate out the "shift left" and "shift right"
modes into two separate intrinsics, which were previously determined by
whether or not the offset is positive or negative.

The description in the LangRef has also been reworded in terms of
sliding elements left or right and extracting either the upper or lower
half as opposed to extracting from a certain index, which brings it
inline with the definition of `llvm.fshr.*`/`llvm.fshl.*`.

This patch teaches AutoUpgrade.cpp to upgrade the old intrinsics into
their new equivalent one based on their offset, so existing uses of
vector.splice should still work.

Uses of llvm.vector.splice in `llvm/test/CodeGen` haven't been replaced
in this PR to keep the diff small and kick the tyres on the AutoUpgrader
a bit. I planned to do this in a follow up NFC but can include it in
this PR if reviewers prefer.

Similarly the shuffle costing kind `SK_Splice` has just been kept the
same for now, to be split into `SK_SpliceLeft` and `SK_SpliceRight`
later.
2026-01-06 15:41:26 +08:00
Ramkumar Ramachandra
9e5e267a03
[ISel] Introduce llvm.clmul intrinsic (#168731)
In line with a std proposal to introduce the llvm.clmul family of
intrinsics corresponding to carry-less multiply operations. This work
builds upon 727ee7e ([APInt] Introduce carry-less multiply primitives),
and follow-up patches will introduce custom-lowering on supported
targets, replacing target-specific clmul intrinsics.

Testing is done on the RISC-V target, which should be sufficient to
prove that the intrinsics work, since no RISC-V specific lowering has
been added.

Ref: https://isocpp.org/files/papers/P3642R3.html

Co-authored-by: Craig Topper <craig.topper@sifive.com>
2026-01-05 20:24:06 +00:00
Craig Topper
1ee3178f95
[LegalizeDAG] Remove unnecessary EVT->MVT->EVT conversion. NFC (#173707)
There doesn't appear to be any reason to use MVT here. All of the uses
expect an EVT.
2025-12-27 09:58:46 -08:00
Craig Topper
877df9e4b9
[SelectionDAG] Make SSHLSAT/USHLSAT obey getShiftAmountTy(). (#173216)
Treat these like other shift operations by allowing the shift amount to
be a different type than the result.

The PromoteIntOp_Shift and LegalizeDAG code are not tested due to lack
of target support.

I'm looking at adding SSHLSAT for the RISC-V P extension. I don't need
this support for that since RISC-V only has one legal type. I just thought it
was odd that they weren't like other shifts.
2025-12-22 10:28:04 -08:00
Craig Topper
ac6afd8e46
[LegalizeDAG] Return after replacing ISD::POISON with ISD::UNDEF. (#173173)
We already replaced the node, we shouldn't run the rest of the code that
still uses the old node.
2025-12-21 09:39:25 -08:00
Nikita Popov
edb45d8ae4 [SDAG] Allow implicit trunc in BUILD_VECTOR legalization
BUILD_VECTOR may have operands larger than the result element type,
in which case it is specified to truncate. As such, allow implicit
truncation.
2025-12-17 15:22:00 +01:00
Matt Arsenault
d8b03f282a
DAG: Use the LibcallImpl to get calling conv in ExpandDivRemLibCall (#172152) 2025-12-13 11:41:24 +00:00
Matt Arsenault
886f54a04c
DAG: Set MachinePointerInfo for stack when expanding divrem libcall (#170537) 2025-12-08 16:25:19 +01:00
Matt Arsenault
27bf5fdcc6
DAG: Add overload of getExternalSymbol using RTLIB::LibcallImpl (#170587) 2025-12-05 22:39:57 +00:00
Matt Arsenault
ac56d6ea28
DAG: Avoid using getLibcallName for function support test (#170583) 2025-12-04 17:49:43 +01:00
Matt Arsenault
90606ae295
DAG: Use poison for filler values on legalize error paths (#170556) 2025-12-03 21:42:19 +00:00
Matt Arsenault
540fd18568
DAG: Avoid using getLibcallName when looking for a divrem call (#170413)
Also introduce an error if it's not available, which is not yet
testable.
2025-12-03 13:01:21 -05:00
Matt Arsenault
c5aace4236
DAG: Move expandMultipleResultFPLibCall to TargetLowering (NFC) (#166988)
This kind of helper is higher level and not general enough to go
directly in SelectionDAG. Most similar utilities are in TargetLowering.
2025-11-12 03:50:33 +00:00
Matt Arsenault
95f2728b5c
DAG: Stop using TargetLibraryInfo for multi-result FP intrinsic codegen (#166987)
Only use RuntimeLibcallsInfo. Remove the helper functions used to
transition.
2025-11-12 02:47:28 +00:00
Matt Arsenault
821d2825a4
RuntimeLibcalls: Remove incorrect sincospi from most targets (#166982)
sincospi/sincospif/sincospil does not appear to exist on common
targets. Darwin targets have __sincospi and __sincospif, so define
and use those implementations. I have no idea what version added
those calls, so I'm just guessing it's the same conditions as
__sincos_stret.

Most of this patch is working to preserve codegen when a vector
library is explicitly enabled. This only covers sleef and armpl,
as those are the only cases tested.

The multiple result libcalls have an aberrant process where the
legalizer looks for the scalar type's libcall in RuntimeLibcalls,
and then cross references TargetLibraryInfo to find a matching
vector call. This was unworkable in the sincospi case, since the
common case is there is no scalar call available. To preserve
codegen if the call is available, first try to match a libcall
with the vector type before falling back on the old scalar search.

Eventually all of this logic should be contained in RuntimeLibcalls,
without the link to TargetLibraryInfo. In principle we should perform
the same legalization logic as for an ordinary operation, trying
to find a matching subvector type with a libcall.
2025-11-10 11:05:08 -08:00
Matt Arsenault
831e79adff
DAG: Merge all sincos_stret emission code into legalizer (#166295)
This avoids AArch64 legality rules depending on libcall
availability.

ARM, AArch64, and X86 all had custom lowering of fsincos which
all were just to emit calls to sincos_stret / sincosf_stret. This
messes with the cost heuristics around legality, because really
it's an expand/libcall cost and not a favorable custom.

This is a bit ugly, because we're emitting code trying to match the
C ABI lowered IR type for the aggregate return type. This now also
gives an easy way to lift the unhandled x86_32 darwin case, since
ARM already handled the return as sret case.
2025-11-04 10:20:00 -08:00
Matt Arsenault
28e9a2832f
DAG: Consider __sincos_stret when deciding to form fsincos (#165169) 2025-10-28 08:28:09 -07:00
Craig Topper
4cbf4408e7
[SelectionDAG] Use getShiftAmountConstant. (#158395)
Many of the shifts in LegalizeIntegerTypes.cpp were using getPointerTy.
2025-09-12 19:49:48 -07:00
Nikita Popov
01bc742185
[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817)
This ensures that the required fields are set, and also makes the
construction more convenient.
2025-08-15 18:06:07 +02:00
paperchalice
21836f4a49
[SelectionDAG] Remove UnsafeFPMath in LegalizeDAG (#146316)
These global flags hinder further improvements like [[RFC] Honor pragmas
with
-ffp-contract=fast](https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast)
and pass concurrency support. Remove them incrementally.
2025-07-29 08:41:21 +08:00
Craig Topper
8d549cf036
[SelectionDAG] Pass SDNodeFlags through getNode instead of setFlags. (#149852)
getNode updates flags correctly for CSE. Calling setFlags after getNode
may set the flags where they don't apply.

I've added a Flags argument to getSelectCC and the signature of getNode that takes
an ArrayRef of EVTs.
2025-07-22 08:06:30 -07:00
Matt Arsenault
7299250c03
DAG: Use fast variants of fast math libcalls (#147481)
Hexagon currently has an untested global flag to control fast
math variants of libcalls. Add fast variants as explicit libcall
options so this can be a flag based lowering decision, and implement
it. I have no idea what fast math flags the hexagon case requires,
so I picked the maximally potentially relevant set of flags although
this probably is refinable per call. Looking in compiler-rt, I'm not
sure if the fast variants are anything more than aliases.
2025-07-13 10:41:45 +09:00
Matt Arsenault
d3d4066409
DAG: Remove dead declaration of ExpandSinCosLibCall (#147673) 2025-07-09 18:47:49 +09:00
Dominik Steenken
acdf1c7526
[DAG] Add generic expansion for ISD::FCANONICALIZE nodes (#142105)
This PR takes the work previously done by @pawan-nirpal-031 on X86 in
#106370, and makes it available in common code. This should enable all
targets to use `__builtin_canonicalize` for all `f(16|32|64|128)` data
types.

Canonicalization is implemented here as multiplication by `1.0`, as
suggested in [the
docs](https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic).
2025-07-08 16:12:17 +01:00
Kazu Hirata
60cd76bc34
[CodeGen] Construct SmallVector with ArrayRef (NFC) (#143391) 2025-06-09 12:45:52 -07:00
Matt Arsenault
b2266d6d79
RuntimeLibcalls: Rename fminimum_num/fmaximum_num enums (#143078)
Add the underscore to match the libm spelling
2025-06-06 16:23:26 +09:00
Nikita Popov
d74831efeb Revert "[SDAG] Fix fmaximum legalization errors (#142170)"
This reverts commit 58cc1675ec7b4aa5bc2dab56180cb7af1b23ade5.

I also made the incorrect assumption that we know both values are
+/-0.0 here as well. Revert for now.
2025-06-04 14:35:30 +02:00
Luke Lau
9a2d4d176a
[SelectionDAG][AArch64] Legalize power of 2 vector.[de]interleaveN (#141513)
After https://github.com/llvm/llvm-project/pull/139893, we now have
[de]interleave intrinsics for factors 2-8 inclusive, with the plan to
eventually get the loop vectorizer to emit a single intrinsic for these
factors instead of recursively deinterleaving (to support scalable
non-power-of-2 factors and to remove the complexity in the interleaved
access pass).

AArch64 currently supports scalable interleaved groups of factors 2 and
4 from the loop vectorizer. For factor 4 this is currently emitted as a
series of recursive [de]interleaves, and normally converted to a target
intrinsic in the interleaved access pass.

However if for some reason the interleaved access pass doesn't catch it,
the [de]interleave4 intrinsic will need to be lowered by the backend.

This patch legalizes the node and any other power-of-2 factor to smaller
factors, so if a target can lower [de]interleave2 it should be able to
handle this without crashing.

Factor 3 will probably be more complicated to lower so I've left it out
for now. We can disable it in the AArch64 cost model when implementing
the loop vectorizer changes.
2025-06-03 12:05:44 +01:00
Nikita Popov
58cc1675ec
[SDAG] Fix fmaximum legalization errors (#142170)
FMAXIMUM is currently legalized via IS_FPCLASS for the signed zero
handling. This is problematic, because it assumes the equivalent integer
type is legal. Many targets have legal fp128, but illegal i128, so this
results in legalization failures.

Fix this by replacing IS_FPCLASS with checking the bitcast to integer
instead. In that case it is sufficient to use any legal integer type, as
we're just interested in the sign bit. This can be obtained via a stack
temporary cast. There is existing FloatSignAsInt functionality used for
legalization of FABS and similar we can use for this purpose.

Fixes https://github.com/llvm/llvm-project/issues/139380.
Fixes https://github.com/llvm/llvm-project/issues/139381.
Fixes https://github.com/llvm/llvm-project/issues/140445.
2025-06-02 10:14:33 +02:00
Craig Topper
dcd62f3674
[SelectionDAG] Rename MemSDNode::getOriginalAlign to getBaseAlign. NFC (#139930)
This matches the underlying function in MachineMemOperand and how it is
printed when BaseAlign differs from Align.
2025-05-16 09:37:02 -07:00
Sergei Barannikov
11a3de7e98
[SDag][ARM][RISCV] Allow lowering CTPOP into a libcall (#101786)
This is a reland of #99752 with the bug fixed (see test diff in the
third commit in this PR).
All `popcount` libcalls return `int`, but `ISD::CTPOP` returns the type
of the argument, which can be wider than `int`. The fix is to make DAG
legalizer pass the correct return type to `makeLibCall` and sign-extend
the result afterwards.

Original commit message:
The main change is adding CTPOP to `RuntimeLibcalls.def` to allow
targets to use LibCall action for CTPOP. DAG legalizers are changed
accordingly.

Pull Request: https://github.com/llvm/llvm-project/pull/101786
2025-04-23 12:43:05 +03:00
Jonas Paulsson
e19fcb72d7
Fix 'unannotated fall-through between switch labels' warning. (#136000) 2025-04-16 20:25:10 +02:00
Jonas Paulsson
6d03f51f0c
[SystemZ] Add support for 16-bit floating point. (#109164)
- _Float16 is now accepted by Clang.

- The half IR type is fully handled by the backend.

- These values are passed in FP registers and converted to/from float around
  each operation.

- Compiler-rt conversion functions are now built for s390x including the missing
  extendhfdf2 which was added.

Fixes #50374
2025-04-16 20:02:56 +02:00
zhijian lin
378ac572ac
Reland "[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR." (#135056)
A new ISD::POISON SDNode is introduced to represent the poison value in
the IR, replacing the previous use of ISD::UNDEF
2025-04-10 11:29:14 -04:00
Jakub Kuderski
ef1088f703
Revert "[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR." (#135060)
Reverts llvm/llvm-project#125883

This PR causes crashes in RISC-V codegen around f16/f64 poison values:
https://github.com/llvm/llvm-project/pull/125883#issuecomment-2787048206
2025-04-09 14:40:56 -04:00
zhijian lin
8fddef8483
[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR. (#125883)
A new ISD::POISON SDNode is introduced to represent the `poison value`
in the IR, replacing the previous use of ISD::UNDEF.
2025-04-07 10:03:05 -04:00
Ethan Kaji
a629b50575
Port NVPTXTargetLowering::LowerCONCAT_VECTORS to SelectionDAG (#120030)
Ports `NVPTXTargetLowering::LowerCONCAT_VECTORS` to
`llvm/lib/CodeGen/SelectionDAG` as requested in
https://github.com/llvm/llvm-project/issues/116695.
2025-03-27 07:40:35 +07:00
LU-JOHN
70aeb89094
Calculate KnownBits from Metadata correctly for vector loads (#128908)
Calculate KnownBits correctly from metadata for vector loads.

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-03-25 22:46:30 +07:00
Alex MacLean
ec941a4a04
[NVPTX] Legalize ctpop and ctlz in operation legalization (#130668)
By pulling the truncates and extensions out of operations during
operation legalization we enable more optimization via DAGCombiner.
While the test cases show only cosmetic improvements (unlikely to impact
the final SASS) in real programs the exposure of these truncates can
allow for more optimization.
2025-03-12 09:28:40 -07:00
John Brawn
fb0891387a
[SelectionDAG] Clean up some redundant setting of node flags (NFC) (#130307)
PR #130124 added a use of FlagInserter to the start of
SelectionDAGLegalize::PromoteNode, making some of the places where we
set flags be redundant, so remove them. The places where the setting of
flags remains are in non-floating-point operations.
2025-03-07 17:51:13 +00:00
John Brawn
d19218e507
[SelectionDAG] Preserve fast math flags when legalizing/promoting (#130124)
When we have a floating-point operation that a target doesn't support
for a given type, but does support for a wider type, then there are two
ways this can be handled:

* If the target doesn't have any registers at all of this type then
LegalizeTypes will convert the operation.

* If we do have registers but no operation for this type, then the
operation action will be Promote and it's handled in PromoteNode.

In both cases the operation at the wider type, and the conversion
operations to and from that type, should have the same fast math flags
as the original operation.

This is being done in preparation for a DAGCombine patch which makes use
of these fast math flags.
2025-03-07 14:46:32 +00:00
Jim Lin
94f6b6d538
[SelectionDAG][RISCV] Promote VECREDUCE_{FMAX,FMIN,FMAXIMUM,FMINIMUM} (#128800)
This patch also adds the tests for VP_REDUCE_{FMAX,FMIN,FMAXIMUM,FMINIMUM}, which have been supported for a while.
2025-02-28 23:13:30 +08:00
Benjamin Maxwell
19556eccf6
[RTLIB] Rename getFSINCOS() to getSINCOS (NFC) (#126705)
This makes the name more consistent with the other helpers.
2025-02-11 11:51:35 +00:00
Benjamin Maxwell
701223ac20
[IR] Add llvm.sincospi intrinsic (#125873)
This adds the `llvm.sincospi` intrinsic, legalization, and lowering
(mostly reusing the lowering for sincos and frexp).

The `llvm.sincospi` intrinsic takes a floating-point value and returns
both the sine and cosine of the value multiplied by pi. It computes the
result more accurately than the naive approach of doing the
multiplication ahead of time, especially for large input values.

```
declare { float, float }          @llvm.sincospi.f32(float  %Val)
declare { double, double }        @llvm.sincospi.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincospi.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincospi.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincospi.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincospi.v4f32(<4 x float>  %Val)
```

Currently, the default lowering of this intrinsic relies on the
`sincospi[f|l]` functions being available in the target's runtime (e.g.
libc).
2025-02-11 09:01:30 +00:00
Benjamin Maxwell
4bf97aa818
[IR] Add llvm.modf intrinsic (#121948)
This adds the `llvm.modf` intrinsic, legalization, and lowering (mostly
reusing the lowering for sincos and frexp).

The `llvm.modf` intrinsic takes a floating-point value and returns both
the integral and fractional parts (as a struct).

```
declare { float, float }             @llvm.modf.f32(float  %Val)
declare { double, double }           @llvm.modf.f64(double %Val)
declare { x86_fp80, x86_fp80 }       @llvm.modf.f80(x86_fp80  %Val)
declare { fp128, fp128 }             @llvm.modf.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }     @llvm.modf.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float>  %Val)
```

This corresponds to the libm `modf` function but returns multiple values
in a struct (rather than take output pointers), which makes it easier to
vectorize.
2025-02-07 09:25:13 +00:00