669 Commits

Author SHA1 Message Date
Kazu Hirata
e3a3f78f35
[CodeGen] Use llvm::append_range (NFC) (#133603) 2025-03-29 16:53:02 -07:00
Tim Gymnich
1d0005a69a
[GlobalISel][NFC] Rename GISelKnownBits to GISelValueTracking (#133466)
- rename `GISelKnownBits` to `GISelValueTracking` to analyze more than
just `KnownBits` in the future
2025-03-29 11:51:29 +01:00
Kazu Hirata
f3e8e80563
[llvm] Construct SmallVector with ArrayRef (NFC) (#132560) 2025-03-22 13:11:31 -07:00
David Green
53a395fda3
[AArch64][GlobalISel] Legalize more CTPOP vector types. (#131513)
Similar to other operations, s8, s16 s32 and s64 vector elements are
clamped to legal vector sizes, odd number of elements are widened to the
next power-2 and s128 is scalarized.

This helps legalize cttz as well as ctpop.
2025-03-20 07:21:01 +00:00
David Green
b0876994eb
[AArch64][GlobalISel] Clean up CTLZ vector type legalization. (#131514)
Similar to other operations, s8, s16 and s32 vector elements are clamped
to legal vector sizes, but in this case s64 are scalarized to use the
gpr instructions. This allows vector types to split as opposed to
scalarizing.
2025-03-19 19:28:36 +00:00
David Green
bd1be8a242
[CodeGen][GlobalISel] Add a getVectorIdxWidth and getVectorIdxLLT. (#131526)
From #106446, this adds a variant of getVectorIdxTy that returns an LLT.
Many uses only look at the width, so a getVectorIdxWidth was added as
the common base.
2025-03-18 08:31:11 +00:00
Craig Topper
caa798cb1e [GlobalISel] Use Register. NFC 2025-03-02 23:46:18 -08:00
David Green
70ed381b16
[GlobalISel][AArch64] Fix fptoi.sat lowering. (#127901)
The SDAG version uses fminnum/fmaxnum, in converting it to fcmp+select
it appears the order of the operands was chosen badly. This switches the
conditions used to keep the constant on the RHS.
2025-02-20 12:22:11 +00:00
Matt Arsenault
37c341df28 Revert "AMDGPU: Don't canonicalize fminnum/fmaxnum if targets support IEEE fminimum(maximum)_num (#127711)"
This reverts commit 36eaf0daf5d6dd665d7c7a9ec38ea22f27709fed.

This is not a sound approach to dealing with this instruction change.
The new behavior is a different opcode pair, not a modifier on the
existing opcode.
2025-02-20 10:19:14 +07:00
Changpeng Fang
36eaf0daf5
AMDGPU: Don't canonicalize fminnum/fmaxnum if targets support IEEE fminimum(maximum)_num (#127711)
For targets that support IEEE fminimum_num/fmaximum_num, the
corresponding *_min_num_fXY/*_max_num_fXY instructions themselves
already did the canonicalization for the inputs. As a result, we do not
need to explicitly canonicalize the inputs for fminnum/fmaxnum.
2025-02-19 11:16:43 -08:00
JaydeepChauhan14
b693e1cf83
[X86][GlobalISel] Enable G_LROUND/G_LLROUND with libcall mapping (#125096) 2025-02-03 14:12:43 +07:00
David Green
070e129304
[AArch64][GlobalISel] Add disjoint handling for add_and_or_is_add. (#123594)
This allows us to easily detect, without known-bits, that the or in a
fshl/fshr is disjoint allowing us to use usra under aarch64.
2025-02-02 21:01:49 +00:00
David Green
ac7c199a63
[AArch64][GlobalISel] Legalize more G_VECREDUCE_ADD operations. (#123392)
Non-power-2 vectors will now be padded with zero elements, smaller
vectors will be widened using anyext, which I believe will be better in
many situations than padding with zeros, although some small types may
prefer being scalarized depending on the code. Padding with zeros may
not be best for all sizes (v5i8 being the worst), we can hopefully
improve that in the future but they no longer fall back. We scalarize
other types like i128.
2025-01-30 22:17:34 +00:00
Amara Emerson
2d53eaff4a
[AArch64][GlobalISel] Fix legalization for <4 x i1> vector stores.
This case is different from the earlier <8 x i1> case handled because it triggers
a legalization failure in lowerStore() that's intended for scalar code.

It also was triggering incorrect bitcast actions in the AArch64 rules that weren't
expecting truncating stores.

With these two fixed, more cases are handled. The code is still bad, including
some missing load promotion in our combiners that result in dead stores hanging
around at the end of codegen. Again, we can fix these in separate changes.

Reviewers: davemgreen, madhur13490, topperc, arsenm

Reviewed By: davemgreen

Pull Request: https://github.com/llvm/llvm-project/pull/121185
2025-01-06 10:22:48 -08:00
Amara Emerson
6b0807fe2b
[AArch64][GlobalISel] Add support for lowering trunc stores of vector bools.
This is essentially a port of TargetLowering::scalarizeVectorStore(), which
is used for the case where we have something like a store of <8 x s8> truncating
to <8 x s1> in memory. The naive lowering is a sequence of extracts to compute
a scalar value to store.

AArch64's DAG implementation has some more smarts to improve this further which
we can do later.

Reviewers: topperc, davemgreen

Pull Request: https://github.com/llvm/llvm-project/pull/121169
2025-01-06 10:21:42 -08:00
Amara Emerson
41ebbed280
[AArch64][GlobalISel] Legalize vector boolean bitcasts to scalars by lowering via stack.
Reviewers: davemgreen, topperc, arsenm

Reviewed By: arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/121171
2025-01-05 21:32:27 -08:00
Amara Emerson
7e3180a2c2
[AArch64][GlobalISel] Add support for widening vector store elements to s8.
Reviewers: topperc, arsenm, davemgreen

Reviewed By: arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/121170
2025-01-05 21:31:34 -08:00
Craig Topper
54dac27c57
[GISel][RISCV] Use isSExtCheaperThanZExt when widening G_UMAX/G_UMIN. (#120041)
Similar to what we do for unsigned comparisons after #120032.
2024-12-15 23:16:58 -08:00
Craig Topper
115872902b
[GISel][RISCV] Use isSExtCheaperThanZExt when widening G_ICMP. (#120032)
Sign extending i32->i64 is more efficient than zero extend for RV64.
2024-12-15 22:55:58 -08:00
Craig Topper
de1a423c23
[GISel][RISCV][AArch64] Support legalizing G_SCMP/G_UCMP to sub(isgt,islt). (#119265)
Convert the LLT to EVT and call
TargetLowering::shouldExpandCmpUsingSelects to determine if we should do
this.

We don't have a getSetccResultType, so I'm boolean extending the
compares to the result type and using that. If the compares legalize to
the same type, these extends will get removed. Unfortunately, if the
compares legalize to a different type, we end up with truncates or
extends that might not be optimally placed.
2024-12-15 20:47:17 -08:00
David Green
4c8c130847
[AArch64][GlobalISel] Scalarize i128 shufflevector instructions. (#119980)
This, like other operations, scalarizes shuffle vector operations with
types larger than 64bits. ImplicitDef and Freeze are also handled the
same way, to allow them to legalize. The legalization of
fewerElementsVectorShuffle is adjusted to handled scalarization.
2024-12-15 10:44:40 +00:00
Craig Topper
7ece560a50
[GISel] Support narrowing G_ICMP with more than 2 parts. (#119335)
This allows us to support i128 G_ICMP on RV32. I'm not sure how to test
the "left over" part of this as RISC-V always widens to a power of 2
before narrowing.
2024-12-12 09:50:26 -08:00
Tim Gymnich
2db2dc8ab9
[GlobalISel][NFC] Fix LLT Propagation (#119587)
Retain LLT type information by creating new LLTs from the original LLT
instead of only using the original scalar size.

This PR prepares for the [LLT FPInfo
RFC](https://discourse.llvm.org/t/rfc-globalisel-adding-fp-type-information-to-llt/83349/24)
where LLTs will carry additional floating point type information in
addition to the scalar size.
2024-12-12 09:47:46 -08:00
Craig Topper
5797ed660a
[GISel][SDAG] Avoid push_back in loops for some shuffle mask handling. (#119434)
Each call to push_back contains a check to see if the vector needs to
grow. Using resize or giving the size to the constructor can reduce
the number of checks for growing.
2024-12-10 22:18:46 -08:00
Craig Topper
e3284d8cc7
[GISel] Use SmallVector::append instead of copying one element at a time. (#119321) 2024-12-10 07:18:20 -08:00
Craig Topper
7c12418021
[GISel] Avoid creating a virtual register we don't need. (#119305)
narrowScalarAddSub was creating a virtual register and then overwriting
the Register variable without using it. Add an else and only create it
when needed.
2024-12-09 20:23:24 -08:00
Craig Topper
4cf2cf18c9
[RISCV][GISel] Stop over promoting G_SITOFP/UITOFP libcalls on RV64. (#118597)
When we have legal instructions we want to promote to sXLen and let isel
pattern matching removing the and/sext_inreg.

When using a libcall we want to use a 'si' libcall for small types
instead of 'di'. To match the RV64 ABI, we need to sign extend `unsigned
int` arguments. We reuse the shouldSignExtendTypeInLibCall hook from
SelectionDAG.
2024-12-04 10:42:49 -08:00
Craig Topper
a15400d05d
[RISCV][GISel] Support f32/f64 ldexp. (#117941)
The existing libcall lowering in LegalizerHelper.cpp did not account
for one operand being integer. Reuse the G_FPOWI code to fix this.
2024-12-02 13:30:46 -08:00
Craig Topper
bee33b5291
[RISCV][GISel] Support f32/f64 powi. (#117937)
Need to force libcall legalization to treat the integer argument as
signed so that it can be promoted to XLen in call lowering for RV64.
Alternatively we could promote the operand before converting to libcall,
but going through call lowering is closer to what SelectionDAG does.
2024-12-02 09:06:38 -08:00
Craig Topper
43b6b78771
[RISCV][GISel] Use libcalls for f32/f64 G_FCMP without F/D extensions. (#117660)
LegalizerHelp only supported f128 libcalls and incorrectly assumed that
the destination register for the G_FCMP was s32.
2024-11-26 15:48:49 -08:00
Craig Topper
ebcaa57715
[GISel] #undef macros when they are no longer needed. NFC (#117652)
These macros are created inside a function. They should be undefined
before the end of the function.
2024-11-25 18:00:03 -08:00
David Green
d3ce069572
[AArch64][GlobalISel] Legalize ptr shuffle vector to s64 (#116013)
This converts all ptr element shuffle vectors to s64, so that the
existing vector legalization handling can lower them as needed. This
prevents a lot of fallbacks that currently try to generate things like
`<2 x ptr> G_EXT`.

I'm not sure if bitcast/inttoptr/ptrtoint is intended to be necessary
for vectors of pointers, but it uses buildCast for the casts, which now
generates a ptrtoint/inttoptr.
2024-11-23 17:00:51 +00:00
Tex Riddell
c03d09ce3e
[aarch64] atan2 intrinsic lowering (p5) (#112611)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `VecFuncs.def`: define intrinsic to sleef/armpl mapping
- `LegalizerHelper.cpp`: add missing fewerElementsVector handling for
the new atan2 intrinsic
- `AArch64ISelLowering.cpp`: Add arch64 specializations for lowering
like neon instructions
- `AArch64LegalizerInfo.cpp`: Legalize atan2.

Part 5 for Implement the atan2 HLSL Function #70096.
2024-10-24 17:53:12 -07:00
Michael Maitland
6bac41496e
[RISCV][GISEL] Legalize G_INSERT_SUBVECTOR (#108859)
This code is heavily based on the SelectionDAG lowerINSERT_SUBVECTOR
code.
2024-10-21 08:49:13 -04:00
Michael Maitland
f957d080e9
[RISCV][GISEL] Legalize G_EXTRACT_SUBVECTOR (#109426)
This is heavily based on the SelectionDAG lowerEXTRACT_SUBVECTOR code.
2024-10-01 14:08:49 -04:00
David Green
9f255d863f
[AArch64][GlobalISel] Lower fp16 abs and neg without fullfp16. (#110096)
This changes the existing promote logic to lower, so that it can use
normal integer operations. A minor change was needed to fneg lower code
to handle vectors.
2024-09-27 07:43:58 +01:00
Evgenii Kudriashov
e9cb44090f
[X86][GlobalISel] Enable scalar versions of G_UITOFP and G_FPTOUI (#100079)
Also add tests for G_SITOFP and G_FPTOSI
2024-09-25 16:15:36 +02:00
Craig Topper
d5d1417659
[RISCV][GISel] Use libcalls for rint, nearbyint, trunc, round, and roundeven intrinsics. (#108779) 2024-09-18 12:07:44 -07:00
David Green
feac761f37
[GlobalISel][AArch64] Add G_FPTOSI_SAT/G_FPTOUI_SAT (#96297)
This is an implementation of the saturating fp to int conversions for
GlobalISel. On AArch64 the converstion instrctions work this way,
producing saturating results. LegalizerHelper::lowerFPTOINT_SAT is
ported from SDAG.

AArch64 has a lot of existing tests for fptosi_sat, covering a wide
range of types. I have tried to make most of them work all at once, but
a few fall back due to other missing features such as f128 handling for
min/max.
2024-09-16 10:33:59 +01:00
Him188
0748f4227c
[AArch64][GlobalISel] Legalize 128-bit types for FABS (#104753)
This patch adds a common lower action for `G_FABS`, which generates `and
x8, x8, #0x7fffffffffffffff` to reset the sign bit. The action does not
support vectors since `G_AND` does not support fp128.


This approach is different than what SDAG is doing. SDAG stores the
value onto stack, clears the sign bit in the most significant byte, and
loads the value back into register. This involves multiple memory ops
and sounds slower.
2024-09-03 12:47:26 +01:00
Sergei Barannikov
4d7a0abae8
[DataLayout] Change return type of getStackAlignment to MaybeAlign (#105478)
Currently, `getStackAlignment` asserts if the stack alignment wasn't
specified. This makes it inconvenient to use and complicates testing.

This change also makes `exceedsNaturalStackAlignment` method redundant.
2024-08-27 22:59:33 +03:00
Sumanth Gundapaneni
e78156a0e2
Scalarize the vector inputs to llvm.lround intrinsic by default. (#101054)
Verifier is updated in a different patch to let the vector types for
llvm.lround and llvm.llround intrinsics.
2024-08-21 12:13:56 -05:00
David Green
10fe531d6c [GlobalISel] Add and use an Opcode variable and update match-table-cxx.td checks. NFC 2024-08-18 11:08:49 +01:00
Him188
ba461f8c62
[AArch64][GlobalISel] Legalize fp128 types as libcalls for G_FCMP (#98452)
- Generate libcall for supported predicates.
- Generate unsupported predicates as combinations of supported
predicates.
- Vectors are scalarized, however some cases like `v3f128_fp128` are still failing, because we failed to legalize G_OR for these types.

GISel now generates the same code as SDAG, however, note the difference
in the `one` case.
2024-07-25 11:07:31 +01:00
Sumanth Gundapaneni
0ee32c4573
[AMDGPU] Implement llvm.lrint intrinsic lowering (#98931)
This patch enabled the target-independent lowering of llvm.lrint via
GlobalISel.
For SelectionDAG, the instrinsic is custom lowered for AMDGPU.
2024-07-24 23:34:31 +04:00
Sumanth Gundapaneni
fc832d5349
[AMDGPU] Implement llvm.lround intrinsic lowering. (#98970)
This patch enables the target-independent lowering of llvm.lround via
GlobalISel. For SelectionDAG, the instrinsic is custom lowered for
AMDGPU. In order to support vector floating point input for llvm.lround,
this patch extends the target independent APIs and provide support for
scalarizing. pr98950 is needed to let verifier allow vector floating
point types
2024-07-23 20:34:34 +04:00
Thorsten Schütt
2d2d6853cf
[GlobalIsel][AArch64] Legalize G_SCMP and G_UCMP (#99820)
https://github.com/llvm/llvm-project/pull/91871
https://github.com/llvm/llvm-project/pull/98774
2024-07-23 10:12:28 +02:00
Joseph Huber
615b7eeaa9 Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.

I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
2024-07-20 09:29:31 -05:00
NAKAMURA Takumi
740161a9b9 Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69.
(llvmorg-19-init-17714-gc05126bdfc3b)
See #99610
2024-07-20 12:36:57 +09:00
Farzon Lotfi
e2f463b5b6
[aarch64] Add hyperbolic and arc trig intrinsic lowering (#98937)
## The change(s)
- `VecFuncs.def`: define intrinsic to  sleef/armpl mapping
- `LegalizerHelper.cpp`: add missing `fewerElementsVector` handling for
the new trig intrinsics
- `AArch64ISelLowering.cpp`: Add arch64 specializations for lowering
like neon instructions
- `AArch64LegalizerInfo.cpp`: Legalize the new trig intrinsics. aarch64
has specail legalization requirments in `AArch64LegalizerInfo.cpp`. If
we redirect the clang builtin without handling this we will break the
aarch64 compiler

## History
This change is part of an implementation of
https://github.com/llvm/llvm-project/issues/87367's investigation on
supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`,
`sinh`, and `tanh`.

https://github.com/llvm/llvm-project/issues/70079
https://github.com/llvm/llvm-project/issues/70080
https://github.com/llvm/llvm-project/issues/70081
https://github.com/llvm/llvm-project/issues/70083
https://github.com/llvm/llvm-project/issues/70084
https://github.com/llvm/llvm-project/issues/95966

## Why is aarch64 needed
The last step is to redirect the `acos`, `asin`, `atan`, `cosh`, `sinh`,
and `tanh` to emit the intrinsic. We can't emit the intrinsic without
the intrinsics becoming legal for aarch64 in `AArch64LegalizerInfo.cpp`
2024-07-19 10:18:23 -04:00