5708 Commits

Author SHA1 Message Date
Brandon Wu
4a7dbede6b
[RISCV] Support svukte extension (#115657)
This is the extension for "Address-Independent Latency of User-Mode
Faults to Supervisor Addresses".
Spec: https://github.com/riscv/riscv-isa-manual/pull/1564,
https://lf-riscv.atlassian.net/browse/RVS-2977
The spec states that `svukte` depends on `sv39`, but we don't have
`sv39` yet, so I didn't add it to the implied list.
2024-11-27 10:54:57 +08:00
Craig Topper
43b6b78771
[RISCV][GISel] Use libcalls for f32/f64 G_FCMP without F/D extensions. (#117660)
LegalizerHelper only supported f128 libcalls and incorrectly assumed that
the destination register for the G_FCMP was s32.
2024-11-26 15:48:49 -08:00
Mark Goncharov
80df56e03b
Reapply "[RISCV] Implement tail call optimization in machine outliner" (#117700)
This MR fixes the failing test `CodeGen/RISCV/compress-opt-select.ll`.

The test failed because of the previously merged commit `[TTI][RISCV]
Unconditionally break critical edges to sink ADDI (PR #108889)`.

The `compress-opt-select` test has therefore been regenerated.
2024-11-26 23:39:45 +08:00
Mehdi Amini
f94bd3c933
Revert "[RISCV] Implement tail call optimization in machine outliner" (#117710)
Reverts llvm/llvm-project#115297
Bots are broken
2024-11-26 13:45:47 +01:00
Mark Goncharov
29062329f3
[RISCV] Implement tail call optimization in machine outliner (#115297)
Following up on issue #89822, this patch adds the ability to use tail
calls in the machine outliner pass.
It also enables outlining patterns that use the X5 (T0) register.
2024-11-26 12:30:37 +03:00
LiqinWeng
c3377af4c3
[RISCV][CostModel] add cost for cttz/ctlz under the non-zvbb (#117515)
2024-11-26 11:40:52 +08:00
Philip Reames
6657d4bd70
[TTI][RISCV] Unconditionally break critical edges to sink ADDI (#108889)
This looks like a rather weird change, so let me explain why this isn't
as unreasonable as it looks. Let's start with the problem it's solving.

```
define signext i32 @overlap_live_ranges(ptr %arg, i32 signext %arg1) {
bb:
  %i = icmp eq i32 %arg1, 1
  br i1 %i, label %bb2, label %bb5

bb2:                                              ; preds = %bb
  %i3 = getelementptr inbounds nuw i8, ptr %arg, i64 4
  %i4 = load i32, ptr %i3, align 4
  br label %bb5

bb5:                                              ; preds = %bb2, %bb
  %i6 = phi i32 [ %i4, %bb2 ], [ 13, %bb ]
  ret i32 %i6
}
```

Right now, we codegen this as:

```
	li	a3, 1
	li	a2, 13
	bne	a1, a3, .LBB0_2
	lw	a2, 4(a0)
.LBB0_2:
	mv	a0, a2
	ret
```

In this example, we have two values which must be assigned to a0 per the
ABI (%arg, and the return value). SelectionDAG ensures that all values
used in a successor phi are defined before exiting the predecessor block.
This creates an ADDI to materialize the immediate in the entry block.

Currently, this ADDI is not sunk into the tail block because we'd have
to split a critical edge to do so. Note that if our immediate was
anything large enough to require two instructions we *would* split this
critical edge.

Looking at other targets, we notice that they don't seem to have this
problem. They perform the sinking, and tail duplication that we don't.
Why? Well, it turns out for AArch64 that this is entirely an accident of
the existence of the gpr32all register class. The immediate is
materialized into the gpr32 class, and then copied into the gpr32all
register class. The existence of that copy puts us right back into the
two instruction case noted above.

This change essentially just bypasses the emergent aspect of the AArch64
behavior, and implements the same "always sink immediates" behavior for
RISCV as well.
2024-11-25 18:59:31 -08:00
Pengcheng Wang
6633916ef5
[RISCV] Remove getPostRAMutations (#117527)
We are using `PostMachineScheduler` instead of `PostRAScheduler`
since #68696.

The hook `getPostRAMutations` is only used in `PostRAScheduler` so
it is actually dead code for RISC-V now.
2024-11-26 10:55:43 +08:00
LiqinWeng
dd7aabf7c0
[TTI][RISCV] Deduplicate type-based VP costing of vpcmp/vpcast (#117520)
Refers to: https://github.com/llvm/llvm-project/pull/115983
2024-11-26 10:49:24 +08:00
Craig Topper
c2bb056482
[SelectionDAG][RISCV][AArch64] Allow f16 STRICT_FLDEXP to be promoted. Fix integer promotion of STRICT_FLDEXP in type legalizer. (#117633)
A special case in type legalization wasn't accounting for different
operand numbering between FLDEXP and STRICT_FLDEXP.

AArch64 already asked STRICT_FLDEXP to be promoted, but had no test for
it.
2024-11-25 16:12:45 -08:00
Kazu Hirata
8e510b8472
[RISCV] Fix a warning
This patch fixes:

  llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp:476:25: error: unused
  variable 'ST' [-Werror,-Wunused-variable]
2024-11-25 12:57:05 -08:00
Philip Reames
d733fa1c90
[RISCV] Consolidate VLS codepaths in stack frame manipulation [nfc] (#117605)
We can move the logic from adjustStackForRVV into adjustReg, which
results in the remaining logic being trivially inlined to the two
callers and allows a duplicate copy of the same logic in
eliminateFrameIndex to be pruned.
2024-11-25 12:40:37 -08:00
Craig Topper
ed6749a405
[RISCV] Promote frexp with Zfh.
The default expansion tries to create an illegal integer type after
legalization.
2024-11-25 10:27:37 -08:00
Craig Topper
29828b26fa
[RISCV] Fix double counting scalar CSRs with Zcmp when emitting cfi_offset for RVV CSRs. (#117408)
getCalleeSavedStackSize() already contains RVPushStackSize. Don't
subtract it again.
2024-11-25 10:03:48 -08:00
Raphael Moreira Zinsly
d88ed9357a
[NFC][RISCV] Refactor allocation of the stack space (#116625)
Separates the stack allocations from prologue in preparation for the
stack clash protection support.
2024-11-25 09:36:15 -08:00
Craig Topper
20bd029a40
[RISCV] Promote fldexp with Zfh. (#117396)
The default expansion tries to create i16 operations after type
legalization.

Fixes #117349
2024-11-25 09:08:56 -08:00
David Sherwood
9b76e7fc60
Revert "[DAGCombiner] Add support for scalarising extracts of a vector setcc (#116031)" (#117556)
This reverts commit 22ec44f509ff266b581dbb490d7b040473b7c31a.
2024-11-25 13:49:21 +00:00
Luke Lau
15fadeb2aa
[RISCV] Add cost for @llvm.experimental.vp.splat (#117313)
This is split off from #115274. There doesn't seem to be an easy way to
share this with getShuffleCost since that requires passing in a real
insert_element operand to get it to recognise it's a scalar splat.

For i1 vectors we can't currently lower them so it returns an invalid
cost.

---------

Co-authored-by: Shih-Po Hung <shihpo.hung@sifive.com>
2024-11-25 11:28:46 +01:00
LiqinWeng
db14010405
[RISCV][TTI] Implement cost of intrinsic abs with LMUL (#115813)
2024-11-25 17:35:58 +08:00
David Sherwood
22ec44f509
[DAGCombiner] Add support for scalarising extracts of a vector setcc (#116031)
For IR like this:

  %icmp = icmp ult <4 x i32> %a, splat (i32 5)
  %res = extractelement <4 x i1> %icmp, i32 1

where there is only one use of %icmp we can take a similar approach
to what we already do for binary ops such as add, sub, etc. and convert
this into

  %ext = extractelement <4 x i32> %a, i32 1
  %res = icmp ult i32 %ext, 5

For AArch64 targets at least the scalar boolean result will almost
certainly need to be in a GPR anyway, since it will probably be
used by branches for control flow. I've tried to reuse existing code
in scalarizeExtractedBinop to also work for setcc.

NOTE: The optimisations don't apply for tests such as
extract_icmp_v4i32_splat_rhs in the file

CodeGen/AArch64/extract-vector-cmp.ll

because scalarizeExtractedBinOp only works if one of the input
operands is a constant.
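
As a rough illustration of why the rewrite is sound (a standalone sketch, not DAGCombiner code), the two forms can be modelled on a plain four-lane array. The type alias `V4` and both function names are made up for this example:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4 = std::array<uint32_t, 4>; // stand-in for <4 x i32>

// Original form: compare every lane against the splat value, then extract
// one lane of the boolean result.
inline bool compareThenExtract(const V4 &A, uint32_t Splat, std::size_t Lane) {
  std::array<bool, 4> Cmp{};
  for (std::size_t I = 0; I < A.size(); ++I)
    Cmp[I] = A[I] < Splat; // unsigned less-than, like icmp ult
  return Cmp[Lane];
}

// Scalarised form: extract the lane first, then do one scalar compare.
inline bool extractThenCompare(const V4 &A, uint32_t Splat, std::size_t Lane) {
  return A[Lane] < Splat;
}

// Both forms agree on every lane of a sample input.
inline void checkScalarisedSetcc() {
  const V4 A = {{1, 7, 5, 4}};
  for (std::size_t Lane = 0; Lane < A.size(); ++Lane)
    assert(compareThenExtract(A, 5, Lane) == extractThenCompare(A, 5, Lane));
}
```

The scalar form does a single compare instead of four, which is the saving when the vector compare has no other use.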
2024-11-25 09:25:01 +00:00
Craig Topper
3fb0bea859
[RISCV][GISel] Add register class to some isel output patterns so they can be imported.
This makes (fcopysign X, (fneg Y)) patterns work.
2024-11-24 19:29:52 -08:00
hev
e26af0938c
[llvm] Add BasicTTIImpl::areInlineCompatible for target feature subset checks (#117493)
This patch moves the `areInlineCompatible` implementation from multiple
subclasses (`AArch64TTIImpl`, `RISCVTTIImpl`, `WebAssemblyTTIImpl`) to
the base class `BasicTTIImpl`. The new implementation checks whether the
callee's target features are a subset of the caller's, enabling
consistent behavior across targets. Subclasses now simply delegate to
the base implementation, reducing code duplication and improving
maintainability.
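
The subset check described above can be sketched with a plain bitset. This is an illustration only: the `Feature` enum, `FeatureSet` alias and both helper names are hypothetical and are not the LLVM API.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical feature indices, for illustration only.
enum Feature : std::size_t { FeatM, FeatA, FeatF, FeatV, NumFeatures };
using FeatureSet = std::bitset<NumFeatures>;

// Build a feature set from a list of feature indices.
inline FeatureSet makeSet(std::initializer_list<Feature> Features) {
  FeatureSet S;
  for (Feature F : Features)
    S.set(F);
  return S;
}

// The callee is compatible when every feature it needs is also enabled in
// the caller, i.e. the callee's features are a subset of the caller's.
inline bool calleeIsSubsetOfCaller(const FeatureSet &Caller,
                                   const FeatureSet &Callee) {
  return (Caller & Callee) == Callee;
}

inline void checkSubset() {
  const FeatureSet Caller = makeSet({FeatM, FeatA, FeatV});
  assert(calleeIsSubsetOfCaller(Caller, makeSet({FeatM, FeatV})));
  assert(!calleeIsSubsetOfCaller(Caller, makeSet({FeatM, FeatF})));
}
```

Under this rule, a callee with no extra feature requirements is compatible with any caller.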
2024-11-25 11:22:49 +08:00
Craig Topper
bb5bbe523d
[RISCV][GISel] Support s32/s64 G_FSUB/FDIV/FNEG without F/D extensions.
Use libcalls for G_FSUB/FDIV. Use integer operations for G_FNEG.

Copy most of the IR tests for arithmetic from SelectionDAG.
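
For readers unfamiliar with the trick, here is a minimal sketch of negating a float using only integer operations. It assumes the integer operation is an XOR of the IEEE-754 sign bit, which is the usual technique; the commit message does not spell out the exact lowering, and the helper names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

// Reinterpret a float as its raw 32-bit pattern.
inline uint32_t bitsOf(float F) {
  uint32_t B;
  std::memcpy(&B, &F, sizeof B);
  return B;
}

// Reinterpret a raw 32-bit pattern as a float.
inline float floatOf(uint32_t B) {
  float F;
  std::memcpy(&F, &B, sizeof F);
  return F;
}

// Negate by flipping the sign bit; no floating-point instruction is needed.
inline float fnegViaInteger(float F) {
  return floatOf(bitsOf(F) ^ 0x80000000u);
}

inline void checkFneg() {
  assert(fnegViaInteger(1.5f) == -1.5f);
  assert(fnegViaInteger(-2.0f) == 2.0f);
}
```

The same idea extends to f64 with a 64-bit mask that has only the top bit set.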
2024-11-24 18:22:12 -08:00
LiqinWeng
48b13ca48b
[RISCV][CostModel] cost of vector cttz/ctlz under ZVBB (#115800)
2024-11-24 09:18:18 +08:00
Craig Topper
213b849c5e
[RISCV][GISel] Use libcalls for some FP instructions when F/D aren't present.
This is based on what fails when adding integer only RUN lines to
float-intrinsics.ll and double-intrinsics.ll.

We're still missing a lot of test cases that SelectionDAG has. These
will be added in future patches.
2024-11-23 11:43:14 -08:00
Alexey Bataev
7523086a05
[SLP]Use getExtendedReduction cost and fix reduction cost calculations
This patch uses getExtendedReduction for reductions of ext-based nodes and
adds cost estimation for ctpop-kind reductions to the basic implementation,
along with RISC-V specific vcpop cost estimation.

Reviewers: RKSimon, preames

Reviewed By: preames

Pull Request: https://github.com/llvm/llvm-project/pull/117350
2024-11-22 16:12:53 -05:00
Pengcheng Wang
4da960b898
[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get this information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
2024-11-22 22:58:54 +08:00
Mikhail Goncharov
d1dae1e861
Revert "[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)" chain
This reverts commit b36fcf4f493ad9d30455e178076d91be99f3a7d8.
This reverts commit c11b6b1b8af7454b35eef342162dc2cddf54b4de.
This reverts commit 775148f2367600f90d28684549865ee9ea2f11be.

multiple bot build breakages, e.g. https://lab.llvm.org/buildbot/#/builders/3/builds/8076
2024-11-22 14:09:13 +01:00
Pengcheng Wang
775148f236
[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get this information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
2024-11-22 19:54:45 +08:00
Craig Topper
f84fc44f1a
[RISCV][GISel] Make s16->s32 G_ANYEXT/SEXT/ZEXT legal.
2024-11-21 22:45:25 -08:00
Jim Lin
bd15c7c1ca
[RISCV] Make A implies Zaamo and Zalrsc (#116907)
Ref:
https://github.com/riscv/riscv-isa-manual/blob/main/src/a-st-ext.adoc.
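
To make the effect of an implication concrete, here is a small sketch of expanding an extension set through an implication table. Only the `a` row (implying `zaamo` and `zalrsc`) comes from this commit; the table type, the helper names and the worklist approach are assumptions for illustration and do not mirror the actual LLVM data structures.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using ExtSet = std::set<std::string>;

// Illustrative implication table; only the "a" row is taken from the commit.
inline const std::map<std::string, std::vector<std::string>> &impliedTable() {
  static const std::map<std::string, std::vector<std::string>> Table = {
      {"a", {"zaamo", "zalrsc"}}};
  return Table;
}

// Add every implied extension, repeating until no new extension appears.
inline ExtSet expandImplied(ExtSet Exts) {
  std::vector<std::string> Worklist(Exts.begin(), Exts.end());
  while (!Worklist.empty()) {
    const std::string Ext = Worklist.back();
    Worklist.pop_back();
    auto It = impliedTable().find(Ext);
    if (It == impliedTable().end())
      continue;
    for (const std::string &Implied : It->second)
      if (Exts.insert(Implied).second) // newly added: visit it as well
        Worklist.push_back(Implied);
  }
  return Exts;
}

inline void checkImplied() {
  const ExtSet In = {"i", "a"};
  const ExtSet Expected = {"a", "i", "zaamo", "zalrsc"};
  assert(expandImplied(In) == Expected);
}
```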
2024-11-22 10:35:38 +08:00
Craig Topper
8e65b72691
[RISCV] Fix double counting CSRs with Zcmp in RISCVFrameLowering::getFrameIndexReference. (#117207)
The Zcmp callee saved registers are already accounted for in
getCalleeSavedStackSize(). Subtracting RVPushStackSize subtracts
them a second time leading to incorrect stack offsets during frame
index elimination.
    
This should have been removed in
0de2b26942f890a6ec84cd75ac7abe3f6f2b2e37
when Zcmp handling was changed. Prior to that, RVPushStackSize was
not included in getCalleeSavedStackSize(). The commit message at the
time noted that Zcmp+RVV was likely broken.
2024-11-21 13:53:15 -08:00
Craig Topper
29afbd5893
[RISCV] Add DAG combine to convert (iX ctpop (bitcast (vXi1 A))) into vcpop.m. (#117062)
This only handles the simplest case where vXi1 is a legal vector type.
If the vector type isn't legal we need to go through type legalization,
but the pattern gets much harder to recognize after that. Either because
ctpop gets expanded due to Zbb not being enabled, or the bitcast
becoming a bitcast+extractelt, or the ctpop being split into multiple
ctpops and adds, etc.
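
The equivalence behind the combine can be sketched on a scalar model: packing an 8-lane boolean mask into an integer and counting its set bits gives the number of active lanes, which is what a mask population count computes directly. The `Mask` alias and helper names are invented for this example, and the lane-to-bit order (lane I maps to bit I) is an assumption of the sketch.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 8;
using Mask = std::array<bool, kLanes>; // stand-in for a v8i1 value

// Model of (i8 bitcast (v8i1 M)): lane I becomes bit I.
inline uint8_t bitcastToI8(const Mask &M) {
  uint8_t Bits = 0;
  for (std::size_t I = 0; I < kLanes; ++I)
    if (M[I])
      Bits = static_cast<uint8_t>(Bits | (1u << I));
  return Bits;
}

// Model of (ctpop (bitcast M)): count set bits of the packed integer.
inline unsigned ctpopOfBitcast(const Mask &M) {
  return static_cast<unsigned>(std::bitset<kLanes>(bitcastToI8(M)).count());
}

// Model of a mask population count: count the active lanes directly.
inline unsigned countActiveLanes(const Mask &M) {
  unsigned Count = 0;
  for (bool Lane : M)
    Count += Lane ? 1u : 0u;
  return Count;
}

inline void checkCtpop() {
  const Mask M = {{true, false, true, true, false, false, false, true}};
  assert(ctpopOfBitcast(M) == countActiveLanes(M));
}
```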
2024-11-21 11:12:07 -08:00
Min-Yih Hsu
0165f8817c
[RISCV] Fix the worst case for VSHA2MS in SiFive P400/P600 scheduling models (#116893)
For each RVV instruction we should have a single WriteRes assignment to
the worst case scheduling class. This assignment is usually equal to
that of the largest LMUL + smallest SEW. My #114317 accidentally made
two of these assignments on `WriteVSHA2MSV_WorstCase`. This won't affect
our MachineScheduler nor most of our llvm-mca use cases (assuming you
populate the correct LMUL and SEW), yet it's not ideal either.

This patch fixes this issue by assigning the correct numbers and
resource mapping to `WriteVSHA2MSV_WorstCase`, which is equal to that of
largest LMUL + _largest_ SEW (Zvknh's scheduling properties are
special). I also added a MCA test to make sure we always pick up the
correct worst case numbers for P600's scheduling model.

Original issue was reported by @reidtatge
2024-11-21 10:59:46 -08:00
Craig Topper
cdd1e27124
[X86][RISCV] Don't emit JumpTableDebugInfo unless triple is OSBinFormatCOFF. (#117083)
This makes the override in RISCV and X86 consistent with the base class
implementation of expandIndirectJTBranch.
2024-11-21 09:38:16 -08:00
Craig Topper
e9c561e934
[RISCV][GISel] Add atomic load/store test. Add additional atomic load/store isel patterns."
2024-11-20 22:23:18 -08:00
Craig Topper
4087b871c5
[RISCV][GISel] Move G_BRJT expansion to legalization (#73711)
Instead of custom selecting a bunch of instructions, we can expand to
generic MIR during legalization.
2024-11-20 13:43:36 -08:00
Sam Elliott
408659c5b5
[RISCV] Merge GPRPair and GPRF64Pair (#116094)
As suggested by Craig, this tries to merge the two sets of register
classes created in #112983, GPRPair* and GPRF64Pair*.

- I added some explicit annotations to `RISCVInstrInfoD.td` which fixed
the type inference issues I was seeing from tablegen for select
patterns.
- I've had to make the behaviour of `splitValueIntoRegisterParts` and
`joinRegisterPartsIntoValue` cover more cases, because you cannot
bitcast to/from untyped (the bitcast would otherwise have been inserted
automatically by TargetLowering code).
- I apparently didn't need to change `getNumRegisters` again, which
continues to tell me there's a bug in the code for tied inputs. I added
some more test coverage of this case but it didn't seem to help find the
asserts I was finding before - I think the difference is between the
default behaviour for integers which doesn't apply to floats.
- There's still a difference between BuildGPRPair and BuildPairF64 (and
the same for SplitGPRPair and SplitF64). I'm not happy with this, I
think it's quite confusing, as they're very similar, just differing in
whether they give an `untyped` or an `f64`. I haven't really worked out
how the DAGCombiner copes if one meets the other, I know we have some of
this for the f64 variants already, but they're a lot more complex than
the GPRPair variants anyway.
2024-11-20 10:08:55 +00:00
Craig Topper
2bf6751522
[RISCV] Add IsRV32 some patterns in RISCVInstrInfoXTHead.td.
This restores the code to its original state before I experimented
with making i32 a legal type.
2024-11-19 21:41:14 -08:00
Petr Penzin
41c86ca714
[RISCV] Add TT-Ascalon-d8 processor (#115100)
Ascalon is an out-of-order CPU core from Tenstorrent. Overview:
https://tenstorrent.com/ip/tt-ascalon

Adding 8-wide version, -mcpu=tt-ascalon-d8. Scheduling model will be
added in a separate PR.

---------

Co-authored-by: Anton Blanchard <antonb@tenstorrent.com>
2024-11-19 14:20:55 -08:00
Craig Topper
eff60d83b0
[RISCV][GISel] Make extended loads and truncating stores with s16 register type and s8 memory type legal.
This addresses some failures I've seen in testing on real code.
2024-11-19 11:57:35 -08:00
Sam Elliott
c4030c896d
[RISCV] Fix FP64 DinX R Regclass (#116688)
This was a typo in llvm/llvm-project#112983 that didn't cause build
failures but is still wrong.
2024-11-19 12:42:27 +00:00
Luke Lau
1e897ed28d
[TTI][RISCV] Deduplicate type-based VP costing (#115983)
We have a lot of code in RISCVTTIImpl::getIntrinsicInstrCost for vp
intrinsics, which just forward the cost to the underlying non-vp cost
function.

However, I also just noticed that there is generic code in BasicTTIImpl's
getIntrinsicInstrCost that does the same thing, added in #67178. The
only difference is that BasicTTIImpl doesn't yet handle it for
type-based costing. There doesn't seem to be any reason that it can't
since it's just inspecting the argument types.

This shuffles the VP costing up to handle both regular and type-based
costing, which allows us to deduplicate some of the VP specific costing
in RISCVTTIImpl by delegating it to BasicTTIImpl.h. More of those nodes
can be moved over to BasicTTIImpl.h later.

It's not NFC since it picks up a couple of VP nodes that had slipped
through the cracks. Future PRs can begin to move more of the code from
RISCVTTIImpl to BasicTTIImpl.
2024-11-19 16:20:29 +08:00
Craig Topper
cfc574a6cd
[RISCV] Use the OperandTransform field of a couple PatLeafs to simplify isel patterns. NFC
2024-11-18 16:37:48 -08:00
Craig Topper
ac17b50f50
[RISCV] Use getSignedTargetConstant. NFC
2024-11-18 12:20:44 -08:00
Craig Topper
900c056531
[RISCV] Add an implementation of findRepresentativeClass to assign i32 to GPRRegClass for RV64. (#116165)
This is an alternative fix for #81192. This allows the SelectionDAG
scheduler to be able to find a representative register class for i32 on
RV64. The representative register class is the super register class with
the largest spill size that is also legal. The default implementation of
findRepresentativeClass only works for legal types which i32 is not for
RV64.

I did some investigation of why tablegen uses i32 in output patterns on
RV64. It appears it comes down to a function called
ForceArbitraryInstResultType that picks a type for the output
pattern when the isel pattern isn't specific enough. I believe it picks
the smallest type (lowest numbered) to resolve the conflict.

A similar issue occurs for f16 and bf16 which both use the FPR16
register class. If the isel pattern doesn't specify, tablegen may find
both f16 and bf16 and may pick bf16 from Zfh pattern when Zfbfmin isn't
present. Since bf16 isn't legal in that case, findRepresentativeClass
will fail.

For i8, i16, i32, this patch calls the base class with XLenVT to get the
representative class since XLenVT is always legal.

For bf16/f16, we call the base class with f32 since all of the f16/bf16
extensions depend on either F or Zfinx which will make f32 a legal type.
The final representative register class further depends on whether D or
Zdinx is also enabled, but that should be handled by the default
implementation.
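
The type remapping described in the last two paragraphs can be sketched as a small function. The `VT` enum and function names are placeholders for illustration, not the LLVM `MVT` API, and the sketch covers only the choice of which type to hand to the base class, not the register class lookup itself.

```cpp
#include <cassert>

// Placeholder value types for illustration only.
enum class VT { i8, i16, i32, i64, f16, bf16, f32, f64 };

// XLenVT is the native integer width: i64 on RV64, i32 on RV32.
inline VT xlenVT(bool IsRV64) { return IsRV64 ? VT::i64 : VT::i32; }

// Pick the type to query the base-class implementation with.
inline VT typeForBaseClassQuery(VT T, bool IsRV64) {
  switch (T) {
  case VT::i8:
  case VT::i16:
  case VT::i32:
    return xlenVT(IsRV64); // XLenVT is always legal
  case VT::f16:
  case VT::bf16:
    return VT::f32; // legal whenever an f16/bf16 extension is enabled
  default:
    return T; // other types are passed through unchanged
  }
}

inline void checkRemap() {
  assert(typeForBaseClassQuery(VT::i32, /*IsRV64=*/true) == VT::i64);
  assert(typeForBaseClassQuery(VT::bf16, /*IsRV64=*/true) == VT::f32);
}
```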
2024-11-18 10:07:20 -08:00
Sam Elliott
4615cc38f3
[RISCV] Inline Assembly Support for GPR Pairs ('R') (#112983)
This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the `R` constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which needs its
own explanation.

- Renames the Register Class used for f64 values on rv32i_zdinx from
  `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly
  unmodified, as their primary value type is used for type inference
  over selection patterns. This rename affects quite a lot of files.

- Adds new `GPRPair*` register classes which will be used for `R`
  constraints and for instructions that need an even-odd GPR pair. This
  new type is used for `amocas.d.*` (rv32) and `amocas.q.*` (rv64) in
  Zacas, instead of the `GPRF64Pair` class being used before.

- Marks the new `GPRPair` class as legal for holding an `MVT::Untyped`.
  Two new RISCVISD node types are added for creating and destructing a
  pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when
  bitcasting to/from the pair type and `untyped`.

- Adds functionality to `splitValueIntoRegisterParts` and
  `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into
  `untyped` pairs.

- Adds an override for `getNumRegisters` to ensure that `i<2*xlen>`
  values, when going to/from inline assembly, only allocate one (pair)
  register (they would otherwise allocate two). This is due to a bug in
  SelectionDAGBuilder.cpp which other backends also work around.

- Ensures that Clang understands that `R` is a valid inline assembly
  constraint.

- This also allows `R` to be used for `f64` types on `rv32_zdinx`
  architectures, where doubles are stored in a GPR pair.
2024-11-18 17:45:58 +00:00
Brandon Wu
9d7026500d
[RISCV] Correct the precedence in isVRegClass (#116579)
Right shift has higher precedence than bitwise AND, so the & operation
needs to be parenthesized. The existing code works as expected only because
IsVRegClassShift is 0; with any other shift value it would fail.
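
A minimal standalone sketch of the pitfall follows. The function names and the flag values are hypothetical and do not reflect the real TSFlags layout; the buggy variant is written with explicit parentheses to show how the compiler groups the unparenthesized expression `Flags & Mask >> Shift`.

```cpp
#include <cassert>
#include <cstdint>

// How the unparenthesized expression is grouped: the mask is shifted first,
// so the wrong bit is tested whenever Shift is non-zero.
constexpr bool testsFieldBuggy(uint64_t Flags, uint64_t Mask, unsigned Shift) {
  return (Flags & (Mask >> Shift)) != 0;
}

// Intended grouping: isolate the field with the mask, then shift it down.
constexpr bool testsFieldFixed(uint64_t Flags, uint64_t Mask, unsigned Shift) {
  return ((Flags & Mask) >> Shift) != 0;
}

inline void checkPrecedence() {
  // With a shift of 0 both forms agree, which is why the bug stayed hidden.
  assert(testsFieldBuggy(0x1, 0x1, 0) == testsFieldFixed(0x1, 0x1, 0));
  // With a field at bit 4 the buggy form tests bit 0 and misses the flag.
  assert(!testsFieldBuggy(0x10, 0x10, 4));
  assert(testsFieldFixed(0x10, 0x10, 4));
}
```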
2024-11-18 18:12:57 +08:00
dlav-sc
0c04d43e80
[RISCV][NFC] refactor CFI emitting (#114227)
This patch refactors PR https://github.com/llvm/llvm-project/pull/110810
to remove code duplication.
2024-11-18 12:25:34 +03:00
Craig Topper
45e882e2bf
[RISCV] Add IsRV32 to some isel patterns not needed for RV64.
2024-11-17 19:44:56 -08:00