5 Commits

Author SHA1 Message Date
Fangrui Song
eabaee0c59
[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)
R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and
R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530
`call foo` assembles to R_RISCV_CALL_PLT. The `@plt` suffix is not
useful and can be removed now (matching AArch64 and PowerPC).

GNU assembler assembles `call foo` to RISCV_CALL_PLT since 2022-09
(70f35d72ef04cd23771875c1661c9975044a749c).

Without this patch, unconditionally changing MO_CALL to MO_PLT could
create `jump .L1@plt, a0`, which is invalid in LLVM integrated assembler
and GNU assembler.
2024-01-07 12:09:44 -08:00
Philip Reames
a63bd7e99b [RISCV] Use NoReg in place of IMPLICIT_DEF for undefined passthru operands
In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295), I greatly increased the number of IMPLICIT_DEF operands to our vector instructions. This has turned out to have an unexpected negative impact because MachineCSE does not CSE IMPLICIT_DEFs, and thus does not CSE any instruction with an IMPLICIT_DEF operand. SelectionDAG *does* CSE the same case, but that only covers the same block case, not the cross block case. This lead to the performance regression reported in https://github.com/llvm/llvm-project/issues/64282.

This change is a slightly ugly hack to side step the issue. Instead of fixing the root cause (lack of CSE for IMPLICIT_DEF) or undoing the operand changes, we leave the extra operand in place, and use NoReg in place of IMPLICIT_DEF. I then convert back to IMPLICIT_DEF just before register allocation so that ProcessImplicitDefs and TwoAddressInstructions can do the normal transforms to Undef tied registers.

We may end up backporting this into the 17.x release branch.  Given how late in the release cycle this is landing, that's much less likely now, but still a possibility.

Differential Revision: https://reviews.llvm.org/D156909
2023-08-14 12:57:38 -07:00
Philip Reames
92b5a3405d [RISCV] Remove legacy TA/TU pseudo distinction for unary instructions
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. In D153155, we started removing the legacy distinction between unsuffixed (TA) and _TU pseudos. This patch continues that effort for the unary instruction families.

The change consists of a few interacting pieces:
* Adding a vector policy operand to VPseudoUnaryNoMaskTU.
* Then using VPseudoUnaryNoMaskTU for all cases where VPseudoUnaryNoMask was previously used and deleting the unsuffixed form.
* Then renaming VPseudoUnaryNoMaskTU to VPseudoUnaryNoMask, and adjusting the RISCVMaskedPseudo table to use the combined pseudo.
* Fixing up two places in C++ code which manually construct VMV_V_* instructions.

Normally, I'd try to factor this into a couple of changes, but in this case, the table structure is tied to naming and thus we can't really separate the otherwise NFC bits.

As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.

Differential Revision: https://reviews.llvm.org/D153899
2023-06-29 07:34:14 -07:00
Kito Cheng
cf40b8a4dd [RISCV] Pass vector argument by stack correctly.
We've a argument lowering logic to prevent floating-point value pass
passed with bit-conversion, but that rule should not applied to vector
arguments.

---

How to pass argument to `foo`:

```
tail call void @foo(i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0,
                    <vscale x 16 x float> zeroinitializer,
                    <vscale x 16 x float> zeroinitializer,
                    <vscale x 16 x float> zeroinitializer)
```

`foo` take 13 arguments, first 8 argument pass in GPR, and next 2 LMUL 8 vector
arguments passed in v8-v23, and now we run out of argument register for GPR and
vector register, so we must pass last LMUL 8 vector argument by stack.

Which means we should reserve `vlenb * 8` byte for stack for the last
vector argument.

Reviewed By: craig.topper, asb

Differential Revision: https://reviews.llvm.org/D145938
2023-03-15 17:22:47 +08:00
Kito Cheng
ba1c7731f1 [RISCV] Precommit test to show wrong way to pass scalable FP vector on stack
Test case to demo scaleable vector on stack will cause stack corruption.

Detail explan what happened:

```
tail call void @foo(i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0,
                    <vscale x 16 x float> zeroinitializer,
                    <vscale x 16 x float> zeroinitializer,
                    <vscale x 16 x float> zeroinitializer)
```

`foo` take 13 arguments, first 8 argument pass in GPR, and next 2 LMUL 8 vector
arguments passed in v8-v23, and now we run out of argument register for GPR and
vector register, so we must pass last LMUL 8 vector argument by stack.

However LLVM only reserve 8 byte on stack for the LMUL 8 vector
argument, it will cause stack corruption when we try to store that into
stack.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145934
2023-03-15 17:21:07 +08:00