It is currently not possible to use "RVV type" and "RVV intrinsics" if
the "zve32x" is not enabled globally. However in some cases we may want
to use them only in some functions, for instance:
```
#include <riscv_vector.h>
__attribute__((target("+zve32x")))
vint32m1_t rvv_add(vint32m1_t v1, vint32m1_t v2, size_t vl) {
return __riscv_vadd(v1, v2, vl);
}
int other_add(int i1, int i2) {
return i1 + i2;
}
```
, it is supposed to be compilable even the vector is not specified, e.g.
`clang -target riscv64 -march=rv64gc -S test.c`.
[RISCV] RISCV vector calling convention (1/2)
This is the vector calling convention based on
https://github.com/riscv-non-isa/riscv-elf-psabi-doc,
the idea is to split between "scalar" callee-saved registers
and "vector" callee-saved registers. "scalar" ones remain the
original strategy, however, "vector" ones are handled together
with RVV objects.
The stack layout would be:
|--------------------------| <-- FP
| callee-allocated save |
| area for register varargs|
|--------------------------|
| callee-saved registers | <-- scalar callee-saved
| (scalar) |
|--------------------------|
| RVV alignment padding |
|--------------------------|
| callee-saved registers | <-- vector callee-saved
| (vector) |
|--------------------------|
| RVV objects |
|--------------------------|
| padding before RVV |
|--------------------------|
| scalar local variables |
|--------------------------| <-- BP
| variable size objects |
|--------------------------| <-- SP
Note: This patch doesn't contain "tuple" type, e.g. vint32m1x2.
It will be handled in https://github.com/riscv-non-isa/riscv-elf-psabi-doc (2/2).
Differential Revision: https://reviews.llvm.org/D154576
Currently, the builtins used for implementing `va_list` handling
unconditionally take their arguments as unqualified `ptr`s i.e. pointers
to AS 0. This does not work for targets where the default AS is not 0 or
AS 0 is not a viable AS (for example, a target might choose 0 to
represent the constant address space). This patch changes the builtins'
signature to take generic `anyptr` args, which corrects this issue. It
is noisy due to the number of tests affected. A test for an upstream
target which does not use 0 as its default AS (SPIRV for HIP device
compilations) is added as well.
In an LTO build, we don't set the ELF attributes to indicate what
extensions were compiled with. The target CPU/Attrs in
RISCVTargetMachine do not get set for an LTO build. Each function gets a
target-cpu/feature attribute, but this isn't usable to set ELF attributs
since we wouldn't know what function to use. We can't just once since it
might have been compiler with an attribute likes target_verson.
This patch adds the ISA as Module metadata so we can retrieve it in the
backend. Individual translation units can still be compiled with
different strings so we need to collect the unique set when Modules are
merged.
The backend will need to combine the unique ISA strings to produce a
single value for the ELF attributes. This will be done in a separate
patch.
In the beginning, Clang only emitted atomic IR for operations it knew
the
underlying microarch had instructions for, meaning it required
significant
knowledge of the target. Later, the backend acquired the ability to
lower
IR to libcalls. To avoid duplicating logic and improve logic locality,
we'd like to move as much as possible to the backend.
There are many ways to describe this change. For example, this change
reduces the variables Clang uses to decide whether to emit libcalls or
IR, down to only the atomic's size.
GCC has supported a generic constraint "s" for a long time (since at
least 1992), which references a symbol or label with an optional
constant offset. "i" is a superset that also supports a constant
integer.
GCC's RISC-V port also supports a machine-specific constraint "S",
which cannot be used with a preemptible symbol. (We don't bother to
check preemptibility.) In PIC code, an external symbol is preemptible by
default, making "S" less useful if you want to create an artificial
reference for linker garbage collection, or define sections to hold
symbol addresses:
```
void fun();
// error: impossible constraint in ‘asm’ for riscv64-linux-gnu-gcc -fpie/-fpic
void foo() { asm(".reloc ., BFD_RELOC_NONE, %0" :: "S"(fun)); }
// good even if -fpie/-fpic
void foo() { asm(".reloc ., BFD_RELOC_NONE, %0" :: "s"(fun)); }
```
This patch adds support for "s". Modify https://reviews.llvm.org/D105254
("S") to handle multi-depth GEPs (https://reviews.llvm.org/D61560).
The Zicond extension was ratified in the last few months, with no
changes that affect the LLVM implementation. Although there's surely
more tuning that could be done about when to select Zicond or not, there
are no known correctness issues. Therefore, we should mark support as
non-experimental.
GCC supports -mtls-dialect= for several architectures to select TLSDESC.
This patch supports the following values
* x86: "gnu". "gnu2" (TLSDESC) is not supported yet.
* RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915)
AArch64 toolchains seem to support TLSDESC from the beginning, and the
general dynamic model has poor support. Nobody seems to use the option
-mtls-dialect= at all, so we don't bother with it.
There also seems very little interest in AArch32's TLSDESC support.
TLSDESC does not change IR, but affects object file generation. Without
a backend option the option is a no-op for in-process ThinLTO.
There seems no motivation to have fine-grained control mixing trad/desc
for TLS, so we just pass -mllvm, and don't bother with a modules flag
metadata or function attribute.
Co-authored-by: Paul Kirth <paulkirth@google.com>
Everytime an extension is added, this test will need to have the negative
extension appended to multiple CHECK lines where we're overriding the arch.
This is quite time consuming since it needs to be in the right order, so this
replaces the explicit list of negative extensions with a regexp instead.
This patch reworks RISCVTargetInfo::initFeatureMap to fix the issue
described
in
https://github.com/llvm/llvm-project/pull/74889#pullrequestreview-1773445559
(and is an alternative to #75804)
When a full arch string is specified, a "full" list of extensions is now
passed
after the __RISCV_TargetAttrNeedOverride marker feature, which includes
any
negative features that disable ISA extensions.
In initFeatureMap, there are now two code paths:
1. If the arch string was overriden, use the "full" list of override
features,
only adding back any non-isa features that were specified.
Using the full list of positive and negative features will mean that the
target-cpu will have no effect on the final arch, e.g.
__attribute__((target("arch=rv64i"))) with -mcpu=sifive-x280 will have
the
features for rv64i, not a mix of both.
2. Otherwise, parse and *append* the list of implied features. By
appending, we
turn back on any features that might have been disabled by a negative
extension, i.e. this handles the case fixed in #74889.
This commit includes the necessary changes to clang and LLVM to support
codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.
The differences between `RVE` and `RVI` are:
* `RVE` reduces the integer register count to 16(x0-x16).
* The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits.
`RVE` can be combined with all current standard extensions.
The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
* Only 6 integer argument registers (rather than 8).
* Only 2 callee-saved registers (rather than 12).
* A Stack Alignment of 32bits (rather than 128bits).
* ilp32e isn't compatible with D ISA extension.
If `ilp32e` or `lp64` is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.
To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment\
to 4-bytes for types with length of 2*XLEN.
FastCC is also supported on RVE, while GHC isn't since there is only one
avaiable register.
Differential Revision: https://reviews.llvm.org/D70401
Set the writable and dead_on_unwind attributes for sret arguments. These
indicate that the argument points to writable memory (and it's legal to
introduce spurious writes to it on entry to the function) and that the
argument memory will not be used if the call unwinds.
This enables additional MemCpyOpt/DSE/LICM optimizations.
collectNonISAExtFeature was returning any negative extension features,
e.g.
given an input of
+zifencei,+m,+a,+save-restore,-zbb,-relax,-zfa
It would return
+save-restore,-zbb,-relax,-zfa
Because negative extensions aren't emitted when calling
toFeatureVector(), and
so were considered missing. Hence why we still see "-zfa" and "-zfb" in
the tests for
the full arch string attributes, even though with a full arch string we
should be overriding the extensions.
This fixes it by using RISCVISAInfo::isSupportedExtensionFeature instead
to
check if a feature is an ISA extension.
This patch adds missing `norecurse` attrs to funcs that only call intrinsics with `nocallback` attrs.
Fixes the regression found in https://github.com/dtcxzyw/llvm-opt-benchmark/pull/45#discussion_r1436148743.
The function loses `norecurse` attr because it calls `@llvm.fabs.f64`, which is not marked as `norecurse`.
Since `norecurse` is not a default attribute of intrinsics and it is
ambiguous for intrinsics, I decided to use the existing `callback`
attributes.
> nocallback
This attribute indicates that the function is only allowed to jump back
into caller’s module by a return or an exception, and is not allowed to
jump back by invoking a callback function, a direct, possibly
transitive, external function call, use of longjmp, or other means. It
is a compiler hint that is used at module level to improve dataflow
analysis, dropped during linking, and has no effect on functions defined
in the current module.
See also https://llvm.org/docs/LangRef.html#function-attributes.
The RISC-V vector crypto extensions have been ratified. This patch
updates the Clang and LLVM support for these extensions to be
non-experimental, while leaving the C intrinsics as experimental since
the C intrinsics are not yet standardized.
Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
d80e46d added support for the target function attribute. However, it
turns out that commit has a nasty bug/oversight. As the tests in that
revision show, everything works if clang -cc1 is directly invoked. I was
suprised to learn this morning that compiling with clang (i.e. the
typical user workflow) did not work.
The bug is that if a set of explicit negative extensions is passed to
cc1 at the command line (as the clang driver always does), we were
copying these negative extensions to the end of the rewritten extension
list. When this was later parsed, this had the effect of turning back
off any extension that the target attribute had enabled.
This patch updates the logic to only propagate the features from the
input which don't appear in the rewritten form in either positive or
negative form.
Note that this code structure is still highly suspect. In particular I'm
fairly sure that mixing extension versions with this code will result in
odd results. However, I figure its better to have something which mostly
works than something which doesn't work at all.
Ever since 98c90a1 (ISel: introduce vector ISD::LRINT, ISD::LLRINT;
custom RISCV lowering) landed, there have been several discussions on
how the lrint and llrint libcalls would lower to LLVM IR via clang on
RV32 and RV64, in an effort to enable vectorization of lrint and llrint
via SLPVectorizer and LoopVectorize. This patch adds a new
math-builtins.c test to the RISC-V target to test the lowering of all
math libcalls, including lrint and llrint.
This reverts commit 8434b0b9d39b7ffcd1f7f7b5746151e293620e0d. #72216
This commit broke the multiple buildbots, looks like the extension in
`NUM_PREDEF_TYPE_IDS` might have broken some inheriting usages, causing
indeterminate results for the compiler. Investigating the issue now.
The first commit extends the capacity from the compiler infrastructure,
and the second commit continues the effort in #71140 to introduce tuple
types for bfloat16.
-mcmodel= is supported for a few architectures. Reject the option for
other architectures.
* -mcmodel= is unsupported on x86-32.
* -mcmodel=large is unsupported for PIC on AArch64.
* -mcmodel= is unsupported for aarch64_32 triples.
* https://reviews.llvm.org/D67066 (for RISC-V) made
-mcmodel=medany/-mcmodel=medlow aliases for all architectures. Restrict
this to RISC-V.
* llvm/lib/Target/Sparc has some small/medium/large support, but the
values listed on https://gcc.gnu.org/onlinedocs/gcc/SPARC-Options.html
had been supported before https://reviews.llvm.org/D67066. Consider
-mcmodel= unsupported for Sparc.
* https://reviews.llvm.org/D106371 translated -mcmodel=medium to
-mcmodel=large on AIX, even for 32-bit systems. Retain this behavior but
reject -mcmodel= for other PPC32 systems.
In general the accept/reject behavior is more similar to GCC.
err_drv_invalid_argument_to_option is less clear than
err_drv_unsupported_option_argument. As the supported values are
different for
different architectures, add a
err_drv_unsupported_option_argument_for_target
for better clarity.
This patch enables some fp16 vector type builtins that don't use fp arithmetic instruction for zvfhmin without zvfh.
Include following builtins:
vector load/store,
vector reinterpret,
vmerge_vvm,
vmv_v.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D151869