1643 Commits

Author SHA1 Message Date
Shilei Tian
1ca0055f45
[AMDGPU] Add a new target gfx1152 (#94534) 2024-06-06 12:16:11 -04:00
hev
46edc02eaa
[LoongArch] Adjust LA64 data layout by using n32:64 in layout string (#93814)
Although i32 type is illegal in the backend, LA64 has pretty good
support for i32 types by using W instructions.

By adding n32 to the DataLayout string, middle end optimizations will
consider i32 to be a native type. One known effect of this is enabling
LoopStrengthReduce on loops with i32 induction variables. This can be
beneficial because C/C++ code often has loops with i32 induction
variables due to the use of `int` or `unsigned int`.

If this patch exposes performance issues, those are better addressed by
tuning LSR or other passes.
2024-06-06 14:05:56 +08:00
Konstantin Zhuravlyov
2bfa26d30f
AMDGPU: Add missing gfx* generic targets handling in clang (NVPTX, OpenMP runtime) (#94483) 2024-06-05 11:57:17 -04:00
antangelo
ae1596a31a
[AArch64] Support preserve_none calling convention (#91046)
Adds AArch64 support for the `preserve_none` calling convention.
Registers X0-X7, X9-X15 and X19-X28 are caller save, and can be used to
pass arguments. Delegates to AAPCS for all other registers.

Closes #87423
2024-06-03 18:42:08 -04:00
Freddy Ye
6f2794afeb
Fix build warning for '[X86] Support EGPR for inline assembly. (#92338)' (#93777) 2024-05-30 15:25:08 +08:00
Freddy Ye
73f4c2547d
[X86] Support EGPR for inline assembly. (#92338)
"jR": explicitly enables EGPR
"r", "l", "q": enables/disables EGPR w/wo -mapx-inline-asm-use-gpr32
"jr": explicitly enables GPR with -mapx-inline-asm-use-gpr32
-mapx-inline-asm-use-gpr32 will also define a new macro:
`__APX_INLINE_ASM_USE_GPR32__`

GCC patches:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631183.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631186.html
[[PATCH v2] x86: Define _APX_INLINE_ASM_USE_GPR32_
(gnu.org)](https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649003.html)

Reference: https://gcc.godbolt.org/z/nPPvbY6r4
2024-05-30 14:47:47 +08:00
Shengchen Kan
0f7b4b04a5 [X86][Driver] Enable feature ccmp,nf for -mapxf
This is follow-up for #78901 after validation.
2024-05-29 17:34:26 +08:00
Freddy Ye
4def1ce101
Reland "[X86] Remove knl/knm specific ISAs supports (#92883)" (#93136)
This reverts commit aa4069ea96e5eb62bc8c7895b9d920f129611b3a.
2024-05-24 13:46:34 +08:00
Brendan Dahl
09c5525610
[WebAssembly] Implement prototype f16x8.splat instruction. (#93228)
Adds a builtin and intrinsic for the f16x8.splat instruction.

Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f16x8.splat as opcode 0x123, but this is
incorrect and will be changed to 0x120 soon.
2024-05-23 20:05:22 -07:00
Freddy Ye
aa4069ea96
Revert "[X86] Remove knl/knm specific ISAs supports (#92883)" (#93123)
This reverts commit 282d2ab58f56c89510f810a43d4569824a90c538.
2024-05-23 10:25:23 +08:00
Freddy Ye
282d2ab58f
[X86] Remove knl/knm specific ISAs supports (#92883)
Cont. patch after https://github.com/llvm/llvm-project/pull/75580
2024-05-23 09:46:44 +08:00
YunQiang Su
9276a03b54
MIPS/Clang: Add more false option pairs into validateTarget (#91968)
The option pairs include:
	-mfpxx -mips1
	-msoft-float -mmsa
	-mmsa -mabi=32 with 32bit pre-R2 CPUs
	-mfpxx -mmsa
	-mfp32 -mmsa
2024-05-22 22:54:45 +08:00
YunQiang Su
45293b5edb
MIPS/Clang: handleTargetFeatures, add +fp64 if +msa and no other +-fp (#92728)
Commit: d59bc6b5c75384aa0b1e78cc85e17e8acaccebaf
Clang/MIPS: Add +fp64 if MSA and no explicit -mfp option (#91949)
added +fp64 for `clang`, while not for `clang -cc1`. So

   clang -cc1 -triple=mips -target-feature +msa -S

will emit an asm source file without ".module fp=64".
2024-05-21 20:14:46 +08:00
YunQiang Su
073488cb1f
MIPS/Clang: Use FP32 by default if CPU is mips1 (#92122)
FP32 is the only supported FPMode of mips1.
FPXX requires MIPS2+ and FP64 requires MIPS32r2+.
2024-05-20 14:50:26 +08:00
Kazu Hirata
deffae5da1
[clang] Use StringRef::operator== instead of StringRef::equals (NFC) (#91844)
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.

- StringRef::operator==/!= outnumber StringRef::equals by a factor of
  24 under clang/ in terms of their usage.

- The elimination of StringRef::equals brings StringRef closer to
  std::string_view, which has operator== but not equals.

- S == "foo" is more readable than S.equals("foo"), especially for
  !Long.Expression.equals("str") vs Long.Expression != "str".
2024-05-11 11:38:52 -07:00
Shengchen Kan
575177f610 [X86] Add sub-feature nf (no flags update) for APX
This is a follow-up patch for #74199
2024-05-11 15:55:59 +08:00
Jack Styles
6aac30fa43
Update FEAT_PAuth_LR behaviour for AArch64 (#90614)
Currently, LLVM enables `-mbranch-protection=standard` as `bti+pac-ret`.
To align LLVM with the behaviour in GNU, this has been updated to
`bti+pac-ret+pc` when FEAT_PAuth_LR is enabled as an optional feature
via the `-mcpu=` options. If this is not enabled, then this will revert
to the existing behaviour.
2024-05-10 08:09:02 +01:00
Lukacma
105dd60fc8
[Clang][AArch64] Fixed incorrect _BitInt alignment (#90602)
This patch makes determining alignment and width of BitInt to be target
ABI specific and makes it consistent with [Procedure Call Standard for
the Arm® 64-bit Architecture
(AArch64)](https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst)
for AArch64 targets.
2024-05-09 10:45:19 +01:00
Felix (Ting Wang)
ea126aebdc
[PowerPC] Tune AIX shared library TLS model at function level (#84132)
Under some circumstance (library loaded with the main program), TLS
initial-exec model can be applied to local-dynamic access(es). We
could use some simple heuristic to decide the update at function level:
* If there is equal or less than a number of TLS local-dynamic access(es)
in the function, use TLS initial-exec model. (the threshold which default to
1 is controlled by hidden option)
2024-05-09 09:50:36 +08:00
Freddy Ye
e44600f3ab
[X86][CFE] Support EGPR in GCCRegNames. (#91323) 2024-05-08 15:07:18 +08:00
Doug Wyatt
ddecadabeb
[clang backend] In AArch64's DataLayout, specify a minimum function alignment of 4. (#90702)
This addresses an issue where the explicit alignment of 2 (for C++ ABI
reasons) was being propagated to the back end and causing under-aligned
functions (in special sections).

This is an alternate approach suggested by @efriedma-quic in PR #90415.

Fixes #90358
2024-05-05 19:05:15 -07:00
luolent
a98a6e95be
Add clarifying parenthesis around non-trivial conditions in ternary expressions. (#90391)
Fixes [#85868](https://github.com/llvm/llvm-project/issues/85868)

Parenthesis are added as requested on ternary operators with non trivial conditions.

I used this [precedence table](https://en.cppreference.com/w/cpp/language/operator_precedence) for reference, to make sure we get the expected behavior on each change.
2024-05-04 18:38:45 +01:00
Heejin Ahn
5d81b1c50a
[WebAssembly] Add all remaining features to bleeding-edge (#90875)
I'm not entirely sure what the criteria for 'bleeding-edge' used to be,
but at this point it seems to be the set of all added features in LLVM.
This adds remaining features to bleeding-edge config.
2024-05-03 14:08:22 -07:00
Chris B
39172bcfe4
[HLSL] Cleanup TargetInfo handling (#90694)
We had some odd places where we set target behaviors. We were setting
the long size in target-specific code, but it should be language-based.
We were not setting the Half float type semantics correctly, and instead
were overriding the query in the AST context.

This change it moves existing code to the right places in the Target so
that as we continue working on target and language feature they are
controlled in the right places.

It also fixes a bug where the size of `half` was computed incorrectly
when native half types are not supported.
2024-05-02 17:15:57 -05:00
zhijian lin
d4a25976df
Implement a subset of builtin_cpu_supports() features (#82809)
The PR implements a subset of features of function
__builtin_cpu_support() for AIX OS based on the information which AIX
kernel runtime variable `_system_configuration` and function call `getsystemcfg()` of
/usr/include/sys/systemcfg.h  in AIX OS can provide.

Following subset of features are supported in the PR

"arch_3_00", "arch_3_1","booke","cellbe","darn","dfp","dscr" ,"ebb","efpsingle","efpdouble","fpu","htm","isel",
"mma","mmu","pa6t","power4","power5","power5+","power6x","ppc32","ppc601","ppc64","ppcle","smt",
"spe","tar","true_le","ucache","vsx"
2024-05-02 14:59:33 -04:00
Heejin Ahn
1c80d322c4
[WebAssembly] Sort target features (NFC) (#90777) 2024-05-02 09:45:58 -07:00
Heejin Ahn
8c64a30412
[WebAssembly] Disable reference types in generic CPU (#90792)
#80923 newly enabled multivalue and reference-types in the generic CPU.
But enabling reference-types ended up breaking up Wasm's Chromium CI
(https://chromium-review.googlesource.com/c/emscripten-releases/+/5500231)
because the way the table index is encoded is different from MVP (u32)
vs. reference-types (LEB), which caused different encodings for
`call_indirect`.

And Chromium CI's and Emscripten's minimum required node version is v16,
which does not yet support reference-types, which does not recognize
that table index encoding. reference-types is first supported in node
v17.2.

We knew the current minimum required node for Emscripten (v16) did not
support reference-types, but thought it was fine because unless you
explicitly use `__funcref` or `__externref` things would be fine, and if
you want to use them explicitly, you would have a newer node. But it
turned out it also affected the encoding of `call_indirect`.

While we are discussing the potential solutions, I will disable
reference-types to unblock the rolls.
2024-05-01 16:50:58 -07:00
Heejin Ahn
7662f95f2c
[WebAssembly] Add preprocessor define for half-precision (#90528)
This adds the preprocessor define for the half-precision feature and
also adds preprocessor tests.
2024-04-30 11:15:53 -07:00
Heejin Ahn
5bbf1ea8f1
[WebAssembly] Enable multivalue and reference-types in generic CPU config (#80923)
This enables multivalue and reference-types in `-mcpu=generic`           
configuration. These proposals have been standardized and supported in   
all major browsers for several years at this point:                      
https://github.com/WebAssembly/proposals/blob/main/finished-proposals.md
2024-04-29 14:23:08 -07:00
Brendan Dahl
d9fd0ddef3
[WebAssembly] Add half-precision feature (#90248)
This currently only defines a constant, but in the future will be used
to gate builtins for experimenting and prototyping half-precision
proposal (https://github.com/WebAssembly/half-precision).
2024-04-26 14:03:21 -07:00
Aaron Ballman
72c373bfdc
[C++17] Support __GCC_[CON|DE]STRUCTIVE_SIZE (#89446)
These macros are used by STL implementations to support implementation
of std::hardware_destructive_interference_size and
std::hardware_constructive_interference_size

Fixes #60174

---------

Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
2024-04-26 12:05:15 -04:00
Heejin Ahn
88b6186af3
[WebAssembly] Tidy up wasm-target-features.c (#89778)
This tidies up `wasm-target-features.c` cosmetically:
- Sorts the feature tests alphabetically
- Adds a space after colons
2024-04-24 14:26:09 +09:00
Craig Topper
733a87783c
[RISCV] Split code that tablegen needs out of RISCVISAInfo. (#89684)
This introduces a new file, RISCVISAUtils.cpp and moves the rest of
RISCVISAInfo to the TargetParser library.

This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
2024-04-23 15:12:36 -07:00
Felix (Ting Wang)
16efd2a4c4
[AIX][TLS][clang] Add -maix-small-local-dynamic-tls clang option (#88829)
This patch adds the clang portion of an AIX-specific option to inform
the
compiler that it can use a faster access sequence for the local-dynamic
TLS model (formally named aix-small-local-dynamic-tls).

This patch mainly references Amy's work on small local-exec TLS support.
2024-04-23 08:44:25 +08:00
Craig Topper
9067070d91
[RISCV] Re-separate unaligned scalar and vector memory features in the backend. (#88954)
This is largely a revert of commit
e81796671890b59c110f8e41adc7ca26f8484d20.

As #88029 shows, there exists hardware that only supports unaligned
scalar.

I'm leaving how this gets exposed to the clang interface to a future
patch.
2024-04-16 15:40:32 -07:00
Justin Bogner
b0ddbfb77d
[clang][SPIR-V] Set AS for the SPIR-V logical triple (#88939)
This was missed in #88455, causing most of the .hlsl to SPIR-V tests to
fail (such as clang\test\Driver\hlsl-lang-targets-spirv.hlsl)
2024-04-16 12:09:32 -07:00
Joseph Huber
9e7aab951f
[CUDA] Rename SM_32 to SM_32_ to work around AIX headers (#88779)
Summary:
AIX headers define this, so we need to work around it. In the future
this will be removed but for now we should just rename it to avoid these
issues.
2024-04-16 07:43:13 -05:00
Alex Voicu
1120d8e6f7
[clang][CodeGen] Add AS for Globals to SPIR & SPIRV datalayouts (#88455)
Currently neither the SPIR nor the SPIRV targets specify the AS for
globals in their datalayout strings. This is problematic because
CodeGen/LLVM will default to AS0 in this case, which produces Globals
that end up in the private address space for e.g. OCL, HIPSPV or SYCL.
This patch addresses it by completing the datalayout string.
2024-04-16 11:37:29 +01:00
Freddy Ye
db2fb3d96b
[X86] Define __APX_F__ when APX is enabled. (#88343)
Relate gcc patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648789.html
2024-04-11 16:57:32 +08:00
Eli Friedman
71097e9271
[ARM64EC] Add support for parsing __vectorcall (#87725)
MSVC doesn't support generating __vectorcall calls in Arm64EC mode, but
it does treat it as a distinct type. The Microsoft STL depends on this
functionality. (Not sure if this is intentional.) Add support for
parsing the same way as MSVC, and add some checks to ensure we don't try
to actually generate code.

The error handling in CodeGen is ugly, but I can't think of a better way
to do it.
2024-04-09 19:53:56 -07:00
YunQiang Su
1bce411073
MIPS/Clang: Set HasUnalignedAccess false if +strict-align (#87257)
TargetInfo has HasUnalignedAccess support now. For MIPSr6, we should set
it according strict-align.

For pre-R6, we always set strict-align and HasUnalignedAccess to false.
2024-04-04 21:51:25 +08:00
Jim Lin
4f0b5d5e80
[M68k] Change gcc register name from a7 to sp. (#87095)
In M68kRegisterInfo.td, register SP is defined with name sp and
alternate name a7.

Fixes: https://github.com/llvm/llvm-project/issues/78620
2024-04-02 11:32:15 -05:00
Nathan Sidwell
7df79ababe [clang] TargetInfo hook for unaligned bitfields (#65742)
Promote ARM & AArch64's HasUnaligned to TargetInfo and set for all
targets.
2024-03-29 09:35:31 -04:00
Brandon Wu
91896607ff
[RISCV] RISCV vector calling convention (1/2) (#77560)
[RISCV] RISCV vector calling convention (1/2)

    This is the vector calling convention based on
    https://github.com/riscv-non-isa/riscv-elf-psabi-doc,
    the idea is to split between "scalar" callee-saved registers
    and "vector" callee-saved registers. "scalar" ones remain the
    original strategy, however, "vector" ones are handled together
    with RVV objects.

    The stack layout would be:

      |--------------------------| <-- FP
      | callee-allocated save    |
      | area for register varargs|
      |--------------------------|
      | callee-saved registers   | <-- scalar callee-saved
      |        (scalar)          |
      |--------------------------|
      | RVV alignment padding    |
      |--------------------------|
      | callee-saved registers   | <-- vector callee-saved
      |        (vector)          |
      |--------------------------|
      | RVV objects              |
      |--------------------------|
      | padding before RVV       |
      |--------------------------|
      | scalar local variables   |
      |--------------------------| <-- BP
      | variable size objects    |
      |--------------------------| <-- SP

    Note: This patch doesn't contain "tuple" type, e.g. vint32m1x2.
          It will be handled in https://github.com/riscv-non-isa/riscv-elf-psabi-doc (2/2).

    Differential Revision: https://reviews.llvm.org/D154576
2024-03-27 23:03:13 +08:00
ostannard
ef395a492a
[AArch64] Add soft-float ABI (#84146)
This is re-working of #74460, which adds a soft-float ABI for AArch64.
That was reverted because it causes errors when building the linux and
fuchsia kernels.

The problem is that GCC's implementation of the ABI compatibility checks
when using the hard-float ABI on a target without FP registers does it's
checks after optimisation. The previous version of this patch reported
errors for all uses of floating-point types, which is stricter than what
GCC does in practice.

This changes two things compared to the first version:
* Only check the types of function arguments and returns, not the types
of other values. This is more relaxed than GCC, while still guaranteeing
ABI compatibility.
* Move the check from Sema to CodeGen, so that inline functions are only
checked if they are actually used. There are some cases in the linux
kernel which depend on this behaviour of GCC.
2024-03-19 13:58:51 +00:00
Ahmed Bougacha
0481f049c3
[AArch64][PAC] Support ptrauth builtins and -fptrauth-intrinsics. (#65996)
This defines the basic set of pointer authentication clang builtins
(provided in a new header, ptrauth.h), with diagnostics and IRGen
support.  The availability of the builtins is gated on a new flag,
`-fptrauth-intrinsics`.

Note that this only includes the basic intrinsics, and notably excludes
`ptrauth_sign_constant`, `ptrauth_type_discriminator`, and
`ptrauth_string_discriminator`, which need extra logic to be fully
supported.

This also introduces clang/docs/PointerAuthentication.rst, which
describes the ptrauth model in general, in addition to these builtins.

Co-Authored-By: Akira Hatanaka <ahatanaka@apple.com>
Co-Authored-By: John McCall <rjmccall@apple.com>
2024-03-15 14:17:21 -07:00
yonghong-song
0e0bfacff7
[BPF] Add support for may_goto insn (#85358)
Alexei added may_goto insn in [1]. The asm syntax for may_goto looks
like
  may_goto <label>

The instruction represents a conditional branch but the condition is
implicit. Later in bpf kernel verifier, the 'may_goto <label>' insn will
be rewritten with an explicit condition. The encoding of 'may_goto' insn
is enforced in [2] and is also implemented in this patch.

In [3], 'may_goto' insn is encoded with raw bytes. I made the following
change
```
  --- a/tools/testing/selftests/bpf/bpf_experimental.h
  +++ b/tools/testing/selftests/bpf/bpf_experimental.h
  @@ -328,10 +328,7 @@ l_true:                                                                                            \

   #define cond_break                                     \
          ({ __label__ l_break, l_continue;               \
  -        asm volatile goto("1:.byte 0xe5;                       \
  -                     .byte 0;                          \
  -                     .long ((%l[l_break] - 1b - 8) / 8) & 0xffff;      \
  -                     .short 0"                         \
  +        asm volatile goto("may_goto %l[l_break]"       \
                        :::: l_break);                    \
          goto l_continue;                                \
          l_break: break;
```
and ran the selftest with the latest llvm with this patch. All tests are
passed.

[1]
https://lore.kernel.org/bpf/20240306031929.42666-1-alexei.starovoitov@gmail.com/
[2]
https://lore.kernel.org/bpf/20240306031929.42666-2-alexei.starovoitov@gmail.com/
[3]
https://lore.kernel.org/bpf/20240306031929.42666-4-alexei.starovoitov@gmail.com/
2024-03-15 07:24:28 -07:00
eddyz87
65b123e287
[BPF] rename 'arena' to 'address_space' (#85161)
There are a few places where `arena` name is used for pointers in
non-zero address space in BPF backend, rename these to use a more
generic `address_space`:
- macro `__BPF_FEATURE_ARENA_CAST` -> `__BPF_FEATURE_ADDR_SPACE_CAST
- name for arena global variables section `.arena.N` ->
`.addr_space.N`
2024-03-14 19:20:06 -07:00
4ast
2aacb56e83
BPF address space insn (#84410)
This commit aims to support BPF arena kernel side
[feature](https://lore.kernel.org/bpf/20240209040608.98927-1-alexei.starovoitov@gmail.com/):
- arena is a memory region accessible from both BPF program and
userspace;
- base pointers for this memory region differ between kernel and user
spaces;
- `dst_reg = addr_space_cast(src_reg, dst_addr_space, src_addr_space)`
translates src_reg, a pointer in src_addr_space to dst_reg, equivalent
pointer in dst_addr_space, {src,dst}_addr_space are immediate constants;
- number 0 is assigned to kernel address space;
- number 1 is assigned to user address space.

On the LLVM side, the goal is to make load and store operations on arena
pointers "transparent" for BPF programs:
- assume that pointers with non-zero address space are pointers to
  arena memory;
- assume that arena is identified by address space number;
- assume that address space zero corresponds to kernel address space;
- assume that every BPF-side load or store from arena is done via
pointer in user address space, thus convert base pointers using
`addr_space_cast(src_reg, 0, 1)`;

Only load, store, cmpxchg and atomicrmw IR instructions are handled by
this transformation.

For example, the following C code:

```c
   #define __as __attribute__((address_space(1)))
   void copy(int __as *from, int __as *to) { *to = *from; }
```

Compiled to the following IR:

```llvm
    define void @copy(ptr addrspace(1) %from, ptr addrspace(1) %to) {
    entry:
      %0 = load i32, ptr addrspace(1) %from, align 4
      store i32 %0, ptr addrspace(1) %to, align 4
      ret void
    }
```

Is transformed to:

```llvm
    %to2 = addrspacecast ptr addrspace(1) %to to ptr     ;; !
    %from1 = addrspacecast ptr addrspace(1) %from to ptr ;; !
    %0 = load i32, ptr %from1, align 4, !tbaa !3
    store i32 %0, ptr %to2, align 4, !tbaa !3
    ret void
```

And compiled as:

```asm
    r2 = addr_space_cast(r2, 0, 1)
    r1 = addr_space_cast(r1, 0, 1)
    r1 = *(u32 *)(r1 + 0)
    *(u32 *)(r2 + 0) = r1
    exit
```

Co-authored-by: Eduard Zingerman <eddyz87@gmail.com>
2024-03-13 02:27:25 +02:00
YunQiang Su
c88beb4112
MIPS: Fix asm constraints "f" and "r" for softfloat (#79116)
This include 2 fixes:
        1. Disallow 'f' for softfloat.
        2. Allow 'r' for softfloat.

Currently, 'f' is accpeted by clang, then LLVM meets an internal error.

'r' is rejected by LLVM by: couldn't allocate input reg for constraint
'r'.

Fixes: #64241, #63632

---------

Co-authored-by: Fangrui Song <i@maskray.me>
2024-02-26 22:08:36 -08:00