9018 Commits

Author SHA1 Message Date
Koakuma
c2fba6df94
[clang][SPARC] Treat empty structs as if it's a one-bit type in the CC (#90338)
Make sure that empty structs are treated as if it has a size of one bit
in function parameters and return types so that it occupies a full
argument and/or return register slot.

This fixes crashes and miscompilations when passing and/or returning
empty structs.

Reviewed by: @s-barannikov
2024-05-15 20:49:28 +07:00
Lukacma
421862f8e4
[Clang] Fix incorrect passing of _BitInt args (#90741)
This patch removes incorrect `byval` attribute from pointer argument
passed with >128 bit long _BitInt types.
2024-05-15 10:51:32 +01:00
Phoebe Wang
5bde8017a1
[X86][vectorcall] Pass built types byval when xmm0~6 exhausted (#91846)
This is how MSVC handles it. https://godbolt.org/z/fG386bjnf
2024-05-13 08:31:49 +08:00
Fangrui Song
e9f53e4095
[test] Move RISCV tests to clang/test/CodeGen/RISCV/
The directory was created by 2f1fe9a3a60d6f18998c5f3b7e643d4cbaa4e65d
(2020).

Pull Request: https://github.com/llvm/llvm-project/pull/91783
2024-05-10 13:22:07 -07:00
Momchil Velikov
2371a6410d
[AArch64] Add intrinsics for non-widening FMOPA/FMOPS (#88105)
According to the specification in
https://github.com/ARM-software/acle/pull/309 this adds the intrinsics

    void svmopa_za16[_f16]_m(uint64_t tile, svbool_t pn, svbool_t pm,
                             svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");
    void svmops_za16[_f16]_m(uint64_t tile, svbool_t pn,
svbool_t pm, svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
2024-05-10 11:57:08 +01:00
Momchil Velikov
64d4ade3bb
[AArch64] Add intrinsics for 16-bit non-widening FMLA/FMLS (#88553)
According to the specification in
https://github.com/ARM-software/acle/pull/309
add the following intrinsics

void svmla[_single]_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16_t zm)
void svmla[_single]_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16_t zm)
void svmls[_single]_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16_t zm)
void svmls[_single]_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16_t zm)

void svmla_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16x2_t zm)
void svmla_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16x4_t zm)
void svmls_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16x2_t zm)
void svmls_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16x4_t zm)

void svmla_lane_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16_t zm, uint64_t imm_idx)
void svmla_lane_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16_t zm, uint64_t imm_idx)
void svmls_lane_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16_t zm, uint64_t imm_idx)
void svmls_lane_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16_t zm, uint64_t imm_idx)

as well as the corresponding `_bf16` variants.
2024-05-10 11:14:26 +01:00
Brendan Dahl
8a3277acbc
[WebAssembly] Implement prototype f32.store_f16 instruction. (#91545)
Adds a builtin and intrinsic for the f32.store_f16 instruction.

The instruction stores an f32 value as an f16 memory. Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is
incorrect and will be changed to 0xFC31 soon.
2024-05-09 15:38:13 -07:00
Tomas Matheson
ddad7c3c84
[AArch64] add some more tests for FMV (#91490)
Add a couple of tests to make it clear:
- when FMV should be enabled and disabled by the driver.
- which extensions are enabled/disabled based on the dependencies
specified in TargetParser.
2024-05-09 21:51:39 +01:00
Momchil Velikov
139e0aa68d
Revert "[AArch64] Add intrinsics for multi-vector to ZA array vector accumulators" (#91597)
Reverts llvm/llvm-project#88266  due to test failures

error: 'expected-error' diagnostics seen but not expected: 
(frontend): '-fsyntax-only' action ignored; '-emit-llvm' action
specified previously
2024-05-09 15:03:52 +01:00
Momchil Velikov
e88ba6d975
[AArch64] Add intrinsics for multi-vector to ZA array vector accumulators (#88266)
According to the specification in
https://github.com/ARM-software/acle/pull/309 this adds the intrinsics

void_svadd_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn)
__arm_streaming __arm_inout("za");
void_svadd_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn)
__arm_streaming __arm_inout("za");
void_svsub_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn)
__arm_streaming __arm_inout("za");
void_svsub_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn)
__arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
2024-05-09 14:30:53 +01:00
Daniil Kovalev
ad652efa1f
[AArch64][PAC][clang][ELF] Support PAuth ABI core info (#85235)
Depends on #87545

Emit PAuth ABI compatibility tag values as llvm module flags:
- `aarch64-elf-pauthabi-platform`
- `aarch64-elf-pauthabi-version`

For platform 0x10000002 (llvm_linux), the version value bits correspond
to the following LangOptions defined in #85232:

- bit 0: `PointerAuthIntrinsics`;
- bit 1: `PointerAuthCalls`;
- bit 2: `PointerAuthReturns`;
- bit 3: `PointerAuthAuthTraps`;
- bit 4: `PointerAuthVTPtrAddressDiscrimination`;
- bit 5: `PointerAuthVTPtrTypeDiscrimination`;
- bit 6: `PointerAuthInitFini`.

---------

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-05-09 15:32:18 +03:00
Lukacma
105dd60fc8
[Clang][AArch64] Fixed incorrect _BitInt alignment (#90602)
This patch makes determining alignment and width of BitInt to be target
ABI specific and makes it consistent with [Procedure Call Standard for
the Arm® 64-bit Architecture
(AArch64)](https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst)
for AArch64 targets.
2024-05-09 10:45:19 +01:00
Jacob Lambert
11a6799740
[clang][CodeGen] Omit pre-opt link when post-opt is link requested (#85672)
Currently, when the -relink-builtin-bitcodes-postop option is used we
link builtin bitcodes twice: once before optimization, and again after
optimization.

With this change, we omit the pre-opt linking when the option is set,
and we rename the option to the following:

  -Xclang -mlink-builtin-bitcodes-postopt
 (-Xclang -mno-link-builtin-bitcodes-postopt) 

The goal of this change is to reduce compile time. We do lose the
theoretical benefits of pre-opt linking, but in practice these are small
than the overhead of linking twice. However we may be able to address
this in a future patch by adjusting the position of the builtin-bitcode
linking pass.

Compilations not setting the option are unaffected
2024-05-08 08:11:15 -07:00
Freddy Ye
e44600f3ab
[X86][CFE] Support EGPR in GCCRegNames. (#91323) 2024-05-08 15:07:18 +08:00
Farzon Lotfi
31b45a9d0d
[clang][hlsl] Add tan intrinsic part 1 (#90276)
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

If you want an overarching view of how this will all connect see:
https://github.com/llvm/llvm-project/pull/90088

Changes:
- `clang/docs/LanguageExtensions.rst` - Document the new elementwise tan
builtin.
-  `clang/include/clang/Basic/Builtins.td` - Implement the tan builtin.
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the tan intrinsic on uses
of the builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the tan builtin
with the equivalent hlsl apis
- `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks as well as
HLSL specifc sema checks to the tan builtin
-  `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
-  `llvm/docs/LangRef.rst` - Document the tan intrinsic
2024-05-07 22:54:15 -04:00
Max Winkler
3f37397c95
[clang][CodeGen] Fix MSVC ABI for classes with a deleted copy assignment operator (#90547)
For global functions and static methods the MSVC ABI returns
structs/classes with a deleted copy assignment operator indirectly.
From local testing this ABI holds true for all currently supported
architectures including ARM64EC.
2024-05-07 19:46:19 -04:00
Sean Perry
c9ab1d8905
Mark test cases as unsupported on z/OS (#90990)
These test cases are testing features not available when either
targeting the s390x-ibm-zos target or use tools/features not available
on the z/OS operating system. In a couple cases the lit test had a
number of subtests with one or two that aren't supported on z/OS. Rather
than mark the entire test as unsupported I split out the unsupported
tests into a separate test case.
2024-05-07 15:23:50 -04:00
Brendan Dahl
1a2a1fbd7c
[WebAssembly] Implement prototype f32.load_f16 instruction. (#90906)
Adds a builtin and intrinsic for the f32.load_f16 instruction.

The instruction loads an f16 value from memory and puts it in an f32.
Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is
incorrect and will be changed to 0xFC30 soon.
2024-05-07 11:33:10 -07:00
Petr Hosek
8bcb073705
[Clang] -fseparate-named-sections option (#91028)
When set, the compiler will use separate unique sections for global
symbols in named special sections (e.g. symbols that are annotated with
__attribute__((section(...)))). Doing so enables linker GC to collect
unused symbols without having to use a different section per-symbol.
2024-05-07 09:18:55 -07:00
ostannard
1fd196c8df
[AArch64] Diagnose more functions when FP not enabled (#90832)
When using a hard-float ABI for a target without FP registers, it's not
possible to correctly generate code for functions with arguments which
must be passed in floating-point registers. This is diagnosed in CodeGen
instead of Sema, to more closely match GCC's behaviour around inline
functions, which is relied on by the Linux kernel.

Previously, this only checked function signatures as they were
code-generated, but this missed some cases:
* Calls to functions not defined in this translation unit.
* Calls through function pointers.
* Calls to variadic functions, where the variadic arguments have a
floating-point type.

This adds checks to function calls, as well as definitions, so that
these cases are correctly diagnosed.
2024-05-07 09:17:05 +01:00
Doug Wyatt
ddecadabeb
[clang backend] In AArch64's DataLayout, specify a minimum function alignment of 4. (#90702)
This addresses an issue where the explicit alignment of 2 (for C++ ABI
reasons) was being propagated to the back end and causing under-aligned
functions (in special sections).

This is an alternate approach suggested by @efriedma-quic in PR #90415.

Fixes #90358
2024-05-05 19:05:15 -07:00
Fangrui Song
d33937b623 [test] %clang_cc1: remove redundant actions
ParseFrontendArgs takes the last OPT_Action_Group option. The other
actions are overridden.
2024-05-05 11:42:04 -07:00
Fangrui Song
7e59223ac4 [test] %clang_cc1: remove redundant actions
ParseFrontendArgs takes the last OPT_Action_Group option. The other
actions are overridden.
2024-05-05 10:46:06 -07:00
Fangrui Song
7c1d9b15ee [test] %clang_cc1: remove redundant actions 2024-05-04 23:08:11 -07:00
Fangrui Song
3c311b0222 [test] %clang_cc1 -S: remove overridden -emit-llvm 2024-05-04 17:49:32 -07:00
Fangrui Song
a312dd68c0 [BPF,test] %clang_cc1 -emit-llvm: remove redundant -S 2024-05-04 17:37:36 -07:00
Fangrui Song
c4c3efa161 [test] %clang_cc1 -emit-llvm: remove redundant -S 2024-05-04 17:31:08 -07:00
Fangrui Song
0d501f38f3 [test] %clang_cc1 -emit-llvm: remove redundant -S
Also replace aarch64-none-linux-gnu (none can indicate an OS as well) with aarch64
2024-05-04 17:15:51 -07:00
Fangrui Song
c5de4dd1ea [test] %clang_cc1 -emit-llvm: remove redundant -S
And replace -emit-llvm -o - with -emit-llvm-only
2024-05-04 17:00:29 -07:00
Karl-Johan Karlsson
cb015b9ec9
[clang][CodeGen] Propagate pragma set fast-math flags to floating point builtins (#90377)
This is a fix for the issue #87758 where fast-math flags are not
propagated all builtins.

It seems like pragmas with fast math flags was only propagated to calls
of unary floating point builtins. This patch propagate them also for
binary and ternary floating point builtins.
2024-05-04 17:47:48 +02:00
Fangrui Song
f34a5205aa [clang,test] Convert text files from CRLF to LF
Skip files with intentional CRLF line endings.
2024-05-03 10:23:53 -07:00
Pavel Iliin
804202292b
[FMV][AArch64] Don't optimize backward compatible features in resolver. (#90928)
For arch64 features, such as Branch Target Identification or MTE (Memory
Tagging Extension), compatible with targets that lack their support we
may encounter scenarios where a binary compiled with MTE for example is
executed on both MTE and non-MTE hardware and we still need to detect at
runtime whether the MTE feature is available to choose the appropriate
function version.
So, we cannot optimize the function multi versioning resolver by
removing checks for these features enabled for the target during
compilation.
2024-05-03 18:07:17 +01:00
cor3ntin
642117105d
[Clang] Implement P2809: Trivial infinite loops are not Undefined Behavior (#90066)
https://wg21.link/P2809R3

This is applied as a DR to C++11 (C++98 did not guarantee forward
progress and is left untouched)

As an extension (and to preserve existing behavior in C), we consider
all controlling expression that can be constant folded
in the front end, not just standard constant expressions.
2024-05-03 14:10:54 +02:00
Pavel Iliin
ff210b94d4 [FMV][NFC] Add test for bti and mte check in resolver. 2024-05-03 00:58:17 +01:00
Björn Pettersson
7298ae3b6d
[clang][CodeGen] Fix in codegen for __builtin_popcountg/ctzg/clzg (#90845)
Make sure that the result from the popcnt/ctlz/cttz intrinsics is
unsigned casted to int, rather than casted as a signed value, when
expanding the __builtin_popcountg/__builtin_ctzg/__builtin_clzg
builtins.

An example would be
  unsigned _BitInt(1) x = ...;
  int y = __builtin_popcountg(x);
which previously was incorrectly expanded to
  %1 = call i1 @llvm.ctpop.i1(i1 %0)
  %cast = sext i1 %1 to i32

Since the input type is generic for those "g" versions of the builtins
the intrinsic call may return a value for which the sign bit is set
(that could typically for BitInt of size 1 and 2). So we need to emit a
zext rather than a sext to avoid negative results.
2024-05-02 22:49:39 +02:00
zhijian lin
d4a25976df
Implement a subset of builtin_cpu_supports() features (#82809)
The PR implements a subset of features of function
__builtin_cpu_support() for AIX OS based on the information which AIX
kernel runtime variable `_system_configuration` and function call `getsystemcfg()` of
/usr/include/sys/systemcfg.h  in AIX OS can provide.

Following subset of features are supported in the PR

"arch_3_00", "arch_3_1","booke","cellbe","darn","dfp","dscr" ,"ebb","efpsingle","efpdouble","fpu","htm","isel",
"mma","mmu","pa6t","power4","power5","power5+","power6x","ppc32","ppc601","ppc64","ppcle","smt",
"spe","tar","true_le","ucache","vsx"
2024-05-02 14:59:33 -04:00
Krishna Narayanan
f17b1fb667
[Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (#89362)
Fixes #53079
2024-05-02 10:42:34 +01:00
Nikita Popov
74aa1abfae
[InstCombine] Canonicalize scalable GEPs to use llvm.vscale intrinsic (#90569)
Canonicalize getelementptr instructions for scalable vector types into
ptradd representation with an explicit llvm.vscale call. This
representation has better support in BasicAA, which can reason about
llvm.vscale, but not plain scalable GEPs.
2024-05-01 14:53:43 +09:00
wanglei
eb148aecb3
[LoongArch][Codegen] Add support for TLSDESC
The implementation only enables when the `-enable-tlsdesc` option is
passed and the TLS model is `dynamic`.

LoongArch's GCC has the same option(-mtls-dialet=) as RISC-V.

Reviewers: heiher, MaskRay, SixWeining

Reviewed By: SixWeining, MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/90159
2024-04-30 15:14:44 +08:00
Kees Cook
869ffcf3f6
[CodeGen][i386] Move -mregparm storage earlier and fix Runtime calls (#89707)
When building the Linux kernel for i386, the -mregparm=3 option is
enabled. Crashes were observed in the sanitizer handler functions, and
the problem was found to be mismatched calling convention.

As was fixed in commit c167c0a4dcdb ("[BuildLibCalls] infer inreg param
attrs from NumRegisterParameters"), call arguments need to be marked as
"in register" when -mregparm is set. Use the same helper developed there
to update the function arguments.

Since CreateRuntimeFunction() is actually part of CodeGenModule, storage
of the -mregparm value is also moved to the constructor, as doing this
in Release() is too late.

Fixes: https://github.com/llvm/llvm-project/issues/89670
2024-04-29 14:54:10 -07:00
Eli Friedman
3ab4ae9e58
[clang codegen] Fix MS ABI detection of user-provided constructors. (#90151)
In the context of determining whether a class counts as an "aggregate",
a constructor template counts as a user-provided constructor.

Fixes #86384
2024-04-29 12:00:12 -07:00
Lawrence Benson
bd07c22e53
[Clang] Add support for scalable vectors in __builtin_reduce_* functions (#87750)
Currently, a lot of `__builtin_reduce_*` function do not support
scalable vectors, i.e., ARM SVE and RISCV V. This PR adds support for
them. The main code change is to use a different path to extract the
type from the vectors, the rest is the same and LLVM supports the reduce
functions for `vscale` vectors.

This PR adds scalable vector support for:
- `__builtin_reduce_add`
- `__builtin_reduce_mul`
- `__builtin_reduce_xor`
- `__builtin_reduce_or`
- `__builtin_reduce_and`
- `__builtin_reduce_min`
- `__builtin_reduce_max`

Note: For all except `min/max`, the element type must still be an
integer value. Adding floating point support for `add` and `mul` is
still an open TODO.
2024-04-29 16:45:33 +02:00
nihui
cb3174bd78
[clang][CodeGen] fix UB in aarch64 bfloat16 scalar conversion (#89062)
do not bitcast 16bit `bfloat16` to 32bit `int32_t` directly
bitcast to `int16_t`, and then upcast to `int32_t`

Fix ASAN runtime error when calling vcvtah_f32_bf16
`==21842==ERROR: AddressSanitizer: stack-buffer-overflow on address
0x007fda1dd063 at pc 0x005c0361c234 bp 0x007fda1dd030 sp 0x007fda1dd028
`

without patch
```c
__ai __attribute__((target("bf16"))) float32_t vcvtah_f32_bf16(bfloat16_t __p0) {
  float32_t __ret;
bfloat16_t __reint = __p0;
int32_t __reint1 = *(int32_t *) &__reint << 16;
  __ret = *(float32_t *) &__reint1;
  return __ret;
}
```

with this patch
```c
__ai __attribute__((target("bf16"))) float32_t vcvtah_f32_bf16(bfloat16_t __p0) {
  float32_t __ret;
bfloat16_t __reint = __p0;
int32_t __reint1 = (int32_t)(*(int16_t *) &__reint) << 16;
  __ret = *(float32_t *) &__reint1;
  return __ret;
}
```

fix issue https://github.com/llvm/llvm-project/issues/61983
2024-04-29 13:18:37 +01:00
Paul Walker
0fa1f1f2d1
[LLVM][SVE] Seperate the int and floating-point variants of addqv. (#89762)
We only use common intrinsics for operations that treat their element
type as a container of bits.
2024-04-26 11:25:55 +01:00
Andreas Jonson
93de97d750
[SCCP] Swap out range metadata to range attribute (#90134)
Also moved the range from the function's call sites to the functions
return value as that is possible now.
2024-04-26 11:04:47 +09:00
Andreas Jonson
b8f3024a31
[InstCombine] Swap out range metadata to range attribute for cttz/ctlz/ctpop (#88776)
Since all optimizations that use range metadata now also handle range attribute, this patch replaces writes of
range metadata for call instructions to range attributes.
2024-04-25 01:45:50 +08:00
Brandon Wu
418bdb49a7
[clang][RISCV] Remove LMUL=8 scalar input for some vector crypto instructions (#89867)
Since the requirement is EEW=32, it's impossible that EGW=128
needs LMUL=8.
2024-04-24 22:43:25 +08:00
Brandon Wu
3fa6b9c69e
[clang][RISCV] Support RVV bfloat16 C intrinsics (#89354)
It follows the interface defined here:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/293
2024-04-24 07:42:58 +08:00
Karl-Johan Karlsson
31480b0cc8
[test] Avoid writing to a potentially write-protected dir (#89242)
These tests just don't check the output written to the current directory. The
current directory may be write protected e.g. in a sandboxed environment.

The Testcases that use -emit-llvm and -verify only care about stdout/stderr
and are in this patch changed to use -emit-llvm-only to avoid writing to an
output file. The verify-inlineasmbr.mir testcase that also only care about
stdout/stderr is in this patch changed to throw away the output file and just
write to /dev/null.
2024-04-20 12:26:58 +02:00
Bill Wendling
c32712d176
[Clang] Handle structs with inner structs and no fields (#89126)
A struct that declares an inner struct, but no fields, won't have a
field count. So getting the offset of the inner struct fails. This
happens in both C and C++:

  struct foo {
    struct bar {
      int Quantizermatrix[];
    };
  };

Here 'struct foo' has no fields.

Closes: https://github.com/llvm/llvm-project/issues/88931
2024-04-19 19:48:33 +00:00