127 Commits

Author SHA1 Message Date
Alexandros Lamprineas
3ab64c5b29
[NFC][Clang][FMV] Make FMV priority data type future proof. (#150079)
FMV priority is the returned value of a polymorphic function. On RISC-V
and X86 targets a 32-bit value is enough. On AArch64 we currently need
64 bits and we will soon exceed that. APInt seems to be a suitable
replacement for uint64_t, presumably with minimal compile time overhead.
It allows bit manipulation, comparison and variable bit width.
2025-07-23 10:37:29 +01:00
Eli Friedman
2aa0f0a3bd
[AArch64] Add option -msve-streaming-vector-bits= . (#144611)
This is similar to -msve-vector-bits, but for streaming mode: it
constrains the legal values of "vscale", allowing optimizations based on
that constraint.

This also fixes conversions between SVE vectors and fixed-width vectors
in streaming functions with -msve-vector-bits and
-msve-streaming-vector-bits.

This rejects any use of arm_sve_vector_bits types in streaming
functions; if it becomes relevant, we could add
arm_sve_streaming_vector_bits types in the future.

This doesn't touch the __ARM_FEATURE_SVE_BITS define.
2025-07-03 13:44:38 -07:00
Sander de Smalen
cd10ded697
[Clang] Remove AArch64TargetInfo::setArchFeatures (#146107)
When compiling with `-march=armv9-a+nosve` we found that Clang still
defines the `__ARM_FEATURE_SVE2` macro, which is explicitly set in
`setArchFeatures` when compiling for armv9-a.

After some experimenting, I found out that the list of features passed
into `AArch64TargetInfo::handleTargetFeatures` has already been expanded
and takes into account `+no[feature]` and has already expanded features
like `armv9-a`.

From that I conclude that `setArchFeatures` is no longer required.
2025-07-01 10:20:40 +01:00
Martin Wehking
fbea0fc5c7
Add Macro for CSSC Feature (#143148)
Add a new __ARM_FEATURE_CSSC macro that can be utilized during the
preprocessing stage.

__ARM_FEATURE_CSSC is defined to 1 if there is hardware support for
CSSC.

Implements the ACLE change:
https://github.com/ARM-software/acle/pull/394
2025-06-13 13:33:46 +01:00
Nathan Gauër
20d70196c9
[HLSL][SPIR-V] Implement vk::ext_builtin_input attribute (#138530)
This variable attribute is used in HLSL to add Vulkan specific builtins
in a shader.
The attribute is documented here:

17727e88fd/proposals/0011-inline-spirv.md

Those variable, even if marked as `static` are externally initialized by
the pipeline/driver/GPU. This is handled by moving them to a specific
address space `hlsl_input`, also added by this commit.

The design for input variables in Clang can be found here:
355771361e/proposals/0019-spirv-input-builtin.md


Co-authored-by: Justin Bogner <mail@justinbogner.com>
2025-06-04 13:22:37 +02:00
CarolineConcatto
7569de5272
[Clang][AArch64]Add FP8 ACLE macros implementation (#140591)
This patch implements the macros described in the ACLE[1]

[1]
https://github.com/ARM-software/acle/blob/main/main/acle.md#modal-8-bit-floating-point-extensions
2025-05-27 10:01:38 +01:00
Matthew Devereau
22576e2cce
[Clang][AArch64] Add pessimistic vscale_range for sve/sme (#137624)
The "target-features" function attribute is not currently considered
when adding vscale_range to a function. When +sve/+sme are pushed onto
functions with "#pragma attribute push(+sve/+sme)", the function
potentially misses out on optimizations that rely on vscale_range being
present.
2025-05-16 09:39:07 +01:00
Steven Perron
c073c22865
[HLSL] Use hlsl_device address space for getpointer. (#127675)
We add the hlsl_device address space to represent the device memory
space as defined in section 1.7.1.3 of the [HLSL
spec](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf).

Fixes https://github.com/llvm/llvm-project/issues/127075
2025-04-22 13:26:32 -04:00
Nathan Gauër
a625bc60e2
[HLSL][SPIR-V] Add hlsl_private address space for SPIR-V (#133464)
This is an alternative to
https://github.com/llvm/llvm-project/pull/122103

In SPIR-V, private global variables have the Private storage class. This
PR adds a new address space which allows frontend to emit variable with
this storage class when targeting this backend.

This is covered in this proposal: llvm/wg-hlsl@4c9e11a

This PR will cause addrspacecast to show up in several cases, like class
member functions or assignment. Those will have to be handled in the
backend later on, particularly to fixup pointer storage classes in some
functions.

Before this change, global variable were emitted with the 'Function'
storage class, which was wrong.
2025-04-10 10:55:10 +02:00
Daniil Kovalev
84b0c128a7
[PAC] Do not support some values of branch-protection with ptrauth-returns (#125280)
This patch does two things.

1. Previously, when checking driver arguments, we emitted an error for
unsupported values of `-mbranch-protection` when using pauthtest ABI.
The reason for that was ptrauth-returns being enabled as part of
pauthtest. This patch changes the check against pauthtest to a check
against ptrauth-returns.

2. Similarly, check against values of the following function attribute
which are unsupported with ptrauth-returns:
`__attribute__((target("branch-protection=XXX`. Note that existing
`validateBranchProtection` function is used, and current behavior is to
ignore the unsupported attribute value, so no error is emitted.
2025-02-05 11:39:27 +03:00
Chandler Carruth
cd269fee05 [StrTable] Switch Clang builtins to use string tables
This both reapplies #118734, the initial attempt at this, and updates it
significantly.

First, it uses the newly added `StringTable` abstraction for string
tables, and simplifies the construction to build the string table and
info arrays separately. This should reduce any `constexpr` compile time
memory or CPU cost of the original PR while significantly improving the
APIs throughout.

It also restructures the builtins to support sharding across several
independent tables. This accomplishes two improvements from the
original PR:

1) It improves the APIs used significantly.

2) When builtins are defined from different sources (like SVE vs MVE in
   AArch64), this allows each of them to build their own string table
   independently rather than having to merge the string tables and info
   structures.

3) It allows each shard to factor out a common prefix, often cutting the
   size of the strings needed for the builtins by a factor two.

The second point is important both to allow different mechanisms of
construction (for example a `.def` file and a tablegen'ed `.inc` file,
or different tablegen'ed `.inc files), it also simply reduces the sizes
of these tables which is valuable given how large they are in some
cases. The third builds on that size reduction.

Initially, we use this new sharding rather than merging tables in
AArch64, LoongArch, RISCV, and X86. Mostly this helps ensure the system
works, as without further changes these still push scaling limits.
Subsequent commits will more deeply leverage the new structure,
including using the prefix capabilities which cannot be easily factored
out here and requires deep changes to the targets.
2025-02-04 18:04:57 +00:00
David Green
9f1c825fb6
[AArch64] Enable vscale_range with +sme (#124466)
If we have +sme but not +sve, we would not set vscale_range on
functions. It should be valid to apply it with the same range with just
+sme, which can help mitigate some performance regressions in cases such
as scalable vector bitcasts (https://godbolt.org/z/exhe4jd8d).
2025-01-31 07:57:43 +00:00
David Majnemer
bce2cc1513 [clang] Set __GCC_*STRUCTIVE_SIZE on Aarch64
Before this change, we would set this to Clang's default of {64, 64}.

Now, we explicitly set it to {256, 64} which matches our ARM behavior
for ARMv8 targets and GCC's behavior for AArch64 targets.
2025-01-31 00:11:27 +00:00
Helena Kotas
d92bac8a3e
[HLSL] Introduce address space hlsl_constant(2) for constant buffer declarations (#123411)
Introduces a new address space `hlsl_constant(2)` for constant buffer
declarations.

This address space is applied to declarations inside `cbuffer` block.
Later on, it will also be applied to `ConstantBuffer<T>` syntax and the
default `$Globals` constant buffer.

Clang codegen translates constant buffer declarations to global
variables and loads from `hlsl_constant(2)` address space. More work
coming soon will include addition of metadata that will map these
globals to individual constant buffers and enable their transformation
to appropriate constant buffer load intrinsics later on in an LLVM pass.

Fixes #123406
2025-01-24 16:48:35 -08:00
CarolineConcatto
9256485043
[Clang][LLVM][AArch64]Add new feature SSVE-BitPerm (#121947)
The 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm,
which allows the sve-bitperm instructions to run in streaming mode.

It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The
sve2-bitperm feature is now an alias for sve-bitperm and sve2.

A new feature flag sve-bitperm is added to reflect the change that the
instructions under FEAT_SVE_BitPerm are supported if:
 on non streaming mode with FEAT_SVE2 and FEAT_SVE_BitPerm or
 in streaming mode with FEAT_SME and FEAT_SSVE_BitPerm
2025-01-13 16:34:33 +00:00
Ian Anderson
8a1174f06c
[Darwin][Driver][clang] arm64-apple-none-macho is missing the Apple macros from arm-apple-none-macho (#122427)
arm-apple-none-macho uses DarwinTargetInfo which provides several Apple
specific macros. arm64-apple-none-macho however just uses the generic
AArch64leTargetInfo and doesn't get any of those macros. It's not clear
if everything from DarwinTargetInfo is desirable for
arm64-apple-none-macho, so make an AppleMachOTargetInfo to hold the
generic Apple macros and a few other basic things.
2025-01-10 15:50:54 -08:00
Alexandros Lamprineas
8e65940161
[FMV][AArch64] Simplify version selection according to ACLE. (#121921)
Currently, the more features a version has, the higher its priority is.
We are changing ACLE https://github.com/ARM-software/acle/pull/370 as
follows:

"Among any two versions, the higher priority version is determined by
 identifying the highest priority feature that is specified in exactly
 one of the versions, and selecting that version."
2025-01-08 18:59:07 +00:00
Chandler Carruth
ca79ff07d8
Revert "Switch builtin strings to use string tables" (#119638)
Reverts llvm/llvm-project#118734

There are currently some specific versions of MSVC that are miscompiling
this code (we think). We don't know why as all the other build bots and
at least some folks' local Windows builds work fine.

This is a candidate revert to help the relevant folks catch their
builders up and have time to debug the issue. However, the expectation
is to roll forward at some point with a workaround if at all possible.
2024-12-13 23:58:48 -08:00
Chandler Carruth
be2df95e92
Switch builtin strings to use string tables (#118734)
The Clang binary (and any binary linking Clang as a library), when built
using PIE, ends up with a pretty shocking number of dynamic relocations
to apply to the executable image: roughly 400k.

Each of these takes up binary space in the executable, and perhaps most
interestingly takes start-up time to apply the relocations.

The largest pattern I identified were the strings used to describe
target builtins. The addresses of these string literals were stored into
huge arrays, each one requiring a dynamic relocation. The way to avoid
this is to design the target builtins to use a single large table of
strings and offsets within the table for the individual strings. This
switches the builtin management to such a scheme.

This saves over 100k dynamic relocations by my measurement, an over 25%
reduction. Just looking at byte size improvements, using the `bloaty`
tool to compare a newly built `clang` binary to an old one:

```
    FILE SIZE        VM SIZE
 --------------  --------------
  +1.4%  +653Ki  +1.4%  +653Ki    .rodata
  +0.0%    +960  +0.0%    +960    .text
  +0.0%    +197  +0.0%    +197    .dynstr
  +0.0%    +184  +0.0%    +184    .eh_frame
  +0.0%     +96  +0.0%     +96    .dynsym
  +0.0%     +40  +0.0%     +40    .eh_frame_hdr
  +114%     +32  [ = ]       0    [Unmapped]
  +0.0%     +20  +0.0%     +20    .gnu.hash
  +0.0%      +8  +0.0%      +8    .gnu.version
  +0.9%      +7  +0.9%      +7    [LOAD #2 [R]]
  [ = ]       0 -75.4% -3.00Ki    .relro_padding
 -16.1%  -802Ki -16.1%  -802Ki    .data.rel.ro
 -27.3% -2.52Mi -27.3% -2.52Mi    .rela.dyn
  -1.6% -2.66Mi  -1.6% -2.66Mi    TOTAL
```

We get a 16% reduction in the `.data.rel.ro` section, and nearly 30%
reduction in `.rela.dyn` where those reloctaions are stored.

This is also visible in my benchmarking of binary start-up overhead at
least:

```
Benchmark 1: ./old_clang --version
  Time (mean ± σ):      17.6 ms ±   1.5 ms    [User: 4.1 ms, System: 13.3 ms]
  Range (min … max):    14.2 ms …  22.8 ms    162 runs

Benchmark 2: ./new_clang --version
  Time (mean ± σ):      15.5 ms ±   1.4 ms    [User: 3.6 ms, System: 11.8 ms]
  Range (min … max):    12.4 ms …  20.3 ms    216 runs

Summary
  './new_clang --version' ran
    1.13 ± 0.14 times faster than './old_clang --version'
```

We get about 2ms faster `--version` runs. While there is a lot of noise
in binary execution time, this delta is pretty consistent, and
represents over 10% improvement. This is particularly interesting to me
because for very short source files, repeatedly starting the `clang`
binary is actually the dominant cost. For example, `configure` scripts
running against the `clang` compiler are slow in large part because of
binary start up time, not the time to process the actual inputs to the
compiler.

----

This PR implements the string tables using `constexpr` code and the
existing macro system. I understand that the builtins are moving towards
a TableGen model, and if complete that would provide more options for
modeling this. Unfortunately, that migration isn't complete, and even
the parts that are migrated still rely on the ability to break out of
the TableGen model and directly expand an X-macro style `BUILTIN(...)`
textually. I looked at trying to complete the move to TableGen, but it
would both require the difficult migration of the remaining targets, and
solving some tricky problems with how to move away from any macro-based
expansion.

I was also able to find a reasonably clean and effective way of doing
this with the existing macros and some `constexpr` code that I think is
clean enough to be a pretty good intermediate state, and maybe give a
good target for the eventual TableGen solution. I was also able to
factor the macros into set of consistent patterns that avoids a
significant regression in overall boilerplate.
2024-12-08 19:00:14 -08:00
Nathan Gauër
f8b4182f07
Revert "[SPIR-V] Fixup storage class for global private (#116636)" (#118312)
This reverts commit aa7fe1c10e5d6d0d3aacdb345fed995de413e142.
2024-12-02 17:32:54 +01:00
Nathan Gauër
aa7fe1c10e
[SPIR-V] Fixup storage class for global private (#116636)
Adds a new address spaces: `hlsl_private`. Variables with such address
space will be emitted with a `Private` storage class.
This is useful for variables global to a SPIR-V module, since up to now,
they were still emitted with a `Function` storage class, which is wrong.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-12-02 16:17:44 +01:00
Alexandros Lamprineas
88c2af80fa
[NFC][clang][FMV][TargetInfo] Refactor API for FMV feature priority. (#116257)
Currently we have code with target hooks in CodeGenModule shared between
X86 and AArch64 for sorting MultiVersionResolverOptions. Those are used
when generating IFunc resolvers for FMV. The RISCV target has different
criteria for sorting, therefore it repeats sorting after calling
CodeGenFunction::EmitMultiVersionResolver.

I am moving the FMV priority logic in TargetInfo, so that it can be
implemented by the TargetParser which then makes it possible to query it
from llvm. Here is an example why this is handy:
https://github.com/llvm/llvm-project/pull/87939
2024-11-28 09:22:05 +00:00
SpencerAbson
748b028540
[AArch64] Make +sve2-aes an alias of +sve2+sve-aes (#116026)
This patch essentially re-lands
https://github.com/llvm/llvm-project/pull/114293 with the following
fixups

- `nosve2-aes` should disable the backend feature `FeatureSVEAES` such
that the set of existing instructions that this removes is unchanged.
- FMV dependencies now use the autogenerated `ExtensionDepencies`
structure (since https://github.com/llvm/llvm-project/pull/113281) so we
do not require the change to `AArch64FMV.td`.
2024-11-14 11:04:04 +00:00
SpencerAbson
bbcd35270e
Revert "[AArch64] Reduce +sve2-aes to an alias of +sve-aes+sve2 (#114… (#115539)
…293)"

This reverts commit da9499ebfb323602c42aeb674571fe89cec20ca6.
2024-11-08 20:19:31 +00:00
SpencerAbson
da9499ebfb
[AArch64] Reduce +sve2-aes to an alias of +sve-aes+sve2 (#114293)
This patch introduces the amended feature flag for
[FEAT_SVE_AES](https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-0-architecture-extension?lang=en#md457-the-armv90-architecture-extension__feat_FEAT_SVE_AES),
'**sve-aes**'. The existing flag associated with this feature,
'sve2-aes' must be retained as an alias of 'sve-aes' and 'sve2' for
backwards compatibility.

The
[ACLE](https://github.com/ARM-software/acle/blob/main/main/acle.md#aes-extension)
documents `__ARM_FEATURE_SVE2_AES`, which was previously defined to 1
when

> there is hardware support for the SVE2 AES (FEAT_SVE_AES) instructions
and if the associated ACLE intrinsics are available.

The front-end has been amended such that it is compatible with +sve2-aes
and +sve2+sve-aes.
2024-11-08 15:07:05 +00:00
Aaron Ballman
af7c58b7ea
Remove support for RenderScript (#112916)
See
https://discourse.llvm.org/t/rfc-deprecate-and-eventually-remove-renderscript-support/81284
for the RFC
2024-10-28 12:48:42 -04:00
Daniel Paoliello
c9f27275c1
[clang][aarch64] Add support for the MSVC qualifiers __ptr32, __ptr64, __sptr, __uptr for AArch64 (#111879)
MSVC has a set of qualifiers to allow using 32-bit signed/unsigned
pointers when building 64-bit targets. This is useful for WoW code
(i.e., the part of Windows that handles running 32-bit application on a
64-bit OS). Currently this is supported on x64 using the 270, 271 and
272 address spaces, but does not work for AArch64 at all.

This change adds the same 270, 271 and 272 address spaces to AArch64 and
adjusts the data layout string accordingly. Clang will generate the
correct address space casts, but these will currently be ignored until
the AArch64 backend is updated to handle them.

Partially fixes #62536

This is a resurrected version of <https://reviews.llvm.org/D158857>
(originally created by @a_vorobev) - I've cleaned it up a little, fixed
the rest of the tests and added to auto-upgrade for the data layout.
2024-10-15 10:37:36 -07:00
Jonathan Thackray
d0756caedc
[ARM][AArch64] Introduce the Armv9.6-A architecture version (#110825)
This introduces the Armv9.6-A architecture version, including the
relevant command-line option for -march.

More details about the Armv9.6-A architecture version can be found at:
  * https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2024
  * https://developer.arm.com/documentation/ddi0602/2024-09/
2024-10-04 10:12:41 +01:00
SpencerAbson
2617023923
[clang][AArch64] Add SME2.1 feature macros (#105657) 2024-08-23 14:27:49 +01:00
SpencerAbson
39185da162
[Clang][AArch64] Add missing SME/SVE2.1 feature macros (#98285)
The 2022 SME2.1and SVE2.1 feature macros are missing from Clang. Passing
'-target-feature +sve2p1' and 'target-feature +sme2p1' should prompt
Clang to define __ARM_FEATURE_SVE2p1 and __ARM_FEATURE_SME2p1
respectively, including their prerequisits..

This patch includes __ARM_FEATURE_SVE2p1 and __ARM_FEATURE_SME2p1, plus
a clang preprocessor test for each. It also ensures that the Clang macro
builder is used in a consistent fashion across Targets/AArch64.cpp.

The specification for SVE2.1 is documented in the latest (2024 Q1) ACLE
release: https://github.com/ARM-software/acle/releases . SME2p1 is not
yet featured in ACLE documentation but its features are described under
https://developer.arm.com/documentation/ddi0487/latest/
2024-07-19 09:54:52 +01:00
Tomas Matheson
fa6d38d61a
[AArch64][TargetParser] Split FMV and extensions (#92882)
FMV extensions are really just mappings from FMV feature names to lists
of backend features for codegen. Split them out into their own separate
file.
2024-06-20 15:33:21 +01:00
Alexandros Lamprineas
a03d06a736
Reland "[AArch64] Decouple feature dependency expansion. (#94279)" (#95519)
This is the second attempt. When parsing the target attribute
we should be letting cc1 features which don't correspond to
Extensions pass through to avoid errors like the following:

% cat neon.c
__attribute__((target("arch=armv8-a")))
uint64x2_t foo(uint64x2_t a, uint64x2_t b) { return veorq_u64(a, b); }

% clang --target=aarch64-linux-gnu -c neon.c
error: always_inline function 'veorq_u64' requires target feature
       'outline-atomics', but would be inlined into function 'foo'
       that is compiled without support for 'outline-atomics'

Co-authored-by: Tomas Matheson <Tomas.Matheson@arm.com>
2024-06-18 21:28:34 +01:00
Daniel Kiss
5fe7f7364a
[clang][AArch64] Add validation for Global Register Variable. (#94271)
Fixes: #76426
2024-06-17 08:48:53 +02:00
Fangrui Song
2146fd0d8d Revert "Reland "[AArch64] Decouple feature dependency expansion. (#94279)" (#95231)"
This reverts commit 70510733af33c70ff7877eaf30d7718b9358a725.

The following code is now incorrectly rejected.

```
% cat neon.c
#include <arm_neon.h>
__attribute__((target("arch=armv8-a")))
uint64x2_t foo(uint64x2_t a, uint64x2_t b) {
  return veorq_u64(a, b);
}
% newclang --target=aarch64-linux-gnu -c neon.c
neon.c:5:10: error: always_inline function 'veorq_u64' requires target feature 'outline-atomics', but would be inlined into function 'foo' that is compiled without support for 'outline-atomics'
    5 |   return veorq_u64(a, b);
      |          ^
1 error generated.
```

"+outline-atomics" seems misleading here.
2024-06-13 11:49:22 -07:00
Alexandros Lamprineas
70510733af
Reland "[AArch64] Decouple feature dependency expansion. (#94279)" (#95231)
My reverted attempt to decouple feature dependency expansion (see
#95056) made it evident that some features are still using the FMV
dependencies in the target attribute.

The original commit broke the llvm test suite. This was addressed here:
https://github.com/llvm/llvm-test-suite/pull/133. I am now relanding it.
2024-06-12 16:07:35 +01:00
Alexandros Lamprineas
48aebd4cf8
Revert "[AArch64] Decouple feature dependency expansion. (#94279)" (#95056)
This reverts commit 2cf14398c9341feddb419e7ff9c8c5623a3da3db since it
broke the llvm test suite:

SingleSource/UnitTests/AArch64/acle-fmv-features.c:59:9:
  error: instruction requires: altnzcv
SingleSource/UnitTests/AArch64/acle-fmv-features.c:117:10:
  error: instruction requires: aes
...

Looks like the FMV dependencies were used in the target attribute and
now features that are FMVOnly (have AEK_NONE) cannot be expanded in
parseTargetAttr using the ExtensionSet.

This suggests that either the tests are wrong (they are using an FMVOnly
feature in a target attribute), or that we need to turn the FMVOnly
features into Extensions (these two are tablegen classes).
2024-06-11 00:51:52 +01:00
Alexandros Lamprineas
2cf14398c9
[AArch64] Decouple feature dependency expansion. (#94279)
The dependency expansion step which was introduced by FMV has been
erroneously used for non-FMV features, for example when parsing the
target attribute. The PR #93695 has rectified most of the tests which
were relying on dependency expansion of target features specified on the
-cc1 command line. In this patch I am decoupling the dependency
expansion of features specified on the target attribute from FMV.

To do that first I am expanding FMV dependencies before passing the list
of target features to initFeatureMap(). Similarly when parsing the
target attribute I am reconstructing an ExtensionSet from the list of
target features which was created during the command line option
parsing. The attribute parsing may toggle bits of that ExtensionSet and
at the end it is converted to a list of target features. Those are
passed to initFeatureMap(), which no longer requires an override.

A side effect of this refactoring is that features specified on the
target_version attribute now supersede the command line options, which
is what should be happening in the first place.
2024-06-10 13:53:14 +01:00
Nathan Sidwell
7df79ababe [clang] TargetInfo hook for unaligned bitfields (#65742)
Promote ARM & AArch64's HasUnaligned to TargetInfo and set for all
targets.
2024-03-29 09:35:31 -04:00
ostannard
ef395a492a
[AArch64] Add soft-float ABI (#84146)
This is re-working of #74460, which adds a soft-float ABI for AArch64.
That was reverted because it causes errors when building the linux and
fuchsia kernels.

The problem is that GCC's implementation of the ABI compatibility checks
when using the hard-float ABI on a target without FP registers does it's
checks after optimisation. The previous version of this patch reported
errors for all uses of floating-point types, which is stricter than what
GCC does in practice.

This changes two things compared to the first version:
* Only check the types of function arguments and returns, not the types
of other values. This is more relaxed than GCC, while still guaranteeing
ABI compatibility.
* Move the check from Sema to CodeGen, so that inline functions are only
checked if they are actually used. There are some cases in the linux
kernel which depend on this behaviour of GCC.
2024-03-19 13:58:51 +00:00
Ahmed Bougacha
0481f049c3
[AArch64][PAC] Support ptrauth builtins and -fptrauth-intrinsics. (#65996)
This defines the basic set of pointer authentication clang builtins
(provided in a new header, ptrauth.h), with diagnostics and IRGen
support.  The availability of the builtins is gated on a new flag,
`-fptrauth-intrinsics`.

Note that this only includes the basic intrinsics, and notably excludes
`ptrauth_sign_constant`, `ptrauth_type_discriminator`, and
`ptrauth_string_discriminator`, which need extra logic to be fully
supported.

This also introduces clang/docs/PointerAuthentication.rst, which
describes the ptrauth model in general, in addition to these builtins.

Co-Authored-By: Akira Hatanaka <ahatanaka@apple.com>
Co-Authored-By: John McCall <rjmccall@apple.com>
2024-03-15 14:17:21 -07:00
Pavel Iliin
568babab7e
[AArch64] Implement __builtin_cpu_supports, compiler-rt tests. (#82378)
The patch complements https://github.com/llvm/llvm-project/pull/68919
and adds AArch64 support for builtin
`__builtin_cpu_supports("feature1+...+featureN")`
which return true if all specified CPU features in argument are
detected. Also compiler-rt aarch64 native run tests for features
detection mechanism were added and 'cpu_model' check was fixed after its
refactor merged https://github.com/llvm/llvm-project/pull/75635 Original
RFC was https://reviews.llvm.org/D153153
2024-02-22 23:33:54 +00:00
Prabhuk
ea9ec80b7a
Revert "[AArch64] Add soft-float ABI (#74460)" (#82032)
This reverts commit 9cc98e336980f00cbafcbed8841344e6ac472bdc.

Issue: https://github.com/ClangBuiltLinux/linux/issues/1997
2024-02-16 16:43:50 -08:00
ostannard
9cc98e3369
[AArch64] Add soft-float ABI (#74460)
This adds support for the AArch64 soft-float ABI. The specification for
this ABI was added by https://github.com/ARM-software/abi-aa/pull/232.

Because all existing AArch64 hardware has floating-point hardware, we
expect this to be a niche option, only used for embedded systems on
R-profile systems. We are going to document that SysV-like systems
should only ever use the base (hard-float) PCS variant:
https://github.com/ARM-software/abi-aa/pull/233. For that reason, I've
not added an option to select the ABI independently of the FPU hardware,
instead the new ABI is enabled iff the target architecture does not have
an FPU.

For testing, I have run this through an ABI fuzzer, but since this is
the first implementation it can only test for internal consistency
(callers and callees agree on the PCS), not for conformance to the ABI
spec.
2024-02-15 12:39:16 +00:00
Sander de Smalen
9e649518e6
[Clang][AArch64] Add missing SME macros (#80293)
__ARM_STATE_ZA and __ARM_STATE_ZT0 are set when the compiler can parse 
the "za" and "zt0" strings in the SME attributes.

__ARM_FEATURE_SME and __ARM_FEATURE_SME2 are set when the compiler can 
generate code for attributes with "za" and "zt0" state, respectively.

__ARM_FEATURE_LOCALLY_STREAMING is set when the compiler supports the
__arm_locally_streaming attribute.
2024-02-02 09:29:47 +00:00
Lucas Duarte Prates
1bbb797e9c
[Clang][AArch64] Add ACLE macros for FEAT_PAuth_LR (#80163)
This updates clang's target defines to include the ACLE changes covering
the FEAT_PAuth_LR architecture extension.
The changes include:
* The new `__ARM_FEATURE_PAUTH_LR` feature macro, which is set to 1 when
  FEAT_PAuth_LR is available in the target.
* A new bit field for the existing `__ARM_FEATURE_PAC_DEFAULT` macro,
  indicating the use of PC as a diversifier for Pointer Authentication
  (from -mbranch-protection=pac-ret+pc).

The approved changes to the ACLE spec can be found here:
https://github.com/ARM-software/acle/pull/292
2024-02-01 10:24:38 +00:00
Jonas Paulsson
34dd8ec8ae
[clang, SystemZ] Support -munaligned-symbols (#73511)
When this option is passed to clang, external (and/or weak) symbols
are not assumed to have the minimum ABI alignment normally required.
Symbols defined locally that are not weak are however still given the
minimum alignment.

This is implemented by passing a new parameter to getMinGlobalAlign()
named HasNonWeakDef that is used to return the right alignment value.

This is needed when external symbols created from a linker script may
not get the ABI minimum alignment and must therefore be treated as
unaligned by the compiler.
2024-01-27 18:29:37 +01:00
Sam Tebbs
567941bcc3 [Clang][SME] Remove unused HasSVE2p1 variable
Removes the HasSVE2p1 variable to stop a warning from https://github.com/llvm/llvm-project/pull/76975
2024-01-05 11:17:56 +00:00
Sam Tebbs
0eefcaf96d
[Clang][SME] Add IsStreamingOrSVE2p1 (#76975)
This patch adds IsStreamingOrSVE2p1 to the applicable builtins and a
warning for when those builtins are not used in a streaming or sve2p1
function.
2024-01-05 09:55:50 +00:00
Sam Tebbs
a7a78fd427
Revert "[Clang][SME] Add IsStreamingOrSVE2p1" (#76973)
Reverts llvm/llvm-project#75958

I mistakenly included a commit from my local main after rebasing.
2024-01-04 16:53:14 +00:00
Sam Tebbs
8f8152091c
[Clang][SME] Add IsStreamingOrSVE2p1 (#75958)
This patch adds IsStreamingOrSVE2p1 to the applicable builtins and a
warning for when those builtins are not used in a streaming or sve2p1
function.
2024-01-04 16:50:31 +00:00