When `ptrauth_calls` is present but `ptrauth_init_fini` is not, the
compiler emits raw unsigned pointers in the `.init_array`/`.fini_array`
sections. Previously, the `__do_init`/`__do_fini` pointers, which are
explicitly added to these sections, were implicitly signed (due to the
presence of `ptrauth_calls`), while all the other pointers in the
sections were implicitly added by the compiler and thus unsigned. As a
result, the sections contained a mix of unsigned function pointers and
function pointers signed with the default signing schema.
This patch uses inline assembly for this particular case, so we can
manually specify that we do not want to sign the pointers.
Note that we cannot use `__builtin_ptrauth_strip` for this purpose since
its result is not a constant expression.
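For illustration, a minimal sketch of the inline-assembly approach,
assuming Clang targeting AArch64 ELF (the actual crtbegin.c code,
symbol names, and section directives may differ; Mach-O targets also
need an underscore prefix on the symbol):

```c
/* Hedged sketch, not the actual crtbegin.c source: register __do_init in
 * .init_array as a raw, unsigned pointer, bypassing the implicit ptrauth
 * signing a C-level initializer would get under ptrauth_calls.
 * Assumes Clang targeting AArch64 ELF. */
__attribute__((used)) static void __do_init(void) {}

#if __has_feature(ptrauth_calls) && !__has_feature(ptrauth_init_fini)
__asm__(".pushsection .init_array, \"aw\", %init_array\n\t"
        ".xword __do_init\n\t"   /* raw 64-bit pointer, no signature */
        ".popsection");
#else
/* The usual C-level registration; this pointer is implicitly signed
 * whenever ptrauth_calls is enabled. */
__attribute__((section(".init_array"), used))
static void (*__init_entry)(void) = __do_init;
#endif
```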
This reverts commit 0c0aa56cdcf1fe3970a5f3875db412530512fc07.
This time with the following fixes for buildbot failures:
- Add underscore prefixes to symbol names on Apple platforms.
- Modify the test so that it skips the crash tests on platforms where
they are not expected to pass:
- Platforms that implement FEAT_PAuth but not FEAT_FPAC (e.g.
Apple M1, Cortex-A78C)
- Platforms where DA key is disabled (e.g. older Linux kernels,
Linux kernels with PAC disabled, likely Windows)
Original commit message follows:
The emulated PAC runtime functions emulate the ARMv8.3a pointer
authentication instructions and are intended for use in heterogeneous
testing environments. For more information, see the associated RFC:
https://discourse.llvm.org/t/rfc-emulated-pac/85557
Reviewers: mstorsjo, pawosm-arm, atrosinenko
Reviewed By: atrosinenko
Pull Request: https://github.com/llvm/llvm-project/pull/148094
This reverts commits 5b1db59fb87b4146f827d17396f54ef30ae0dc40 and
f1c4df5b7bb79efb3e9be7fa5f8176506499d0a6, as well as the follow-up
"builtins: Speculative MSVC fix."
The failing tests need fixes which will take time to implement.
Most pacbti instructions are NOPs when the target does not have pacbti,
and are thus safe to execute, but bxaut is an undefined instruction
there. When we don't have pacbti (e.g. if we're compiling compiler-rt
with -mbranch-protection=standard in order to be forward-compatible
with pacbti while still working on targets without it), we need to use
separate aut and bx instructions.
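A hedged illustration of the difference (the real routines are
hand-written .S files; this assumes an Arm target and an assembler that
accepts PACBTI mnemonics, e.g. -march=armv8.1-m.main+pacbti):

```c
#if defined(__arm__)
/* Sketch only, not the actual compiler-rt source. On cores without PACBTI,
 * "aut" executes as a NOP, so this return sequence runs everywhere, whereas
 * a single "bxaut" would be an undefined instruction on such cores. */
__attribute__((naked)) void forward_compatible_return(void) {
  __asm__("aut r12, lr, sp\n\t" /* authenticate LR; NOP if PACBTI is absent */
          "bx  lr");            /* plain return, valid on every core */
}
#endif
```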
Currently, `ENABLE_BAREMETAL_AARCH64_FMV` is added to the builtin
defines for all baremetal targets even though it is only needed for
AArch64. This patch fixes this by adding it only for the aarch64 target.
AAPCS64 allows a compiler to choose any of X9-X15 for this purpose, and
says not to use X16 or X18, which is what GCC (and the previous
implementation) chose to use. The X18 register may need to be used by
the kernel in some circumstances, as specified by the platform ABI, so
it is generally an unwise choice. Simply choosing a different register
fixes the problem of this being broken on any platform that actually
follows the platform ABI (which is all of them except EABI, if I am
reading this Linux kernel bug correctly:
https://lkml2.uits.iu.edu/hypermail/linux/kernel/2001.2/01502.html). As
a side benefit, this also generates slightly better code and avoids
needing compiler-rt to be present. I did that by following the XCore
implementation instead of PPC (although in hindsight, following the
RISCV one might have been slightly more readable). That X18 is wrong to
use for this purpose has been known for many years (e.g.
https://www.mail-archive.com/gcc@gcc.gnu.org/msg76934.html), and it is
also known that fixing this to use one of the correct registers is not
an ABI break, since this only appears inside a translation unit. Some of
the other temporary registers (e.g. X9) are already reserved inside LLVM
for internal use as a generic temporary register in the prologue before
saving registers, while X15 was already used in rare cases as a scratch
register in the prologue as well, so X15 seemed the most logical choice
here.
Some of the aeabi functions were missing PACBTI support. The lack of it
resulted in exceptions at runtime if the running environment had PAC
and/or BTI enabled.
This patch adds that support. It involves adding PACBTI instructions,
depending on whether each of these features is enabled, plus saving and
restoring the PAC code that resides in r12. Some of the common code has
been put in preprocessor macros to reduce duplication, but not all of
it, especially since some of the register saving and restoring is very
specific to each context.
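A rough, hedged sketch of the macro approach (the real routines are
assembly, and the actual macro names and register lists differ; the
names below are made up for the example):

```c
/* Illustrative only: the general shape of a PACBTI-aware prologue/epilogue
 * for an AEABI routine, with the PAC code computed into r12 and saved across
 * the body of the routine. */
#if defined(__ARM_FEATURE_BTI_DEFAULT)
#define BTI_LANDING_PAD "bti\n\t"
#else
#define BTI_LANDING_PAD
#endif

#if defined(__ARM_FEATURE_PAC_DEFAULT)
#define PAC_PROLOGUE BTI_LANDING_PAD             \
                     "pac   r12, lr, sp\n\t"     /* compute PAC into r12 */  \
                     "push  {r12, lr}\n\t"       /* save PAC code with lr */
#define PAC_EPILOGUE "pop   {r12, lr}\n\t"       \
                     "aut   r12, lr, sp\n\t"     /* verify before returning */
#else
#define PAC_PROLOGUE BTI_LANDING_PAD "push {lr}\n\t"
#define PAC_EPILOGUE "pop  {lr}\n\t"
#endif
```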
Commit 75c3ff8c0b29f374d31ba99e51852f7f6851a6c8 inadvertently removed
some files related to the SME ABI routines from the build.
This patch fixes the issue by reintroducing the files to the build in
CMake.
The existing implementations, written in assembly, make use of unaligned
accesses for performance reasons. They are not compatible with strictly
aligned configurations, i.e. with `-mno-unaligned-access`.
If the functions are used in this scenario, an exception is raised due
to unaligned memory accesses.
This patch reintroduces vanilla implementations for these functions to
be used in strictly aligned configurations. The actual code is largely
based on the code from https://github.com/llvm/llvm-project/pull/77496.
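As a hedged illustration of the vanilla approach (not the actual code
from the patch), a strictly aligned configuration can fall back to a
plain byte-wise copy:

```c
#include <stddef.h>

/* Byte-wise copy: every access is naturally aligned, so it is safe under
 * -mno-unaligned-access, at the cost of performance. */
void *byte_copy(void *dest, const void *src, size_t n) {
  unsigned char *d = dest;
  const unsigned char *s = src;
  while (n--)
    *d++ = *s++;
  return dest;
}
```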
This changes the save/restore procedures to save/restore registers one
by one, to match the stack alignment for the ILP32E/LP64E ABIs, rather
than in the larger batches used by the conventional ABIs. The
implementations of the save routines are not tail-shared, to reduce the
number of instructions. I think this also helps code size, but I need to
check this again.
I would expect (but haven't measured) that the majority of functions
compiled for the ILP32E/LP64E ABIs will in fact use both callee-saved
registers, and therefore there are still savings to be had, but I think
those can come later, with more data (especially if those changes are
just to the instruction sequences we use to save the registers, rather
than the number and alignment of how this is done).
This is a potential break for all of the ILP32E/LP64E ABIs; we may
instead have to teach the compiler to emit the CFI information correctly
for the grouping we already have implemented (because that grouping
matches GCC). It depends on how intentional we think the grouping in the
original ILP32E/LP64E save/restore implementation was, and whether we
think we can fix that now.
On ARM64EC, function names and calls (but not address-taking or data
symbol references) use symbols prefixed with "#". Since this is a unique
behavior, introduce a new `FUNC_SYMBOL` macro instead of reusing
something like `SYMBOL_NAME`, which is also used for data symbols.
Based on patch by Billy Laws.
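A hedged sketch of the idea behind such a macro (the actual definitions
in compiler-rt's assembly headers differ; the `_M_ARM64EC` guard and the
`DATA_SYMBOL` name here are only illustrative):

```c
/* Illustrative only: function symbols on ARM64EC carry a "#" prefix, while
 * data symbols do not, so the two cases need separate macros. */
#if defined(_M_ARM64EC)
#define FUNC_SYMBOL(name) "#" name   /* e.g. FUNC_SYMBOL("memcpy") -> "#memcpy" */
#else
#define FUNC_SYMBOL(name) name
#endif

#define DATA_SYMBOL(name) name       /* data references never get the prefix */
```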
This patch introduces a new optional CMake flag:
COMPILER_RT_EXCLUDE_LIBC_PROVIDED_ARM_AEABI_BUILTINS
When enabled, this flag excludes the following ARM AEABI memory function
implementations from the compiler-rt build:
__aeabi_memcmp
__aeabi_memset
__aeabi_memcpy
__aeabi_memmove
These functions are already provided by standard C libraries like glibc,
newlib, and picolibc, so excluding them avoids duplicate symbol
definitions and reduces unnecessary code duplication.
Note:
- libgcc does not define the __aeabi_* functions that overlap with those
provided by the C library. Enabling this option makes compiler-rt behave
consistently with libgcc.
- This prevents duplicate symbol errors when linking, particularly in
bare-metal configurations where compiler-rt is linked first.
- This flag is OFF by default, meaning all AEABI memory builtins will
still be built unless explicitly excluded.
This change is useful for environments where libc provides these runtime
routines, supporting more minimal, conflict-free builds.
On 64-bit platforms, libgcc doesn't ship with __clzsi2, so __builtin_clz
gets lowered to __clzdi2. A check already exists for GCC, but as of
commit 8210ca019839fc5430b3a95d7caf5c829df3232a clang also lowers
__builtin_clz to __clzdi2 on sparc64.
Update the check so that building __clzdi2 with clang/sparc64 also
works.
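For context, a hedged sketch of what a generic `__clzdi2`-style fallback
computes (not the actual compiler-rt implementation):

```c
/* Count leading zeros of a 64-bit value; this is the operation behind the
 * __clzdi2 libcall that __builtin_clz may be lowered to on 64-bit targets
 * without a suitable native instruction. */
int clzdi2_sketch(unsigned long long a) {
  int n = 0;
  if (a == 0)
    return 64;                      /* conventional result for zero input */
  while (!(a & (1ULL << 63))) {     /* shift until the top bit is set */
    a <<= 1;
    ++n;
  }
  return n;
}
```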
This makes sure that COMPILER_RT_ARMHF_TARGET is set properly for
targets without a specific "armhf" target name, such as armv7 windows.
This fixes the builtins test comparesf2_test.c on Windows on armv7.
I'm trying to put together an LLVM built toolchain (including LLVM libc)
targeting UEFI; currently I get an error saying "Unknown target". This
PR enables compiling compiler-rt for UEFI.
- _Float16 is now accepted by Clang.
- The half IR type is fully handled by the backend.
- These values are passed in FP registers and converted to/from float around
each operation.
- Compiler-rt conversion functions are now built for s390x, including the
previously missing extendhfdf2, which was added (see the example below).
Fixes #50374
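For illustration, a widening conversion like the following ends up
calling the compiler-rt helper (`__extendhfdf2`) on targets without a
native conversion instruction; this is a hedged example, not code from
the patch:

```c
/* With _Float16 accepted by Clang, widening to double is lowered to a call
 * to __extendhfdf2 when the target has no native conversion instruction. */
double widen_half(_Float16 x) {
  return (double)x;
}
```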
Summary:
Previously, we removed the special handling for the code object version
global. I erroneously thought that this meant we could get rid of this
weird `-Xclang` option. However, this also emits an LLVM IR module flag,
which will then cause linking issues.
The try-compile mechanism requires that `CMAKE_REQUIRED_FLAGS` is a
space-separated string instead of a list of flags. The original code
expanded `BUILTIN_FLAGS` into `CMAKE_REQUIRED_FLAGS` as a
space-separated string and then would overwrite `CMAKE_REQUIRED_FLAGS`
with `TARGET_${arch}_CFLAGS` prepended to the unexpanded
`BUILTIN_CFLAGS_${arch}`. This resulted in the first two arguments being
passed into the try-compile invocation, but dropping the other arguments
listed in `BUILTIN_CFLAGS_${arch}`.
This patch appends `TARGET_${arch}_CFLAGS` and `BUILTIN_CFLAGS_${arch}` to
`CMAKE_REQUIRED_FLAGS` before expanding `CMAKE_REQUIRED_FLAGS` as a
space-separated string. This passes any pre-set required flags, in addition
to all of the builtin and target flags, to the Float16 detection.
Summary:
When we were first porting to COV5, this led to some ABI issues due to
a change in how we looked up the work group size. Bitcode libraries
relied on the builtins to emit code, but this was changed between
versions. This prevented the bitcode libraries, like OpenMP or libc,
from being used for both COV4 and COV5. The solution was to have this
'none' functionality which effectively emitted code that branched off of
a global to resolve to either version.
This isn't a great solution because it forced every TU to have this
variable in it. The patch in
https://github.com/llvm/llvm-project/pull/131033 removed support for
COV4 from OpenMP, which was the only consumer of this functionality.
Other users like HIP and OpenCL did not use this because they linked the
ROCm Device Library directly, which has its own handling (the name was
borrowed from it, after all).
So, now that we don't need to worry about backward compatibility with
COV4, we can remove this special handling. Users can still emit COV4
code, this simply removes the special handling used to make the OpenMP
device runtime bitcode version agnostic.
In the soft floating-point ABI, this function takes its double argument
as a pair of registers, r0 and r1.
The ordering of these two registers follows the endianness rules,
therefore the register on which the bit flipping must happen depends on
the endianness.
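A hedged C-level sketch of the idea (the real builtin follows the AEABI
calling convention; the struct representation and names here are only
for illustration):

```c
#include <stdint.h>

/* The double arrives as two 32-bit halves; negation flips the sign bit,
 * which lives in the half holding the high word of the double. */
typedef struct { uint32_t r0, r1; } regpair;

regpair negate_double(regpair d) {
#if defined(__ARMEB__)            /* big endian: high word is in r0 */
  d.r0 ^= UINT32_C(0x80000000);
#else                             /* little endian: high word is in r1 */
  d.r1 ^= UINT32_C(0x80000000);
#endif
  return d;
}
```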
These cannot be detected by reading the ID_AA64ISAR1_EL1 register since
their corresponding bitfields are hidden. Additionally, the instructions
that these features enable are unusable from EL0.
ACLE: https://github.com/ARM-software/acle/pull/382
This patch makes the Arm builtins aware of endianness in VMOVs.
Before this patch, the functions' definitions assumed little-endian
register ordering, which made any program compiled for big-endian
targets incorrect.
Our platform has some constraints that allow us to make assumptions that
aren't generally applicable to other platforms. We keep an entirely separate
.s file for the routines.
Avoid issues caused by the `.subsections_via_symbols` directive by using
numbered labels instead of named labels for the branch locations.
This reverts commit 4032ce3413d0230b0ccba1203536f9cb35e5c3b5.
Support platform-specific mangling to avoid the compiler emitting a call
to a function that is mangled differently than the definition in the
runtime library.
According to the conversation
[here](https://github.com/llvm/llvm-project/pull/119414#issuecomment-2536495859),
some platforms don't enable `__arm_cpu_features` with a global
constructor, but rather do so lazily when called from the FMV resolver.
PR #119414 removed the CMake guard that checks whether the targeted
platform is baremetal or supports sys/auxv. Without this check, the
routines rely on `__arm_cpu_features` being initialised when it may not
be, depending on the platform.
This PR simply avoids building the SME routines for those platforms for
now.
When #92921 added the `__arm_get_current_vg` functionality, it used the
FMV feature bits mechanism rather than the mechanism that was previously
added for SME, which called `getauxval` on Linux platforms or
`__aarch64_sme_accessible` as required for baremetal libraries. It is
better to always use `__aarch64_cpu_features`.
For baremetal we still need to rely on `__arm_sme_accessible` to
initialise the struct.
This patch simplifies the code in two different ways (sketched below):
* When SVE is available, return `cntd` directly to avoid the need for a
bitfield insert.
* When SME is available, check the PSTATE.SM bit of `SVCR` directly
rather than calling `__arm_sme_state`.
Given that FMV support is required for the SME builtins to be built, the
FMV constructor as defined in:
compiler-rt/lib/builtins/cpu_model/aarch64.c
already initialises the feature bits, so there's no need to create
another one.
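Hedged sketches of the two checks mentioned above (not the actual
compiler-rt routine; they assume an AArch64 toolchain whose assembler
accepts SVE/SME mnemonics):

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
/* SVE path: cntd yields the vector length in 64-bit units, i.e. VG. */
uint64_t vg_via_cntd(void) {
  uint64_t vg;
  __asm__("cntd %0" : "=r"(vg));
  return vg;
}
#endif

#if defined(__ARM_FEATURE_SME)
/* SME path: PSTATE.SM is bit 0 of the SVCR system register. */
uint64_t in_streaming_mode(void) {
  uint64_t svcr;
  __asm__("mrs %0, SVCR" : "=r"(svcr));
  return svcr & 1;
}
#endif
```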
It turns out we were not passing -m32 to the check_c_source_compiles()
invocation, since `CMAKE_REQUIRED_FLAGS` needs to be a space-separated
string and we were passing a ;-separated CMake list, which CMake appears
to parse as 'ignore all arguments beyond the first'.
Fix this by transforming the list into a command line first.
With this change, Clang 17 no longer claims to support _Float16 for
i386.