C99 introduced macros of the form `INTn_C(v)` which expand to a signed
or unsigned integer constant with the specified value `v` and the type
`int_leastN_t`. Clang's initial implementation of these macros used
token pasting to form the integer constants, but this means that users
cannot define a macro named `L`, `U`, `UL`, etc before including
`<stdint.h>` (in freestanding mode, where Clang's header is being used)
because that could form invalid token pasting results. The new
definitions now use the predefined `__INTn_C` macros instead of using
token pasting. This matches the behavior of GCC.
Fixes#85995
StringLiteral is used as internal data of EmbedExpr and we directly use
it as an initializer if a single EmbedExpr appears in the initializer
list of a char array. It is fast and convenient, but it is causing
problems when string literal character values are checked because #embed
data values are within a range [0-2^(char width)] but ordinary
StringLiteral is of maybe signed char type.
This PR introduces new kind of StringLiteral to hold binary data coming
from an embedded resource to mitigate these problems. The new kind of
StringLiteral is not assumed to have signed char type. The new kind of
StringLiteral also helps to prevent crashes when trying to find
StringLiteral token locations since these simply do not exist for binary
data.
Fixes https://github.com/llvm/llvm-project/issues/119256
As discussed in [1], introduce BPF instructions with load-acquire and
store-release semantics under -mcpu=v4. Define 2 new flags:
BPF_LOAD_ACQ 0x100
BPF_STORE_REL 0x110
A "load-acquire" is a BPF_STX | BPF_ATOMIC instruction with the 'imm'
field set to BPF_LOAD_ACQ (0x100).
Similarly, a "store-release" is a BPF_STX | BPF_ATOMIC instruction with
the 'imm' field set to BPF_STORE_REL (0x110).
Unlike existing atomic read-modify-write operations that only support
BPF_W (32-bit) and BPF_DW (64-bit) size modifiers, load-acquires and
store-releases also support BPF_B (8-bit) and BPF_H (16-bit). An 8- or
16-bit load-acquire zero-extends the value before writing it to a 32-bit
register, just like ARM64 instruction LDAPRH and friends.
As an example (assuming little-endian):
long foo(long *ptr) {
return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
}
foo() can be compiled to:
db 10 00 00 00 01 00 00 r0 = load_acquire((u64 *)(r1 + 0x0))
95 00 00 00 00 00 00 00 exit
opcode (0xdb): BPF_ATOMIC | BPF_DW | BPF_STX
imm (0x00000100): BPF_LOAD_ACQ
Similarly:
void bar(short *ptr, short val) {
__atomic_store_n(ptr, val, __ATOMIC_RELEASE);
}
bar() can be compiled to:
cb 21 00 00 10 01 00 00 store_release((u16 *)(r1 + 0x0), w2)
95 00 00 00 00 00 00 00 exit
opcode (0xcb): BPF_ATOMIC | BPF_H | BPF_STX
imm (0x00000110): BPF_STORE_REL
Inline assembly is also supported.
Add a pre-defined macro, __BPF_FEATURE_LOAD_ACQ_STORE_REL, to let
developers detect this new feature. It can also be disabled using a new
llc option, -disable-load-acq-store-rel.
Using __ATOMIC_RELAXED for __atomic_store{,_n}() will generate a "plain"
store (BPF_MEM | BPF_STX) instruction:
void foo(short *ptr, short val) {
__atomic_store_n(ptr, val, __ATOMIC_RELAXED);
}
6b 21 00 00 00 00 00 00 *(u16 *)(r1 + 0x0) = w2
95 00 00 00 00 00 00 00 exit
Similarly, using __ATOMIC_RELAXED for __atomic_load{,_n}() will generate
a zero-extending, "plain" load (BPF_MEM | BPF_LDX) instruction:
int foo(char *ptr) {
return __atomic_load_n(ptr, __ATOMIC_RELAXED);
}
71 11 00 00 00 00 00 00 w1 = *(u8 *)(r1 + 0x0)
bc 10 08 00 00 00 00 00 w0 = (s8)w1
95 00 00 00 00 00 00 00 exit
Currently __ATOMIC_CONSUME is an alias for __ATOMIC_ACQUIRE. Using
__ATOMIC_SEQ_CST ("sequentially consistent") is not supported yet and
will cause an error:
$ clang --target=bpf -mcpu=v4 -c bar.c > /dev/null
bar.c:1:5: error: sequentially consistent (seq_cst) atomic load/store is
not supported
1 | int foo(int *ptr) { return __atomic_load_n(ptr, __ATOMIC_SEQ_CST); }
| ^
...
Finally, rename those isST*() and isLD*() helper functions in
BPFMISimplifyPatchable.cpp based on what the instructions actually do,
rather than their instruction class.
[1]
https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/
Add an option similar to the -qtarget option in XL to allow the user to
say they want to be able to run the generated program on an older
version of the LE environment. This option will do two things:
- set the `__TARGET_LIBS` macro so the system headers exclude newer
interfaces when targeting older environments
- set the arch level to match the minimum arch level for that older
version of LE. It doesn't happen right now since all of the supported LE
versions have a the same minimum ach level. So the option doesn't change
this yet.
The user can specify three different kinds of arguments:
1. -mzos-target=zosv*V*r*R* - where V & R are the version and release
2. -mzos-target=0x4vrrmmmm - v, r, m, p are the hex values for the
version, release, and modlevel
3. -mzos-target=current - uses the latest version of LE the system
headers have support for
The `-fcf-protection` flag is now also used to enable CFI features for
the RISC-V target, so it's not suitable to define `__CET__` solely based
on the flag anymore. This patch moves the definition of the `__CET__`
macro into X86 target hook, so only X86 targets with the
`-fcf-protection` flag would enable the `__CET__` macro.
See https://github.com/llvm/llvm-project/pull/109784 and
https://github.com/llvm/llvm-project/pull/112477 for the adoption
of `-fcf-protection` flag for RISC-V targets.
The `-fcf-protection=[full|return]` flag enables shadow stack
implementation based on RISC-V Zicfiss extension. This patch adds the
`__riscv_shadow_stack` predefined macro to preprocessing when such a
shadow stack implementation is enabled.
When bytes with negative signed char values appear in the data, make
sure to use raw bytes from the data string when preprocessing, not char
values.
Fixes https://github.com/llvm/llvm-project/issues/102798
Previously, when selecting a Single Precision FPU, LLVM would ensure all
elements of the Candidate FPU matched the InputFPU that was given.
However, for cases such as Cortex-R52, there are FPU options where not
all fields match exactly, for example NEON Support or Restrictions on
the Registers available.
This change ensures that LLVM can select the FPU correctly, removing the
requirement for Neon Support and Restrictions for the Candidate FPU to
be the same as the InputFPU.
Before this change, we would set this to Clang's default of {64, 64}.
Now, we explicitly set it to {256, 64} which matches our ARM behavior
for ARMv8 targets and GCC's behavior for AArch64 targets.
Per the feedback we got, we’d like to switch m[no-]avx10.2 to alias of
512 bit options and disable m[no-]avx10.1 due to they were alias of 256
bit options.
We also change -mno-avx10.[1,2]-512 to alias of 256 bit options to
disable both 256 and 512 instructions.
Since `__STDC_NO_THREADS__` is a reserved identifier,
- If `MSVC version < 17.9`
- C version < C11(201112L)
- When `<threads.h>` is unavailable `!__has_include(<threads.h>)` is
`__has_include` is defined.
Closes#115529
Currently, `__has_builtin` will return true when passed a builtin that
is only supported on the aux target. I found this when `__has_builtin`
was called with an X86 builtin but the current target was SPIR-V.
We should instead return false for aux builtins.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This patch adds support for the next-generation arch15
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch15 as host processor.
- Assembler/disassembler support for new instructions.
- Exploitation of new instructions for code generation.
- New vector (signed|unsigned|bool) __int128 data types.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10305.
Note: No currently available Z system supports the arch15
architecture. Once new systems become available, the
official system name will be added as supported -march name.
Upstream LLVM implicitly enables NEON for any ARMv7 target.
Many platform ABIs with an ARMv7 baseline also include NEON in that,
this is the case on e.g. Windows and iOS. On Linux, however, things are
not quite as clearly defined. Some distributions target an ARMv7
baseline without NEON available (this is the case for e.g. Debian/Ubuntu
for the "armhf" architecture).
To achieve this, Debian/Ubuntu patch LLVM downstream to make ARMv7 only
implicitly enable VPFv3-D16, not NEON - see [1].
That patch has the (unintended) effect that NEON no longer is available
by default for Windows/ARMv7 and iOS/ARMv7.
In practice, when compiling C for Windows/ARMv7 with Debian patched
Clang, NEON actually still is available, but not when compiling assembly
files. This is due to ARM::getARMCPUForArch (in
llvm/lib/TargetParser/ARMTargetParser.cpp) returning "cortex-a9" for
Windows. This difference, between C and assembly, is due to how
getARMTargetCPU is called in getARMTargetFeatures (in
clang/lib/Driver/ToolChains/Arch/ARM.cpp) to get defaults, only when
ForAS is not set - see [2].
There is an existing case in getARMTargetFeatures, for Android, which
explicitly enables NEON when targeting ARM >= v7. As Windows and iOS
have NEON as part of their ABI baseline just like Android does these
days (see [3] for where this default was added for Android), add the
implicit default in a similar way.
However, first do the general lookup of getARMTargetCPU (unless ForAS);
this makes sure that we get the same default CPU as before ("cortex-a9"
for Windows and "swift" for the "armv7s" architecture on Darwin).
[1] https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/19/debian/patches/clang-arm-default-vfp3-on-armv7a.patch?ref_type=heads
[2] b8baa2a913
[3] d0fbef9c75
The 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm,
which allows the sve-bitperm instructions to run in streaming mode.
It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The
sve2-bitperm feature is now an alias for sve-bitperm and sve2.
A new feature flag sve-bitperm is added to reflect the change that the
instructions under FEAT_SVE_BitPerm are supported if:
on non streaming mode with FEAT_SVE2 and FEAT_SVE_BitPerm or
in streaming mode with FEAT_SME and FEAT_SSVE_BitPerm
arm-apple-none-macho uses DarwinTargetInfo which provides several Apple
specific macros. arm64-apple-none-macho however just uses the generic
AArch64leTargetInfo and doesn't get any of those macros. It's not clear
if everything from DarwinTargetInfo is desirable for
arm64-apple-none-macho, so make an AppleMachOTargetInfo to hold the
generic Apple macros and a few other basic things.
Embedded development often needs to use a different C standard library,
replacing the existing one normally passed as -internal-externc-isystem.
This works fine for an apple-macos target, but apple-none-macho doesn't
work because the MachO driver doesn't implement
AddClangSystemIncludeArgs to add the resource directory as
-internal-isystem like most other drivers do. Move most of the search
path logic from Darwin and DarwinClang down into an AppleMachO toolchain
between the MachO and Darwin toolchains.
Also define __MACH__ for apple-none-macho, as Swift expects all MachO
targets to have that defined.
Embedded development often needs to use a different C standard library,
replacing the existing one normally passed as -internal-externc-isystem.
This works fine for an apple-macos target, but apple-none-macho doesn't
work because the MachO driver doesn't implement
AddClangSystemIncludeArgs to add the resource directory as
-internal-isystem like most other drivers do. Move most of the search
path logic from Darwin and DarwinClang down into an AppleMachO toolchain
between the MachO and Darwin toolchains.
Also define \_\_MACH__ for apple-none-macho, as Swift expects all MachO
targets to have that defined.
This patch introduces support for the Hexagon V79 architecture. It
includes instruction formats, definitions, encodings, scheduling
classes, and builtins/intrinsics. It also adds missing Hexagon v73
builtins to clang.
This patch introduces support for the Hexagon V75 architecture. It
includes instruction formats, definitions, encodings, scheduling
classes, and builtins/intrinsics.
Version of SYCL was changed according to the latest agreement:
The lower 2 digits are not formally specified, but we plan to use these
to identify the month in which we submit the specification for
ratification, which is similar to the C++ macro __cplusplus.
Since the SYCL 2020 specification was submitted for ratification in
December of 2020, the macro's value is now 202012 for SYCL 2020.
see PR for details
https://github.com/KhronosGroup/SYCL-Docs/pull/634
The meaning of `__ARM_NEON_SVE_BRIDGE` was changed here:
https://github.com/ARM-software/acle/pull/362
Such that it should be defined to `1` if the `arm_neon_sve_bridge.h`
header file is available, which is the case for Clang.
Make the new diagnostic group a subgroup of the following diagnostic
groups:
-Wpre-c23-compat
-Wgnu-zero-variadic-macro-arguments
-Wc++20-extensions
-Wc23-extensions
This change is needed as 5231005193afb8db01afe9a8a1aa308d25f60ba1 made
it impossible to use -Wno-gnu-zero-variadic-macro-argumentsis to silence
the warning.
rdar://139234984
Two options for clang
-mdiv32: Use div.w[u] and mod.w[u] instructions with input not
sign-extended.
-mno-div32: Do not use div.w[u] and mod.w[u] instructions with input not
sign-extended.
The default is -mno-div32.
The standard mandates that this returns the width of the type, which is
the number of bits in the value. For bool, that's required to be `1`
explicitly.
Fixes#117348
We have defined `__riscv_cpu_model` variable in #101449. It contains
`mvendorid`, `marchid` and `mimpid` fields which are read via system
call `sys_riscv_hwprobe`.
We can support `__builtin_cpu_is` via comparing values in compiler's
CPU definitions and `__riscv_cpu_model`.
This depends on #116202.
Reviewers: lenary, BeMg, kito-cheng, preames, lukel97
Reviewed By: lenary
Pull Request: https://github.com/llvm/llvm-project/pull/116231
Two options for clang
-mld-seq-sa: Do not generate load-load barrier instructions (dbar 0x700)
-mno-ld-seq-sa: Generate load-load barrier instructions (dbar 0x700)
The default is -mno-ld-seq-sa