9642 Commits

Author SHA1 Message Date
Thurston Dang
928cad49be
Revert "[ubsan] Connect -fsanitize-skip-hot-cutoff to LowerAllowCheckPass<cutoffs>" (#125032)
Reverts llvm/llvm-project#124857 due to buildbot breakage
(https://lab.llvm.org/buildbot/#/builders/46/builds/11310)
2025-01-29 22:03:05 -08:00
Thurston Dang
dccd271127
[ubsan] Connect -fsanitize-skip-hot-cutoff to LowerAllowCheckPass<cutoffs> (#124857)
This adds the plumbing between -fsanitize-skip-hot-cutoff (introduced in
https://github.com/llvm/llvm-project/pull/121619) and
LowerAllowCheckPass<cutoffs> (introduced in
https://github.com/llvm/llvm-project/pull/124211).
    
The net effect is that -fsanitize-skip-hot-cutoff now combines the
functionality of -ubsan-guard-checks and
-lower-allow-check-percentile-cutoff (though this patch does not remove
those yet), and generalizes the latter to allow per-sanitizer cutoffs.
    
Note: this patch replaces Intrinsic::allow_ubsan_check's
SanitizerHandler parameter with SanitizerOrdinal; this is necessary
because the hot cutoffs are specified in terms of SanitizerOrdinal
(e.g., null, alignment), not SanitizerHandler (e.g., TypeMismatch).
Likewise, CodeGenFunction::EmitCheck is changed to emit
allow_ubsan_check() for each individual check.

---------

Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
Co-authored-by: Vitaly Buka <vitalybuka@google.com>
2025-01-29 21:03:26 -08:00
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
Oliver Stannard
e9c2e0acd7
[AArch64] Match GCC behaviour for zero-size structs (#124760)
We had a test claiming that this empty struct type consumes a register
slot when passing it to a function with GCC, but that does not appear to
be the case, at least with GCC versions going back to 4.8.

This also caused a miscompilation when passing one of these structs to a
variadic function, but it turned out that our implementation of `va_arg`
matches GCC's ABI, so the one change fixes both bugs.
2025-01-29 15:02:37 +00:00
Fraser Cormack
1ac3665e66
[clang] Restrict the use of scalar types in vector builtins (#119423)
This commit restricts the use of scalar types in vector math builtins,
particularly the `__builtin_elementwise_*` builtins.

Previously, small scalar integer types would be promoted to `int`, as
per the usual conversions. This would silently do the wrong thing for
certain operations, such as `add_sat`, `popcount`, `bitreverse`, and
others. Similarly, since unsigned integer types were promoted to `int`,
something like `add_sat(unsigned char, unsigned char)` would perform a
*signed* operation.

With this patch, promotable scalar integer types are not promoted to
int, and are kept intact. If any of the types differ in the binary and
ternary builtins, an error is issued. Similarly an error is issued if
builtins are supplied integer types of different signs. Mixing enums of
different types in binary/ternary builtins now consistently raises an
error in all language modes.

This brings the behaviour surrounding scalar types more in line with
that of vector types. No change is made to vector types, which are both
not promoted and whose element types must match.

Fixes #84047.

RFC:
https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725
2025-01-29 09:40:04 +00:00
Stephen Tozer
548ecde428 Add extra explicit triple to fix errors from #110102
Attempted fix for errors observed on:
https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8724466517233166081/overview

Following the previous fixes, one test that needed an explicit triple
was missed; this commit adds that triple.
2025-01-28 22:51:30 +00:00
Artem Belevich
310f55875f
[CUDA] Make target intrinsics work with ptx 8.7 (#124818)
Fixes build break with CUDA-12.8 introduced in #123398
2025-01-28 12:14:46 -08:00
Stephen Tozer
de9b0ddedc Add explicit triple to fix errors from #110102
Attempted fix for errors observed on:
https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8724497761651154417/overview

Several tests that required an explicit triple had none present; this
patch adds those triples.
2025-01-28 19:34:41 +00:00
Stephen Tozer
822f74a911
[Clang] Cleanup docs and comments relating to -fextend-variable-liveness (#124767)
This patch contains a number of changes relating to the above flag;
primarily it updates comment references to the old flag names,
"-fextend-lifetimes" and "-fextend-this-ptr" to refer to the new names,
"-fextend-variable-liveness[={all,this}]". These changes are all NFC.

This patch also removes the explicit -fextend-this-ptr-liveness flag
alias, and shortens the help-text for the main flag; these are both
changes that were meant to be applied in the initial PR (#110000), but
due to some user-error on my part they were not included in the merged
commit.
2025-01-28 18:25:32 +00:00
Thurston Dang
ef92e6b99f
[BoundsChecking] Update ubsantrap to use GuardKind (#124613)
This change makes it consistent with other uses of ubsantrap.

This also updates the tests. Notably, BoundsChecking/runtimes.ll had guard=3 which passed only because the method of calculating the parameter (`IRB.GetInsertBlock()->getParent()->size()`) happened to give the same answer.
2025-01-28 08:52:31 -08:00
Stephen Tozer
8ad9e1ecb7 [Clang] Fix use of deprecated method and missing triple
Fixes two buildbot errors caused by 4424c44c (#110102):

The first error, seen on some sanitizer bots:
https://lab.llvm.org/buildbot/#/builders/51/builds/9901

The initial commit used the deprecated getDeclaration intrinsic instead
of the non-deprecated getOrInsert- equivalent. This patch trivially
updates the code in question to use the new intrinsic.

The second error, seen on the clang-armv8-quick bot:
https://lab.llvm.org/buildbot/#/builders/154/builds/10983

One of the tests depends on a particular triple to get the exact output
expected by the test, but did not specify this triple; this patch adds
the triple in question.
2025-01-28 13:21:41 +00:00
Wolfgang Pieb
4424c44c8c [Clang] Add fake use emission to Clang with -fextend-lifetimes (#110102)
Following the previous patch which adds the "extend lifetimes" flag
without (almost) any functionality, this patch adds the real feature by
allowing Clang to emit fake uses. These are emitted as a new form of cleanup,
set for variable addresses, which just emits a fake use intrinsic when the
variable falls out of scope. The code for achieving this is simple, with most
of the logic centered on determining whether to emit a fake use for a given
address, and on ensuring that fake uses are ignored in a few cases.

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2025-01-28 12:30:31 +00:00
Momchil Velikov
db6fa74dfe
[AArch64] Implement FP8 Neon reinterpret intrinsics (#120476) 2025-01-28 11:06:24 +00:00
Nikita Popov
1295aa2e81
[Clang] Add -fwrapv-pointer flag (#122486)
GCC supports three flags related to overflow behavior:
 * `-fwrapv`: Makes signed integer overflow well-defined.
 * `-fwrapv-pointer`: Makes pointer overflow well-defined.
* `-fno-strict-overflow`: Implies `-fwrapv -fwrapv-pointer`, making both
signed integer overflow and pointer overflow well-defined.

Clang currently only supports `-fno-strict-overflow` and `-fwrapv`, but
not `-fwrapv-pointer`.

This PR proposes to introduce `-fwrapv-pointer` and adjust the semantics
of `-fwrapv` to match GCC.

This allows signed integer overflow and pointer overflow to be
controlled independently, while `-fno-strict-overflow` still exists to
control both at the same time (and that option is consistent across GCC
and Clang).
2025-01-28 09:57:00 +01:00
Chandler Carruth
b968fd9502
[StrTable] Mechanically convert NVPTX builtins to use TableGen (#122873)
This switches them to use tho common TableGen layer, extending it to
support the missing features needed by the NVPTX backend.

The biggest thing was to build a TableGen system that computes the
cumulative SM and PTX feature sets the same way the macros did. That's
done with some string concatenation tricks in TableGen, but they worked
out pretty neatly and are very comparable in complexity to the macro
version.

Then the actual defines were mapped over using a very hacky Python
script. It was never productionized or intended to work in the future,
but for posterity:

https://gist.github.com/chandlerc/10bdf8fb1312e252b4a501bace184b66

Last but not least, there was a very odd "bug" in one of the converted
builtins' prototype in the TableGen model: it didn't handle uses of `Z`
and `U` both as *qualifiers* of a single type, treating `Z` as its own
`int32_t` type. So my hacky Python script converted `ZUi` into two
types, an `int32_t` and an `unsigned int`. This produced a very wrong
prototype. But the tests caught this nicely and I fixed it manually
rather than trying to improve the Python script as it occurred in
exactly one place I could find.

This should provide direct benefits of allowing future refactorings to
more directly leverage TableGen to express builtins more structurally
rather than textually. It will also make my efforts to move builtins to
string tables significantly more effective for the NVPTX backend where
the X-macro approach resulted in *significantly* less efficient string
tables than other targets due to the long repeated feature strings.
2025-01-27 22:45:37 -08:00
Momchil Velikov
f75860f895
[AArch64] Implement NEON FP8 intrinsics for fused multiply-add (#123615)
This patch adds the following intrinsics:

* Fused multiply-add non-indexed

float16x8_t vmlalbq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t,
fpm_t)
float16x8_t vmlaltq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t,
fpm_t)
        
float32x4_t vmlallbbq_f32_mf8_fpm(float32x4_t, mfloat8x16_t,
mfloat8x16_t, fpm_t)
float32x4_t vmlallbtq_f32_mf8_fpm(float32x4_t, mfloat8x16_t,
mfloat8x16_t, fpm_t)
float32x4_t vmlalltbq_f32_mf8_fpm(float32x4_t, mfloat8x16_t,
mfloat8x16_t, fpm_t)
float32x4_t vmlallttq_f32_mf8_fpm(float32x4_t, mfloat8x16_t,
mfloat8x16_t, fpm_t)

* Floating-point multiply-add long to half-precision (vector, by
element)

float16x8_t vmlalbq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vmlalbq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vmlaltq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vmlaltq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
    
* Floating-point multiply-add long-long to single-precision (vector, by
element)

float32x4_t vmlallbbq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallbbq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallbtq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallbtq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlalltbq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlalltbq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallttq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallttq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
2025-01-28 00:38:44 +00:00
Momchil Velikov
804b81d39f
[AArch64] Add FP8 Neon intrinsics for dot-product (#123613)
This patch adds the following intrinsics:

float16x4_t vdot_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn, mfloat8x8_t
vm, fpm_t fpm)
float16x8_t vdotq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, fpm_t fpm)
    
float16x4_t vdot_lane_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x4_t vdot_laneq_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vdotq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vdotq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
    
float32x2_t vdot_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn, mfloat8x8_t
vm, fpm_t fpm)
float32x4_t vdotq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, fpm_t fpm)

float32x2_t vdot_lane_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x2_t vdot_laneq_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vdotq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vdotq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
2025-01-27 21:14:16 +00:00
Momchil Velikov
99bd2e3f12
[AArch64] Add Neon FP8 conversion intrinsics (#123612)
The patch adds the following intrinsics:

    bfloat16x8_t vcvt1_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    bfloat16x8_t vcvt1_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    bfloat16x8_t vcvt1_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    float16x8_t vcvt1_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    float16x8_t vcvt1_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    float16x8_t vcvt2_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    float16x8_t vcvt2_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    float16x8_t vcvt1_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    float16x8_t vcvt2_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
mfloat8x8_t vcvt_mf8_f32_fpm(float32x4_t vn, float32x4_t vm, fpm_t fpm)
mfloat8x16_t vcvt_high_mf8_f32_fpm(mfloat8x8_t vd, float32x4_t vn,
float32x4_t vm, fpm_t fpm)
    
mfloat8x8_t vcvt_mf8_f16_fpm(float16x4_t vn, float16x4_t vm, fpm_t fpm)
mfloat8x16_t vcvtq_mf8_f16_fpm(float16x8_t vn, float16x8_t vm, fpm_t
fpm)

Co-Authored-By: Caroline Concatto <caroline.concatto@arm.com>
2025-01-27 17:32:47 +00:00
Momchil Velikov
87103a016f
[AArch64] Implement NEON FP8 vectors as VectorType (#123603)
Reimplement Neon FP8 vector types using attribute `neon_vector_type`
instead of having them as builtin types.
This allows to implement FP8 Neon intrinsics without the need to add
special cases for these types when using `__builtin_shufflevector`
or bitcast (using C-style cast operator) between vectors, both
extensively used in the generated code in `arm_neon.h`.
2025-01-27 10:41:53 +00:00
Alexandros Lamprineas
474f5d2aef
[FMV][AArch64] Remove features predres and ls64. (#124266)
These cannot be detected by reading the ID_AA64ISAR1_EL1 register since
their corresponding bitfields are hidden. Additionally the instructions
that these features enable are unusable from EL0.

ACLE: https://github.com/ARM-software/acle/pull/382
2025-01-24 17:22:27 +00:00
Phoebe Wang
ee2722fc88
[X86][AVX10.2-BF16] Remove [NE]P from intrinsic and instruction name (#123335)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2025-01-24 15:49:28 +08:00
Phoebe Wang
24f177df61
[X86][AVX10.2-BF16] Update VCOMISBF16 intrinsics and instructions (#123307)
- Add `I` to intrinsics and instructions
- Add `_` before sbf16 in intrinsics

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2025-01-24 08:37:29 +08:00
Sam Elliott
33c4407471
[RISCV] Support cR Inline Asm Constraint (#124174)
This denotes RVC-compatible GPR Pairs, which are used by the Zclsd
extension.

C API PR: riscv-non-isa/riscv-c-api-doc#102
2025-01-23 16:19:19 -08:00
Alexandros Lamprineas
5a7d92f7a0
[NFC] Remove invalid features from test and autogenerate checks. (#124130) 2025-01-23 17:55:03 +00:00
Mikołaj Piróg
25653e558c
[AVX10.2] Update convert chapter intrinsic and mnemonics names (#123656)
Intel spec for avx10.2
(https://cdrdv2.intel.com/v1/dl/getContent/828965) has been updated.
This PR changes relevant names from the "AVX10 CONVERT INSTRUCTIONS"
chapter .
2025-01-23 22:23:56 +08:00
Phoebe Wang
4f40b07533
[X86][AVX10.2-SATCVT][NFC] Remove NE from intrinsic and instruction name (#123275)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2025-01-22 22:53:47 +08:00
Oliver Stannard
c4ef805b0b
[Clang] Re-write codegen for atomic_test_and_set and atomic_clear (#121943)
Re-write the sema and codegen for the atomic_test_and_set and
atomic_clear builtin functions to go via AtomicExpr, like the other
atomic builtins do. This simplifies the code, because AtomicExpr already
handles things like generating code for to dynamically select the memory
ordering, which was duplicated for these builtins. This also fixes a few
crash bugs, one when passing an integer to the pointer argument, and one
when using an array.

This also adds diagnostics for the memory orderings which are not valid
for atomic_clear according to
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html, which
were missing before.

Fixes https://github.com/llvm/llvm-project/issues/111293.

This is a re-land of #120449, modified to allow any non-const pointer
type for the first argument.
2025-01-22 10:48:04 +00:00
Jonathan Thackray
d028eaaeb8
[AArch64] Update SVE untyped intrinsics to have FP8 variants (#123585)
Update the following intrinsics to have FP8 variants:
``` c
svuint8_t svdup_laneq[_u8](svuint8_t zn, uint64_t imm_idx);
svuint8_t svextq[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm);
svint8_t svtblq[_s8](svint8_t zn, svuint8_t zm);
svint8_t svtbxq[_s8](svint8_t fallback, svint8_t zn, svuint8_t zm);
svuint8_t svuzpq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svuzpq2[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq2[_u8](svuint8_t zn, svuint8_t zm);
```
2025-01-21 13:34:57 +00:00
Phoebe Wang
13c6abfac8
[X86][AVX10.2-MINMAX][NFC] Remove NE[P] from intrinsic and instruction (#123272)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2025-01-21 19:55:09 +08:00
Oliver Stannard
84fa1755a5
[AArch64] FEAT_SPEv1p2 is optional in v8.7-A and v9.2-A (#123336)
The FEAT_SPEv1p2 feature (known to LLVM as FeatureSPE_EEF and +spe-eef)
was incorrectly marked as a required feature of Armv8.7-A (and later),
which is incorrect because it is optional, and some CPUs do not
implement it. This moves it to the default features list, so that it is
still enabled by -march=armv8.7-a, but can be configured individually
for each processor.

For Cortex-A520 and Cortex-A520AE, I've checked that these do not have any of
the FEAT_SPE* features, so updated the tests accordingly. All other
Arm-designed v8.7A+ and v9.2A+ CPUs should continue to have it enabled. For
Ampere1B and Fujitsu Monaka, these CPUs do not have the feature, so I've
removed it from their tests. For Apple M4, I haven't found any reference for
whether that CPU should have this feature, so I've added it to the CPU
definition to avoid this being a functional change.
2025-01-21 10:12:36 +00:00
David Green
547bfda56b
[AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (#120363)
This started out as trying to combine bf16 fpround to BFCVT2
instructions, but ended up removing the aarch64.neon.nfcvt intrinsics in
favour of generating fpround instructions directly. This simplifies the
patterns and can lead to other optimizations. The BFCVT2 instruction is
adjusted to makes sure the types are valid, and a bfcvt2 is now
generated in more place. The old intrinsics are auto-upgraded to fptrunc
instructions too.
2025-01-21 09:16:04 +00:00
Ulrich Weigand
8424bf207e [SystemZ] Add support for new cpu architecture - arch15
This patch adds support for the next-generation arch15
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Detection of arch15 as host processor.
- Assembler/disassembler support for new instructions.
- Exploitation of new instructions for code generation.
- New vector (signed|unsigned|bool) __int128 data types.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining  __VEC__ == 10305.

Note: No currently available Z system supports the arch15
architecture.  Once new systems become available, the
official system name will be added as supported -march name.
2025-01-20 19:30:21 +01:00
Nikita Popov
a79ae862ab [Clang] Regenerate test checks (NFC)
To reduce diffs in an upcoming change.
2025-01-20 12:46:37 +01:00
Nikita Popov
a4d9a8de08 [Clang] Don't match irrelevant attributes in mips return tests (NFC)
The only thing these tests care about from an ABI perspective is sret,
don't also test all the optimization attributes.
2025-01-20 12:41:02 +01:00
Nikita Popov
bd96295e09 [Clang] Use more liberal pointer attribute wildcard in ms-intrinsics tests (NFC)
Allow arbitrary attributes, including those with arguments.
2025-01-20 12:38:38 +01:00
Hervé Poussineau
71d6287f5b
[Clang][MIPS] Create correct linker arguments for Windows toolchains (#121041) 2025-01-20 15:11:26 +08:00
Sirraide
12f78e740c
[Clang] [NFC] Fix unintended -Wreturn-type warnings everywhere in the test suite (#123464)
In preparation of making `-Wreturn-type` default to an error (as there
is virtually no situation where you’d *want* to fall off the end of a
function that is supposed to return a value), this patch fixes tests
that have relied on this being only a warning, of which there seem 
to be 3 kinds:

1. Tests which for no apparent reason have a function that triggers the
warning.

I suspect that a lot of these were on accident (or from before the
warning was introduced), since a lot of people will open issues w/ their
problematic code in the `main` function (which is the one case where you
don’t need to return from a non-void function, after all...), which
someone will then copy, possibly into a namespace, possibly renaming it,
the end result of that being that you end up w/ something that
definitely is not `main` anymore, but which still is declared as
returning `int`, and which still has no return statement (another reason
why I think this might apply to a lot of these is because usually the
actual return type of such problematic functions is quite literally
`int`).
  
A lot of these are really old tests that don’t use `-verify`, which is
why no-one noticed or had to care about the extra warning that was
already being emitted by them until now.

2. Tests which test either `-Wreturn-type`, `[[noreturn]]`, or what
codegen and sanitisers do whenever you do fall off the end of a
function.

3. Tests where I struggle to figure out what is even being tested
(usually because they’re Objective-C tests, and I don’t know
Objective-C), whether falling off the end of a function matters in the
first place, and tests where actually spelling out an expression to
return would be rather cumbersome (e.g. matrix types currently don’t
support list initialisation, so I can’t write e.g. `return {}`).

For tests that fall into categories 2 and 3, I just added
`-Wno-error=return-type` to the `RUN` lines and called it a day. This
was especially necessary for the former since `-Wreturn-type` is an
analysis-based warning, meaning that it is currently impossible to test
for more than one occurrence of it in the same compilation if it
defaults to an error since the analysis pass is skipped for subsequent
functions as soon as an error is emitted.

I’ve also added `-Werror=return-type` to a few tests that I had already
updated as this patch was previously already making the warning an error
by default, but we’ve decided to split that into two patches instead.
2025-01-18 19:16:33 +01:00
Phoebe Wang
48803bc8c7
[X86][AMX-AVX512][NFC] Remove P from intrinsic and instruction name (#123270)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2025-01-17 22:21:19 +08:00
CarolineConcatto
66d347b46f
[Clang][AArch64]Make Tuple Size Optional for svluti4_lane Intrinsics (#123197)
The svluti4_lane intrinsic currently requires the tuple size to be
specified in the intrinsic name when using a tuple type input.

According to the ACLE specification, the svluti4_lane intrinsic with a
tuple type input, such as:

svint16_t svluti4_lane[_s16_x2(svint16x2_t table, svuint8_t indices,
uint64_t imm_idx);

should allow the tuple size of the input type to be optional.
2025-01-16 16:13:55 +00:00
Momchil Velikov
16e45b8fac
[AArch64] Implement FP8 SVE/SME reinterpret intrinsics (#121063) 2025-01-13 18:53:07 +00:00
Vitaly Buka
409ca49feb
[ubsan] Pass fsanitize-skip-hot-cutoff into -fsanitize=bounds (#122576) 2025-01-13 09:55:44 -08:00
CarolineConcatto
9256485043
[Clang][LLVM][AArch64]Add new feature SSVE-BitPerm (#121947)
The 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm,
which allows the sve-bitperm instructions to run in streaming mode.

It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The
sve2-bitperm feature is now an alias for sve-bitperm and sve2.

A new feature flag sve-bitperm is added to reflect the change that the
instructions under FEAT_SVE_BitPerm are supported if:
 on non streaming mode with FEAT_SVE2 and FEAT_SVE_BitPerm or
 in streaming mode with FEAT_SME and FEAT_SSVE_BitPerm
2025-01-13 16:34:33 +00:00
Lukacma
7ed451a3f3
[AArch64] Change feature dependencies of fp8 features (#122280)
This patch simplifies feature dependencies of FP8 features and also adds
new tests to check these.
2025-01-13 13:44:15 +00:00
Sander de Smalen
b4ce29ab31
[AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (#121788)
This adds support for parsing the attribute and codegen to map it to
"aarch64_za_state_agnostic" LLVM IR attribute.

This attribute is described in the Arm C Language Extensions (ACLE)
document:

  https://github.com/ARM-software/acle/blob/main/main/acle.md#__arm_agnostic
2025-01-12 21:35:44 +00:00
Vitaly Buka
8af4d206e0
[NFCI][BoundsChecking] Apply nosanitize on local-bounds instrumentation (#122416)
Should be NFCI as we run sanitizer, like msan, before local-bounds.
2025-01-10 18:11:19 -08:00
Vitaly Buka
99d0780f05
[nfc][ubsan] Add local-bounds test (#122415)
Show that @llvm.allow.ubsan.check is not used yet.
2025-01-10 17:47:35 -08:00
Vitaly Buka
834d65eb2e
[nfc][ubsan] Use O1 in test to remove more unrelated stuff (#122408) 2025-01-10 16:41:43 -08:00
Alexandros Lamprineas
b93ffa8e4a
[FMV][AArch64] Changes in fmv-features metadata. (#122192)
* We want the default version to have this attribute too otherwise it
becomes indistinguishable from non-versioned functions.

* We don't need the '+' unlike target-features which can negate. This
will allow using the parsing API of target_version/clones for the
metadata too.
2025-01-10 17:50:35 +00:00
Evgenii Kudriashov
b43c97c2dd
[Headers][X86] amxintrin.h - fix attributes according to Intel SDM (#122204)
`tileloadd`, `tileloaddt1` and `tilestored` are part of `amx-tile`
feature.

The problem is observed if `__tile_loadd` intrinsic is invoked,
`_tile_loadd_internal` requiring `amx-int8` is inlined into
`__tile_loadd` that has only `amx-tile`.
2025-01-10 17:52:09 +01:00
Vitaly Buka
a4394d9d42 [NFC][ubsan] Rename prefixes in test
Looks like update_cc_test_checks is being confused if
it creates vars with the name matching prefix.

Issue triggered with #122415
2025-01-09 21:12:36 -08:00