llvm-project

Author	SHA1	Message	Date
Thurston Dang	d33a2c5811	[BoundsSan] Update BoundsChecking.cpp to use no-merge attribute where applicable (#120620 ) https://github.com/llvm/llvm-project/pull/65972 introduced -ubsan-unique-traps and -bounds-checking-unique-traps, which attach the function size to the ubsantrap intrinsic. https://github.com/llvm/llvm-project/pull/117651 changed ubsan-unique-traps to use nomerge instead of the function size, but did not update -bounds-checking-unique-traps. This patch adds nomerge to bounds-checking-unique-traps.	2024-12-19 13:31:29 -08:00
Florian Hahn	9e322c56f7	[TySan] Don't report globals with external storage. (#120565 ) Globals with external storage should have been initialized where they are defined. Fixes https://github.com/llvm/llvm-project/issues/120448 PR: https://github.com/llvm/llvm-project/pull/120565	2024-12-19 21:30:56 +00:00
Thurston Dang	cb8a90b7d1	[ubsan] Remove -ubsan-unique-traps (replace with -fno-sanitize-merge) (#120613 ) -fno-sanitize-merge (introduced in https://github.com/llvm/llvm-project/pull/120511) duplicates the functionality of -ubsan-unique-traps but also allows individual checks to be specified e.g., * "-fno-sanitize-merge" without arguments is equivalent to -ubsan-unique-traps * "-fno-sanitize-merge=bool,enum" will apply it only to those two checks Additionally, the naming is more consistent with the rest of the -fsanitize- family. This patch therefore removes -ubsan-unique-traps. This breaks backwards compatibility; we hope that this is acceptable since '-mllvm -ubsan-unique-traps' was an experimental flag. This patch also adds negative test examples to bounds-checking.c, and strengthens the NOOPTARRAY assertion to prevent spurious matches. "-bounds-checking-unique-traps" is unaffected by this patch.	2024-12-19 12:53:48 -08:00
SpencerAbson	9469fd24b9	[Clang][AArch64] Remove const from base pointers in sve2p1 stores (#120551 ) This patch removes the const qualifier from the base pointer argument of `svst1wq`/`svst1wq_vnum` and `svst1dq`/`svst1dq_vnum`, in accordance with https://github.com/ARM-software/acle/pull/359.	2024-12-19 14:13:02 +00:00
SpencerAbson	db84ae3a68	[Clang][AArch64] Add signed index/offset variants of sve2p1 qword stores (#120549 ) This patch adds signed offset/index variants to the SVE2p1 quadword store intrinsics, in accordance with https://github.com/ARM-software/acle/pull/359.	2024-12-19 13:27:07 +00:00
Alexandros Lamprineas	6586c676b4	[FMV][AArch64] Emit mangled default version if explicitly specified. (#120022 ) Currently we need at least one more version other than the default to trigger FMV. However we would like a header file declaration __attribute__((target_version("default"))) void f(void); to guarantee that there will be f.default	2024-12-19 12:06:46 +00:00
Oliver Stannard	9fc2fadbfc	[Clang] Re-write codegen for atomic_test_and_set and atomic_clear (#120449 ) Re-write the sema and codegen for the atomic_test_and_set and atomic_clear builtin functions to go via AtomicExpr, like the other atomic builtins do. This simplifies the code, because AtomicExpr already handles things like generating code to dynamically select the memory ordering, which was duplicated for these builtins. This also fixes a few crash bugs, one when passing an integer to the pointer argument, and one when using an array. This also adds diagnostics for the memory orderings which are not valid for atomic_clear according to https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html, which were missing before. Fixes #111293.	2024-12-19 09:12:19 +00:00
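As a rough illustration of the two builtins this commit touches, here is a minimal sketch of a try-lock built on `__atomic_test_and_set` and `__atomic_clear`. The names `lock_flag`, `try_lock` and `unlock` are hypothetical and do not come from the patch; the choice of acquire/release orderings is an assumption made for the example, based on the GCC documentation linked above, which lists relaxed, release and seq_cst as the orderings valid for `__atomic_clear`.

```c
#include <stdbool.h>

/* Hypothetical lock flag. The GCC documentation describes these builtins
   as intended for operands of type bool or char. */
static bool lock_flag;

/* Try to take the lock. __atomic_test_and_set returns the previous state,
   so the lock was acquired only if the flag was clear before the call. */
bool try_lock(void) {
    return !__atomic_test_and_set(&lock_flag, __ATOMIC_ACQUIRE);
}

/* Release the lock. An acquire ordering here would be one of the invalid
   orderings for __atomic_clear. */
void unlock(void) {
    __atomic_clear(&lock_flag, __ATOMIC_RELEASE);
}
```

A first call to `try_lock()` should succeed, a second should fail until `unlock()` is called.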
Thurston Dang	ffff7bb582	Reapply "[ubsan] Add -fsanitize-merge (and -fno-sanitize-merge) (#120…464)" (#120511 ) This reverts commit 2691b964150c77a9e6967423383ad14a7693095e. This reapply fixes the buildbot breakage of the original patch, by updating clang/test/CodeGen/ubsan-trap-debugloc.c to specify -fsanitize-merge (the default, which is merge, is applied by the driver but not clang_cc1). This reapply also expands clang/test/CodeGen/ubsan-trap-merge.c. ---- Original commit message: '-mllvm -ubsan-unique-traps' (https://github.com/llvm/llvm-project/pull/65972) applies to all UBSan checks. This patch introduces -fsanitize-merge (defaults to on, maintaining the status quo behavior) and -fno-sanitize-merge (equivalent to '-mllvm -ubsan-unique-traps'), with the option to selectively apply non-merged handlers to a subset of UBSan checks (e.g., -fno-sanitize-merge=bool,enum). N.B. we do not use "trap" in the argument name since https://github.com/llvm/llvm-project/pull/119302 has generalized -ubsan-unique-traps to work for non-trap modes (min-rt and regular rt). This patch does not remove the -ubsan-unique-traps flag; that will override -f(no-)sanitize-merge.	2024-12-18 18:13:26 -08:00
Thurston Dang	2691b96415	Revert "[ubsan] Add -fsanitize-merge (and -fno-sanitize-merge) (#120464 )" This reverts commit 7eaf4708098c216bf432fc7e0bc79c3771e793a4. Reason: buildbot breakage (e.g., https://lab.llvm.org/buildbot/#/builders/144/builds/14299/steps/6/logs/FAIL__Clang__ubsan-trap-debugloc_c)	2024-12-18 23:50:01 +00:00
Thurston Dang	7eaf470809	[ubsan] Add -fsanitize-merge (and -fno-sanitize-merge) (#120464 ) '-mllvm -ubsan-unique-traps' (https://github.com/llvm/llvm-project/pull/65972) applies to all UBSan checks. This patch introduces -fsanitize-merge (defaults to on, maintaining the status quo behavior) and -fno-sanitize-merge (equivalent to '-mllvm -ubsan-unique-traps'), with the option to selectively apply non-merged handlers to a subset of UBSan checks (e.g., -fno-sanitize-merge=bool,enum). N.B. we do not use "trap" in the argument name since https://github.com/llvm/llvm-project/pull/119302 has generalized -ubsan-unique-traps to work for non-trap modes (min-rt and regular rt). This patch does not remove the -ubsan-unique-traps flag; that will override -f(no-)sanitize-merge.	2024-12-18 15:36:12 -08:00
Alexander Kornienko	23a239267e	Revert "[InstCombine] Infer nuw for gep inbounds from base of object" (#120460 ) Reverts llvm/llvm-project#119225 due to the lack of sanitizer support, large potential of breaking code containing latent UB, non-trivial localization and investigation, and what seems to be a bad interaction with msan (a test is in the works). Related discussions: https://github.com/llvm/llvm-project/pull/119225#issuecomment-2551904822 https://github.com/llvm/llvm-project/pull/118472#issuecomment-2549986255	2024-12-18 19:06:34 +01:00
Florian Hahn	c135f6ffe2	[TySan] Add initial Type Sanitizer support to Clang (#76260 ) This patch introduces the Clang components of type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. The Clang changes are mostly formulaic, the one specific change being that when the TBAA sanitizer is enabled, TBAA is always generated, even at -O0. It goes together with the corresponding LLVM changes (https://github.com/llvm/llvm-project/pull/76259) and compiler-rt changes (https://github.com/llvm/llvm-project/pull/76261) PR: https://github.com/llvm/llvm-project/pull/76260	2024-12-17 15:13:42 +00:00
SpencerAbson	908e30658d	[AArch64] Implement intrinsics for FP8 SME FMLAL/FMLALL (multi) (#119546 ) This patch implements the following intrinsics: Multi-vector 8-bit floating-point multiply-add long (multiple vectors). ``` c // Only if __ARM_FEATURE_SME_F8F16 != 0 void svmla_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svmla_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); ``` In accordance with https://github.com/ARM-software/acle/pull/323	2024-12-17 11:47:20 +00:00
SpencerAbson	9c89b40f18	[AArch64] Implement intrinsics for FMLAL/FMLALL (single) (#119568 ) Multi-vector 8-bit floating-point multiply-add long (single) ```c // Only if __ARM_FEATURE_SME_F8F16 != 0 void svmla[_single]_za16[_mf8]_vg2x1_fpm(uint32_t slice, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla[_single]_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla[_single]_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svmla[_single]_za32[_mf8]_vg4x1_fpm(uint32_t slice, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla[_single]_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla[_single]_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); ``` In accordance with https://github.com/ARM-software/acle/pull/323. Co-authored-by: Momchil Velikov momchil.velikov@arm.com	2024-12-17 09:31:54 +00:00
Florian Mayer	514580b438	[MTE] Apply alignment / size in AsmPrinter rather than IR (#111918 ) This makes sure no optimizations are applied that assume the bigger alignment or size, which could be incorrect if we link together with non-instrumented code.	2024-12-17 00:47:02 -08:00
SpencerAbson	38099d0608	[AArch64] Implement intrinsics for SME FP8 FMLAL/FMLALL (Indexed) (#118549 ) This patch implements the following intrinsics: Multi-vector 8-bit floating-point multiply-add long. ``` c // Only if __ARM_FEATURE_SME_F8F16 != 0 void svmla_lane_za16[_mf8]_vg2x1_fpm(uint32_t slice, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_lane_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_lane_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svmla_lane_za32[_mf8]_vg4x1_fpm(uint32_t slice, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_lane_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_lane_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); ``` In accordance with: https://github.com/ARM-software/acle/pull/323	2024-12-16 21:45:38 +00:00
Jonathan Thackray	8380bafaed	[AArch64] Add intrinsics for SME FP8 FVDOT, FVDOTB and FVDOTT intrinsics (#119922 ) Add support for the following SME 8 bit floating-point dot-product intrinsics: ``` // Only if __ARM_FEATURE_SME_F8F16 != 0 void svvdot_lane_za16[_mf8]_vg1x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svvdott_lane_za32[_mf8]_vg1x4_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); void svvdotb_lane_za32[_mf8]_vg1x4_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); ``` --------- Co-authored-by: Momchil Velikov <momchil.velikov@arm.com> Co-authored-by: Marian Lukac <marian.lukac@arm.com>	2024-12-16 14:42:45 +00:00
Jonathan Thackray	ef4b597015	[AArch64] Add intrinsics for SME FP8 FDOT single and multi instructions (#119845 ) Add support for the following SME 8 bit floating-point dot-product intrinsics: ``` // Only if __ARM_FEATURE_SME_F8F16 != 0 void svdot[_single]_za16[_mf8]_vg1x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svdot[_single]_za16[_mf8]_vg1x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svdot_za16[_mf8]_vg1x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svdot_za16[_mf8]_vg1x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svdot[_single]_za32[_mf8]_vg1x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svdot[_single]_za32[_mf8]_vg1x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svdot_za32[_mf8]_vg1x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svdot_za32[_mf8]_vg1x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); ``` These intrinsics are extracted from: https://github.com/ARM-software/acle/pull/323/ Co-authored-by: Momchil Velikov <momchil.velikov@arm.com> Co-authored-by: Marian Lukac <marian.lukac@arm.com>	2024-12-16 13:14:40 +00:00
Daniil Kovalev	f65a21a4ec	[PAC][ELF][AArch64] Support signed personality function pointer (#119361 ) Re-apply #113148 after revert in #119331 If function pointer signing is enabled, sign personality function pointer stored in `.DW.ref.__gxx_personality_v0` section with IA key, 0x7EAD = `ptrauth_string_discriminator("personality")` constant discriminator and address diversity enabled.	2024-12-16 10:24:09 +03:00
Momchil Velikov	2eed88da6a	[AArch64] Implement FP8 SVE intrinsics for fused multiply-add (#118126 ) This patch adds the following intrinsics: * 8-bit floating-point multiply-add long to half-precision (bottom). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat16_t svmlalb[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat16_t svmlalb[_n_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point multiply-add long to half-precision (bottom, indexed). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat16_t svmlalb_lane[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_15, fpm_t fpm); * 8-bit floating-point multiply-add long to half-precision (top). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat16_t svmlalt[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat16_t svmlalt[_n_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point multiply-add long to half-precision (top, indexed). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat16_t svmlalt_lane[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_15, fpm_t fpm); * 8-bit floating-point multiply-add long long to single-precision (bottom bottom). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat32_t svmlallbb[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat32_t svmlallbb[_n_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point multiply-add long long to single-precision (bottom bottom, indexed). 
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat32_t svmlallbb_lane[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_15, fpm_t fpm); * 8-bit floating-point multiply-add long long to single-precision (bottom top). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat32_t svmlallbt[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat32_t svmlallbt[_n_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point multiply-add long long to single-precision (bottom top, indexed). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat32_t svmlallbt_lane[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_15, fpm_t fpm); * 8-bit floating-point multiply-add long long to single-precision (top bottom). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat32_t svmlalltb[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat32_t svmlalltb[_n_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point multiply-add long long to single-precision (top bottom, indexed). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat32_t svmlalltb_lane[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_15, fpm_t fpm); * 8-bit floating-point multiply-add long long to single-precision (top top). // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat32_t svmlalltt[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat32_t svmlalltt[_n_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point multiply-add long long to single-precision (top top, indexed). 
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8FMA) \|\| __ARM_FEATURE_SSVE_FP8FMA svfloat32_t svmlalltt_lane[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_15, fpm_t fpm);	2024-12-13 21:05:27 +00:00
Momchil Velikov	c2172431c7	[AArch64] Implements FP8 SVE intrinsics for dot-product (#118125 ) This patch adds the following intrinsics: * 8-bit floating-point dot product to single-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT4) \|\| __ARM_FEATURE_SSVE_FP8DOT4 svfloat32_t svdot[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat32_t svdot[_n_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point indexed dot product to single-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT4) \|\| __ARM_FEATURE_SSVE_FP8DOT4 svfloat32_t svdot_lane[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_3, fpm_t fpm); * 8-bit floating-point dot product to half-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT2) \|\| __ARM_FEATURE_SSVE_FP8DOT2 svfloat16_t svdot[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat16_t svdot[_n_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point indexed dot product to half-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT2) \|\| __ARM_FEATURE_SSVE_FP8DOT2 svfloat16_t svdot_lane[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_7, fpm_t fpm);	2024-12-13 14:06:54 +00:00
Nikita Popov	a30e50fcb3	[BasicAA] Do not decompose past casts with different index width (#119365 ) BasicAA currently tries to support addrspacecasts that change the index width by performing the decomposition in the maximum of all index widths and then trying to fix this up with in-place sign extends to get correct overflow behavior if the actual index width is smaller. However, even in the case where we don't mix different index widths and just have an index width that is smaller than the maximum, the behavior is incorrect (see test), because we only perform the index width adjustment during decomposition and not any of the later logic -- and we don't do anything at all for variable offsets. I'm sure that the case where we actually mix different index widths is even more broken than that. Fix this by not allowing decomposition through index width changes. If the pointers have different index widths, fall back to a base object comparison, ignoring the offsets.	2024-12-13 12:58:59 +01:00
Jonathan Thackray	1fd3d1d04e	[AArch64] Add intrinsics for SME FP8 FDOT LANE instructions (#118492 ) Add support for the following SME 8 bit floating-point dot-product intrinsics: * void svdot_lane_za16_mf8_vg1x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm); * void svdot_lane_za16_mf8_vg1x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm); * void svdot_lane_za32_mf8_vg1x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm); * void svdot_lane_za32_mf8_vg1x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm); --------- Co-authored-by: Momchil Velikov <momchil.velikov@arm.com> Co-authored-by: Marian Lukac <marian.lukac@arm.com> Co-authored-by: Caroline Concatto <caroline.concatto@arm.com> Co-authored-by: SpencerAbson <Spencer.Abson@arm.com>	2024-12-13 09:09:36 +00:00
Alexandros Lamprineas	6f013dbced	[AArch64][FMV] Add missing feature dependencies and detect at runtime. (#119231 ) i8mm -> simd fp16fml -> simd frintts -> fp bf16 -> simd sme -> fp16 Approved in ACLE as https://github.com/ARM-software/acle/pull/368	2024-12-11 22:11:32 +00:00
Momchil Velikov	b1d8c60dd4	[AArch64] Implement FP8 SVE Intrinsics for narrowing conversions (#118124 ) This patch adds the following intrinsics: * Half-precision and BFloat16 convert, narrow, and interleave to 8-bit floating-point. // Variant is also available for: _bf16_x2 svmfloat8_t svcvtn_mf8[_f16_x2]_fpm(svfloat16x2_t zn, fpm_t fpm); * Single-precision convert, narrow, and interleave to 8-bit floating-point (top and bottom). svmfloat8_t svcvtnt_mf8[_f32_x2]_fpm(svmfloat8_t zd, svfloat32x2_t zn, fpm_t fpm); svmfloat8_t svcvtnb_mf8[_f32_x2]_fpm(svfloat32x2_t zn, fpm_t fpm);	2024-12-11 13:37:15 +00:00
SpencerAbson	b0763a472b	[AArch64] Implement intrinsics for FP8 FCVT/FCVTN/BFCVT (#118025 ) This patch implements the following intrinsics: Convert to packed 8-bit floating-point format. ``` c // Variants are also available for: _mf8[_bf16_x2] and _mf8[_f32_x4] svmfloat8_t svcvt_mf8[_f16_x2]_fpm(svfloat16x2_t zn, fpm_t fpm) __arm_streaming; ``` Convert to interleaved 8-bit floating-point format. ``` c svmfloat8_t svcvtn_mf8[_f32_x4]_fpm(svfloat32x4_t zn, fpm_t fpm) __arm_streaming; ``` In accordance with https://github.com/ARM-software/acle/pull/323. Co-authored-by: Marian Lukac marian.lukac@arm.com Co-authored-by: Caroline Concatto caroline.concatto@arm.com	2024-12-11 09:17:43 +00:00
Thurston Dang	67bd04facf	[ubsan] Don't merge non-trap handlers if -ubsan-unique-traps or not optimized (#119302 ) UBSan handler calls are sometimes merged by the backend, which complicates debugging. Merging is currently disabled for UBSan traps if -ubsan-unique-traps is specified or if optimization is disabled. This patch applies the same policy to non-trap handler calls. N.B. "-ubsan-unique-traps" becomes somewhat of a misnomer since it will now apply to non-trap handler calls as well as traps; nonetheless, we keep the naming for backwards compatibility.	2024-12-10 15:25:24 -08:00
anoopkg6	dc04d414df	SystemZ: Add support for __builtin_setjmp and __builtin_longjmp. (#119257 ) This PR includes fixes for the original PR #116642. Implementation for __builtin_setjmp and __builtin_longjmp for SystemZ.	2024-12-10 19:50:51 +01:00
Dan Gohman	c5ab70c508	[WebAssembly] Add `-i128:128` to the `datalayout` string. (#119204 ) Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM `datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is currently using the defaults for both, so it's inconsistent. Fix this by adding `-i128:128` to Wasm's `datalayout` string so that it aligns `i128` to 16 bytes too. This is similar to [llvm/llvm-project@dbad963](`dbad963a69`) for SPARC. This fixes rust-lang/rust#133991; see that issue for further discussion. [defaults to aligning `__int128_t` to 16 bytes]: `f8b4182f07/clang/lib/Basic/TargetInfo.cpp (L77)` [default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout	2024-12-10 09:21:58 -08:00
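To make the alignment question above concrete, the following sketch queries the C-level alignment of a 128-bit integer and the offset it receives inside a struct. It only inspects the front-end view on whatever target it is compiled for, not the LLVM `datalayout` string. The names `wide_t`, `struct with_wide`, `wide_alignment` and `wide_offset` are hypothetical, and the `long long` fallback is an assumption added so that the sketch still compiles on targets without `__int128`.

```c
#include <stddef.h>

#ifdef __SIZEOF_INT128__
typedef __int128 wide_t;      /* 128-bit integer where the compiler provides one */
#else
typedef long long wide_t;     /* fallback so the sketch still compiles */
#endif

/* A leading char forces padding before the wide member, which makes the
   member's alignment visible through its offset. */
struct with_wide {
    char   tag;
    wide_t value;
};

/* Alignment the C front end assigns to the wide type. */
size_t wide_alignment(void) { return _Alignof(wide_t); }

/* Offset of the wide member; always a multiple of wide_alignment(). */
size_t wide_offset(void) { return offsetof(struct with_wide, value); }
```

If the front end and the back end disagree about this alignment, a struct like `with_wide` can be laid out differently by the two, which is the kind of inconsistency the commit message describes.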
Pedro Lobo	f28e52274c	[Clang] Change two placeholders from `undef` to `poison` [NFC] (#119141 ) - Use `poison` instead of `undef` as a phi operand for an unreachable path (the predecessor will not go the BB that uses the value of the phi). - Call `@llvm.vector.insert` with a `poison` subvec when performing a `bitcast` from a fixed vector to a scalable vector.	2024-12-10 15:57:55 +00:00
Momchil Velikov	cc1a2ea61e	[AArch64] Implement FP8 SVE intrinsics for widening conversions (#118123 ) This patch adds the following intrinsics: * 8-bit floating-point convert to half-precision and BFloat16. // Variants are also available for: _bf16 svfloat16_t svcvt1_f16[_mf8]_fpm(svmfloat8_t zn, fpm_t fpm); svfloat16_t svcvt2_f16[_mf8]_fpm(svmfloat8_t zn, fpm_t fpm); * 8-bit floating-point convert to half-precision and BFloat16 (top). // Variants are also available for: _bf16 svfloat16_t svcvtlt1_f16[_mf8]_fpm(svmfloat8_t zn, fpm_t fpm); svfloat16_t svcvtlt2_f16[_mf8]_fpm(svmfloat8_t zn, fpm_t fpm);	2024-12-10 13:32:05 +00:00
Nikita Popov	e21ab4d16b	[InstCombine] Infer nuw for gep inbounds from base of object (#119225 ) When we have a gep inbounds from the base of an object (e.g. alloca or global), we know that the index cannot be negative, as this would go out of bounds. As such, we can infer nuw as well. The implementation is a bit stricter than necessary, we could also accept one unknown index followed by known-non-negative indices. Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently incorrectly doesn't require the inbounds for the alloca case, see https://github.com/AliveToolkit/alive2/issues/1138).	2024-12-10 10:00:50 +01:00
Daniil Kovalev	ef2e590e7b	Revert "[PAC][ELF][AArch64] Support signed personality function pointer" (#119331 ) Reverts llvm/llvm-project#113148 See buildbot failure https://lab.llvm.org/buildbot/#/builders/190/builds/11048	2024-12-10 09:12:25 +03:00
Daniil Kovalev	4fb1cda660	[PAC][ELF][AArch64] Support signed personality function pointer (#113148 ) If function pointer signing is enabled, sign personality function pointer stored in `.DW.ref.__gxx_personality_v0` section with IA key, 0x7EAD = `ptrauth_string_discriminator("personality")` constant discriminator and address diversity enabled.	2024-12-10 08:48:09 +03:00
Jordan Rupprecht	a6b5e18fc6	[test][clang][AArch64] Don't assume current dir is writeable (#119285 ) afa2fbf87a8e3fff609fd325c938929c48e94280 adds a test which can fail with `error: unable to open output file 'fixed-register-global.o': 'Permission denied'`. We don't check the output file at all, so just use /dev/null.	2024-12-09 20:33:13 -06:00
Thurston Dang	fd57946cc4	[NFC][clang] Update ubsan-trap-merge.c test to show absence of nomerge in non-trap mode (#119280 ) This shows that ubsan handlers do not have nomerge attributes in non-trap mode, even if -ubsan-unique-traps is enabled. 0d15d46362bd6ab5a9a2165805adaab13a7689f4 attaches nomerge but only for trap mode. --------- Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>	2024-12-09 16:21:22 -08:00
Lei Huang	a13ec9cd54	[PowerPC] Update data layout alignment of i128 to 16 (#118004 ) Fix 64-bit PowerPC part of https://github.com/llvm/llvm-project/issues/102783.	2024-12-09 18:02:24 -05:00
Nikita Popov	10f315dc9c	[ConstantFolding] Infer getelementptr nuw flag (#119214 ) Infer nuw from nusw and nneg. This is the constant expression variant of https://github.com/llvm/llvm-project/pull/111144. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-09 16:44:05 +01:00
SpencerAbson	99f6ca9b7b	[AArch64] Implement intrinsics for SME FP8 FMOPA (#118115 ) This patch implements the following intrinsics: 8-bit floating-point sum of outer products and accumulate. ``` c // Only if __ARM_FEATURE_SME_F8F16 != 0 void svmopa_za16[_mf8]_m_fpm(uint64_t tile, svbool_t pn, svbool_t pm, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svmopa_za32[_mf8]_m_fpm(uint64_t tile, svbool_t pn, svbool_t pm, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); ``` In accordance with: https://github.com/ARM-software/acle/pull/323/ Co-authored-by: Momchil Velikov momchil.velikov@arm.com Co-authored-by: Marian Lukac marian.lukac@arm.com	2024-12-09 11:13:08 +00:00
c8ef	f145ff3f70	[clang] constexpr built-in elementwise add_sat/sub_sat functions. (#119082 ) Part of #51787. This patch adds constexpr support for the built-in elementwise add_sat and sub_sat functions.	2024-12-09 09:28:12 +08:00
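For readers unfamiliar with saturating arithmetic, here is a portable scalar sketch of what "add_sat" and "sub_sat" mean for one 8-bit signed element. It deliberately does not call the Clang builtins, so it says nothing about their constexpr behaviour; `add_sat_i8` and `sub_sat_i8` are hypothetical helper names used only for illustration.

```c
#include <stdint.h>

/* Clamp a widened intermediate result into the int8_t range. */
static int8_t clamp_i8(int wide) {
    if (wide > INT8_MAX) return INT8_MAX;
    if (wide < INT8_MIN) return INT8_MIN;
    return (int8_t)wide;
}

/* Saturating add: compute in int, where the sum cannot overflow, then clamp. */
int8_t add_sat_i8(int8_t a, int8_t b) { return clamp_i8((int)a + (int)b); }

/* Saturating subtract: same idea for the difference. */
int8_t sub_sat_i8(int8_t a, int8_t b) { return clamp_i8((int)a - (int)b); }
```

An elementwise operation applies the same rule independently to each element of a vector.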
SpencerAbson	b0f06769e6	[AArch64] Implement intrinsics for SME FP8 F1CVT/F2CVT and BF1CVT/BF2CVT (#118027 ) This patch implements the following intrinsics: 8-bit floating-point convert to half-precision or BFloat16 (in-order). ``` c // Variant is also available for: _bf16[_mf8]_x2 svfloat16x2_t svcvt1_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming; svfloat16x2_t svcvt2_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming; ``` In accordance with https://github.com/ARM-software/acle/pull/323. Co-authored-by: Marian Lukac marian.lukac@arm.com Co-authored-by: Caroline Concatto caroline.concatto@arm.com	2024-12-08 19:34:01 +00:00
Vitaly Buka	7787328dd6	[ubsan] Improve lowering of @llvm.allow.ubsan.check (#119013 ) This fixes the case where a single hot inlined callsite prevents checks for all others. This helps to reduce the number of removed checks by up to 50% (depends on the `cutoff-hot` value). `ScalarOptimizerLateEPCallback` was happening during the CGSCC walk, after each inlining, but this is effectively after inlining. Example, order in comments: ``` static void overflow() { // 1. Inline get/set if possible // 2. Simplify // 3. LowerAllowCheckPass set(get() + get()); } void test() { // 4. Inline // 5. Nothing for LowerAllowCheckPass overflow(); } ``` With this patch it will look like: ``` static void overflow() { // 1. Inline get/set if possible // 2. Simplify set(get() + get()); } void test() { // 3. Inline // 4. Simplify overflow(); } // Later, after inliner CGSCC walk complete: // 5. LowerAllowCheckPass for `overflow` // 6. LowerAllowCheckPass for `test` ```	2024-12-07 16:12:58 -08:00
Vitaly Buka	66f9448b4b	[NFC][ubsan] Pre-commit test with missed optimization (#119012 )	2024-12-07 14:50:19 -08:00
Igor Kudrin	afa2fbf87a	[Reland][clang][AArch64] Avoid a crash when a non-reserved register is used (#117419 ) Relanding the patch with a fix for a test failure on build bots that do not build LLVM for AArch64. Fixes #76426, #109778 (for AArch64) The previous patch for this issue, #94271, generated an error message if a register and a global variable did not have the same size. This patch checks if the register is reserved.	2024-12-06 16:13:36 -08:00
Ulrich Weigand	8787bc72a6	Revert "[SystemZ] Add support for __builtin_setjmp and __builtin_longjmp (#116642 )" This reverts commit 030bbc92a705758f1131fb29cab5be6d6a27dd1f.	2024-12-07 00:55:54 +01:00
Igor Kudrin	da65fe1c16	Revert "[clang][AArch64] Avoid a crash when a non-reserved register is used (#117419 )" This reverts commit 8fc6fca9f28ce20d76066be66fcc41aa38f7dc3d.	2024-12-06 15:10:40 -08:00
Igor Kudrin	8fc6fca9f2	[clang][AArch64] Avoid a crash when a non-reserved register is used (#117419 ) Fixes #76426, #109778 (for AArch64) The previous patch for this issue, #94271, generated an error message if a register and a global variable did not have the same size. This patch checks if the register is reserved.	2024-12-06 14:58:10 -08:00
anoopkg6	030bbc92a7	[SystemZ] Add support for __builtin_setjmp and __builtin_longjmp (#116642 ) Implementation for __builtin_setjmp and __builtin_longjmp for SystemZ.	2024-12-06 23:33:33 +01:00
Nikita Popov	b569ec6de6	[SCCP] Infer nuw for gep nusw with non-negative offsets (#118819 ) If the GEP is nusw/inbounds and has all-non-negative offsets infer nuw as well. This doesn't have measurable compile-time impact. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-06 09:52:32 +01:00
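The inference above can be sanity-checked on a toy model. The sketch below shrinks addresses to 8 bits and treats a GEP as a single `base + offset` addition, which is a simplifying assumption (real GEPs also scale indices, and the Alive2 proof linked above is the authoritative argument). The function names are hypothetical. It exhaustively confirms that, in this model, a non-negative offset that satisfies the "nusw" condition also satisfies "nuw".

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy "nusw": unsigned base plus signed offset stays inside [0, 255]. */
bool add_is_nusw(uint8_t base, int8_t off) {
    int wide = (int)base + (int)off;
    return wide >= 0 && wide <= UINT8_MAX;
}

/* Toy "nuw": base plus the offset reinterpreted as unsigned does not wrap. */
bool add_is_nuw(uint8_t base, int8_t off) {
    unsigned wide = (unsigned)base + (unsigned)(uint8_t)off;
    return wide <= UINT8_MAX;
}

/* Count pairs with a non-negative offset that are nusw but not nuw. */
int count_counterexamples(void) {
    int bad = 0;
    for (int b = 0; b <= UINT8_MAX; ++b)
        for (int o = 0; o <= INT8_MAX; ++o)
            if (add_is_nusw((uint8_t)b, (int8_t)o) &&
                !add_is_nuw((uint8_t)b, (int8_t)o))
                ++bad;
    return bad;
}
```

The non-negativity requirement matters: in this model `add_is_nusw(10, -1)` holds while `add_is_nuw(10, -1)` does not, because the negative offset reinterpreted as unsigned is 255.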
Oliver Stannard	f893b47500	[ARM] Fix instruction selection for MVE vsbciq intrinsic (#118284 ) There were two bugs in the implementation of the MVE vsbciq (subtract with carry across vector, with initial carry value) intrinsics: * The VSBCI instruction behaves as if the carry-in is always set, but we were selecting it when the carry-in is clear. * The vsbciq intrinsics should generate IR with the carry-in set, but they were leaving it clear. These two bugs almost cancelled each other out, but resulted in incorrect code when the vsbcq intrinsics (with a carry-in) were used, and the carry-in was a compile time constant.	2024-12-06 08:46:56 +00:00
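The carry convention behind these two bugs can be sketched with a scalar model. This is not the MVE intrinsic API: `vec_result`, `model_vsbcq` and `model_vsbciq` are hypothetical names, and the model assumes four 32-bit lanes with the usual Arm subtract-with-carry rule `a + ~b + carry`, where a set carry means "no borrow".

```c
#include <stdint.h>

#define LANES 4

/* Result lanes plus the carry flag left over after the last lane. */
typedef struct {
    uint32_t lane[LANES];
    unsigned carry;
} vec_result;

/* Subtract with carry chained across lanes, lowest lane first.
   Each lane computes a + ~b + carry; bit 32 of the sum is the carry-out. */
vec_result model_vsbcq(const uint32_t a[LANES], const uint32_t b[LANES],
                       unsigned carry_in) {
    vec_result r;
    unsigned carry = carry_in & 1u;
    for (int i = 0; i < LANES; ++i) {
        uint64_t wide = (uint64_t)a[i] + (uint64_t)(uint32_t)~b[i] + carry;
        r.lane[i] = (uint32_t)wide;
        carry = (unsigned)(wide >> 32);
    }
    r.carry = carry;
    return r;
}

/* "Initial carry" form: behaves as if the carry-in is always set,
   so the first lane is a plain a - b. */
vec_result model_vsbciq(const uint32_t a[LANES], const uint32_t b[LANES]) {
    return model_vsbcq(a, b, 1u);
}
```

In this model, starting the chain with the carry clear instead of set makes the lowest lane come out one too small, which shows why the choice of initial carry matters.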


9571 Commits