llvm-project

Author	SHA1	Message	Date
Andrzej Warzyński	2826924543	[CIR][AArch64] Add support for the remaining `vceqz` builtins (#185440 ) Implement the remaining CIR lowerings for the AdvSIMD (Neon) `vceqz` intrinsic group (bitwise equal to zero). Most `vceqz` variants were already supported; this patch completes the rest of the group [1] that was left as a TODO. Tests for these intrinsics are moved from: * test/CodeGen/AArch64/neon_intrinsics.c * test/CodeGen/AArch64/v8.2a-fp16-intrinsics.c to: * test/CodeGen/AArch64/neon/intrinsics.c * test/CodeGen/AArch64/neon/fullfp16, respectively. The implementation largely mirrors the existing lowering in CodeGen/TargetBuiltins/ARM.cpp. Reference: [1] https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#bitwise-equal-to-zero	2026-03-10 12:58:06 +00:00
Andrzej Warzyński	b68dcf93b4	[Clang][AArch64] Clarify and simplify SISD intrinsic handling (NFC) (#185285 ) Not all AArch64 intrinsics categorized as SISD (Single Instruction Single Data) are truly SISD. Add comments clarifying this distinction. Also update EmitCommonNeonSISDBuiltinExpr: * Move the assert to the top of the function and add a descriptive message to make the assumptions explicit. * Remove unnecessary temporary variables (e.g. BuiltinID) and use SISDInfo directly. No functional changes intended.	2026-03-10 11:39:52 +00:00
Andrzej Warzyński	b5be6599b9	[CIR][AArch64] Add missing lowerings for vceqz_* Neon builtins (#184893 ) Implement the remaining CIR lowerings for the AdvSIMD (Neon) `vceqz{|q|d|s}_` intrinsic group (bitwise equal to zero). The `vceqzd_s64` variant was already supported; this patch completes the rest of the group [1]. Tests for these intrinsics are moved from: test/CodeGen/AArch64/neon-misc.c to: * test/CodeGen/AArch64/neon/intrinsics.c The implementation largely mirrors the existing lowering in CodeGen/TargetBuiltins/ARM.cpp. `emitCommonNeonBuiltinExpr` is introduced to support these lowerings. `getNeonType` is moved without functional changes. Reference: [1] https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#bitwise-equal-to-zero	2026-03-05 22:07:41 +00:00
Andrzej Warzyński	aabae9dcc3	[Clang][CIR][AArch64] NFC: Cleanups in AArch64 builtins lowering (#184404 ) This patch performs small cleanups and fixes in the AArch64 builtins lowering code, with the goal of aligning the CIR path more closely with the existing Clang CodeGen implementation. Changes include: * Make sure that `noundef` is consistently matched using `{{.*}}`. * Rename `AArch64BuiltinInfo` to `armVectorIntrinsicInfo` for better consistency with the original CodeGen implementation. * Simplify `emitAArch64CompareBuiltinExpr`, fix an incorrect assert condition (missing `!`) and make sure to use the input `kind` condition instead of hard-coding `cir::CmpOpKind::eq`. * Improve and clarify comments. No functional changes intended (NFC).	2026-03-05 21:24:10 +00:00
Jonathan Thackray	6d003f5033	[AArch64][clang][llvm] Add ACLE `stshh` atomic store builtin (#181386 ) Add `__arm_atomic_store_with_stshh` implementation as defined in the ACLE. Validate arguments passed are correct, and lower to the `stshh` intrinsic plus an atomic store using a pseudo-instruction with the allowed orderings: * memory orderings: relaxed, release, seq_cst * retention policies: keep, strm The `STSHH` instruction (Store with Store Hint for Hardware) is part of the `FEAT_PCDPHINT` extension.	2026-03-05 17:02:36 +00:00
Andrzej Warzyński	ae76def769	[clang][ARM] Refactor argument handling in `EmitAArch64BuiltinExpr` (3/N) (NFC) (#183315 ) Remove the outstanding calls to `EmitScalarExpr` in `EmitAArch64BuiltinExpr` that are no longer required. This is a follow-up for #181794 and #181974 - please refer to those PRs for more context.	2026-02-25 18:34:20 +00:00
Aaron Ballman	b960e6ecab	Silence "switch statement contains default but not case labels"; NFC (#182855 ) Silences an MSVC C4065 diagnostic that was introduced in 0dd1cb015e8b1439e70c152eb134abb01e1af831	2026-02-23 11:51:39 -05:00
Andrzej Warzyński	0dd1cb015e	[clang][ARM] Refactor argument handling in `EmitAArch64BuiltinExpr` (2/2) (NFC) (#181974 ) Refactor `EmitAArch64BuiltinExpr` so that all AArch64/NEON builtins handled by this hook _and marked as overloaded_ share a common path for generating LLVM IR arguments (collected into the `Ops` `SmallVector<Value>`) (*). This is a follow-up for #181794 - please refer to that PR for more context. As in the previous PR, the key change is implemented in `HasExtraNeonArgument`, i.e. in the hook that identifies Builtins with the extra argument. In this PR, I am replacing the ad-hoc switch statement with a more principled approach borrowed from SemaARM.cpp, namely: ```cpp static bool HasExtraNeonArgument(unsigned BuiltinID) { // (...) uint64_t mask = 0; switch (BuiltinID) { #define GET_NEON_OVERLOAD_CHECK #include "clang/Basic/arm_fp16.inc" #include "clang/Basic/arm_neon.inc" #undef GET_NEON_OVERLOAD_CHECK // Non-neon builtins for controling VFP that take extra argument for // discriminating the type. case ARM::BI__builtin_arm_vcvtr_f: case ARM::BI__builtin_arm_vcvtr_d: mask = 1; } switch (BuiltinID) { default: break; } if (mask) return true; return false; ``` This is preferred because the extra argument is defined for Sema verification. CodeGen should reuse the same source of truth rather than duplicating or partially reimplementing the logic. No functional change intended. (*) `EmitAArch64BuiltinExpr` contains two large switch statements intended to separate handling of non-overloaded and overloaded builtins. In practice, the split is not consistently enforced. Patch 1/2 refactored the first switch (non-overloaded path). This patch applies the same cleanup to the overloaded path and completes the refactoring.	2026-02-20 19:44:38 +00:00
Andrzej Warzyński	2d69ab262b	[clang][ARM] Refactor argument handling in `EmitAArch64BuiltinExpr` (1/2) (NFC) (#181794 ) Refactor `EmitAArch64BuiltinExpr` so that all AArch64/NEON builtins handled by this hook _and marked as non-overloaded_ share a common path for generating LLVM IR arguments (collected into the `Ops` `SmallVector<Value>`) (*). Previously, the argument emission loop unconditionally skipped the trailing argument: ```cpp for (unsigned i = 0, e = E->getNumArgs() - 1; i != e; ++i) ``` This was originally intended to ignore the extra Sema-only argument used by overloaded NEON builtins (e.g. the type discriminator passed by `__builtin_neon_` intrinsics). However, this logic was applied unconditionally. This patch updates the loop to skip the trailing argument only when `HasExtraNeonArgument` returns true for non-SISD builtins: ```cpp bool HasExtraArg = !IsSISD && HasExtraNeonArgument(BuiltinID); unsigned NumArgs = E->getNumArgs() - (HasExtraArg ? 1 : 0); for (unsigned i = 0, e = NumArgs; i != e; ++i) ``` This preserves existing IR generation behaviour while making the handling of Sema-only NEON discriminator arguments explicit. For context, type discriminators can be found in definitions of various builtins in `arm_neon.h`. For example, `vsriq_n_p64(<args>)` expands into the following call: ```cpp __builtin_neon_vsriq_n_v(<args>, 38) ``` The trailing `38` encodes the concrete NEON vector type (e.g. `poly64x2_t`) for overload resolution in Sema; it is not semantically part of the operation and is ignored during IR generation. As part of this change, `HasExtraNeonArgument` was completed so that these discriminator arguments are correctly identified. No functional change intended. (*) This refers to two large `switch` stmts inside `EmitAArch64BuiltinExpr` that are meant to switch the processing into non-overloaded and overloaded builtins.
The intended split between non-overloaded and overloaded builtins is not consistently enforced: the second switch (nominally handling overloaded builtins) also processes some non-overloaded cases. This patch refactors only the first switch and prepares for a follow-up cleanup in 2/2.	2026-02-20 17:41:28 +00:00
Andrzej Warzyński	22985fe1f9	[clang][Builtins][ARM] NFC updates in ARM.cpp (#180966 ) Updates the logic in `CodeGenFunction::EmitAArch64BuiltinExpr` so that we always start with the general code and we only fall back to specialised cases (i.e. `switch` stmts) for intrinsics for which the general code does not apply. BEFORE (only high-level): ```cpp Value *CodeGenFunction::EmitAArch64BuiltinExpr() { (...) /// 1. SWITCH STMT FOR NON-OVERLOADED INTRINSICS switch (BuiltinID) { default: break; case NEON::BI__builtin_neon_vabsh_f16: (...) } /// 2. GENERAL CODE Builtin = findARMVectorIntrinsicInMap(AArch64SIMDIntrinsicMap, BuiltinID, AArch64SIMDIntrinsicsProvenSorted); if (Builtin) return EmitCommonNeonBuiltinExpr( Builtin->BuiltinID, Builtin->LLVMIntrinsic, Builtin->AltLLVMIntrinsic, Builtin->NameHint, Builtin->TypeModifier, E, Ops, /*never use addresses*/ Address::invalid(), Address::invalid(), Arch); if (Value *V = EmitAArch64TblBuiltinExpr(this, BuiltinID, E, Ops, Arch)) return V; /// 3. SWITCH STMT FOR THE REMAINING INTRINSICS switch (BuiltinID) { default: return nullptr; case NEON::BI__builtin_neon_vbsl_v: (...) } } ``` AFTER: ```cpp Value *CodeGenFunction::EmitAArch64BuiltinExpr() { /// 1. GENERAL CODE Builtin = findARMVectorIntrinsicInMap(AArch64SIMDIntrinsicMap, BuiltinID, AArch64SIMDIntrinsicsProvenSorted); if (Builtin) return EmitCommonNeonBuiltinExpr( Builtin->BuiltinID, Builtin->LLVMIntrinsic, Builtin->AltLLVMIntrinsic, Builtin->NameHint, Builtin->TypeModifier, E, Ops, /*never use addresses*/ Address::invalid(), Address::invalid(), Arch); if (Value *V = EmitAArch64TblBuiltinExpr(this, BuiltinID, E, Ops, Arch)) return V; /// 2. SWITCH STMT FOR NON-OVERLOADED INTRINSICS switch (BuiltinID) { default: break; case NEON::BI__builtin_neon_vabsh_f16: (...) } /// 3. SWITCH STMT FOR THE REMAINING INTRINSICS switch (BuiltinID) { default: return nullptr; case NEON::BI__builtin_neon_vbsl_v: (...)
} } ``` In addition: * Remove `vaddq_p128+ vcvtq_high_bf16_f32 + vcvtq_low_bf16_f32` from `AArch64SIMDIntrinsicMap`. Those were not required there (it's an array for intrinsics for which the general code-gen works, but that's not the case for those). * Extracted the declaration of `Int` so that it can be re-used.	2026-02-12 09:48:52 +00:00
Andrzej Warzyński	ec18b92815	[CIR][NEON] Add lowering for `vnegd_s64` and `vnegh_f16` (#180597 ) Add CIR lowering support for the non-overloaded NEON intrinsics `vnegd_s64` and `vnegh_f16`. The associated tests are shared with the existing default codegen tests: * `neon-intrinsics.c` → `neon/intrinsics.c` * `v8.2a-fp16-intrinsics.c` → `neon/fullfp16.c` A new test file, * `clang/test/CodeGen/AArch64/neon/fullfp16.c` is introduced and is intended to eventually replace: * `clang/test/CodeGen/AArch64/v8.2a-fp16-intrinsics.c` Since both intrinsics are non-overloaded, the CIR and default codegen handling is moved to the appropriate switch statements. The previous placement was incorrect. This change also includes minor refactoring in `CIRGenBuilder.h` to better group related hooks.	2026-02-11 11:10:57 +00:00
Andrzej Warzyński	1d13412cd3	[clang][nfc] Remove `else` after `return` in ARM.cpp (#180733 ) Align with the LLVM coding standard: * https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return	2026-02-10 16:35:52 +00:00
Andrzej Warzyński	80677dc5e0	[CIR][NEON] Add lowering support for `vceqzd_s64` (#179779 ) Rather than creating a dedicated ClangIR test file, the original test file for this intrinsic is effectively reused: * clang/test/CodeGen/AArch64/neon-intrinsics.c “Effectively” meaning that the corresponding test is moved (rather than literally reused) to a new file within the original AArch64 builtins test directory: * clang/test/CodeGen/AArch64/neon/intrinsics.c This is necessary to avoid lowering unsupported examples from intrinsics.c with `-fclangir`. The new file will eventually replace the original one once all builtins from it can be lowered via ClangIR. To facilitate test re-use, a new LIT "feature" is added so that CIR tests can be run conditionally, e.g. the following will only run when `CLANG_ENABLE_CIR` is set: ```C // RUN: %if cir %{%clang_cc1 ... %} ``` These kinds of substitutions are documented in [2]. REFERENCES: [1] https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vceqzd_s64 [2] https://llvm.org/docs/TestingGuide.html#substitutions	2026-02-09 18:48:42 +00:00
Kerry McLaughlin	04e5bc7dfb	[AArch64] Add support for range prefetch intrinsic (#170490 ) This patch adds support in Clang for the RPRFM instruction, by adding the following intrinsics: ``` void __pldx_range(unsigned int access_kind, unsigned int retention_policy, signed int length, unsigned int count, signed int stride, size_t reuse_distance, void const *addr); void __pld_range(unsigned int access_kind, unsigned int retention_policy, uint64_t metadata, void const *addr); ``` The `__ARM_PREFETCH_RANGE` macro can be used to test whether these intrinsics are implemented. If the RPRFM instruction is not available, this instruction is a NOP. This implements the following ACLE proposal: https://github.com/ARM-software/acle/pull/423	2026-01-12 15:53:17 +00:00
Nikita Popov	3186ca25bc	[ARM] Use getSigned() for signed value	2025-12-17 12:48:51 +01:00
David Green	1a1c5df7f9	[ARM] Introduce intrinsics for MVE fp-converts under strict-fp. (#170686 ) This is the last of the generic instructions created from MVE intrinsics. It was a little more awkward than the others due to it taking a Type as one of the arguments. This creates a new function to create the intrinsic we need.	2025-12-14 12:12:45 +00:00
Amina Chabane	7f2e6f128d	[Clang][AArch64] Implement widening FMMLA intrinsics (#165282 ) Proposed in [this ACLE proposal](https://github.com/ARM-software/acle/pull/409), this PR implements widening FMMLA intrinsics. - F16 to F32 - MF8 to F32 - MF8 to F16 Additional changes: - IsOverloadCvt flag renamed to IsOverloadFirstandLast for clarity, as the name implies conversion. Implementation remains unchanged.	2025-12-05 16:08:25 +00:00
Jonathan Thackray	7377ac037d	[AArch64][llvm] Add support for Neon vmmlaq_{f16,f32}_mf8_fpm intrinsics (#165431 ) Add support for the following new AArch64 Neon intrinsics: ``` float16x8_t vmmlaq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t, fpm_t); float32x4_t vmmlaq_f32_mf8_fpm(float32x4_t, mfloat8x16_t, mfloat8x16_t, fpm_t); ```	2025-11-07 15:24:13 +00:00
Jonathan Thackray	9a8781b86f	[AArch64][llvm] Add support for new vcvt* intrinsics (#163572 ) Add support for these new vcvt* intrinsics: ``` int64_t vcvts_s64_f32(float32_t); uint64_t vcvts_u64_f32(float32_t); int32_t vcvtd_s32_f64(float64_t); uint32_t vcvtd_u32_f64(float64_t); int64_t vcvtns_s64_f32(float32_t); uint64_t vcvtns_u64_f32(float32_t); int32_t vcvtnd_s32_f64(float64_t); uint32_t vcvtnd_u32_f64(float64_t); int64_t vcvtms_s64_f32(float32_t); uint64_t vcvtms_u64_f32(float32_t); int32_t vcvtmd_s32_f64(float64_t); uint32_t vcvtmd_u32_f64(float64_t); int64_t vcvtps_s64_f32(float32_t); uint64_t vcvtps_u64_f32(float32_t); int32_t vcvtpd_s32_f64(float64_t); uint32_t vcvtpd_u32_f64(float64_t); int64_t vcvtas_s64_f32(float32_t); uint64_t vcvtas_u64_f32(float32_t); int32_t vcvtad_s32_f64(float64_t); uint32_t vcvtad_u32_f64(float64_t); ```	2025-11-07 14:56:29 +00:00
Paul Walker	d929146b3f	[Clang][AArch64] Lower NEON vaddv/vminv/vmaxv builtins to llvm.vector.reduce intrinsics. (#165400 ) This is the first step in removing some NEON reduction intrinsics that duplicate the behaviour of their llvm.vector.reduce counterpart. NOTE: The i8/i16 variants differ in that the NEON versions return an i32 result. However, this looks to be more about making their code generation convenient, with SelectionDAG discarding the extra bits. This is only relevant for the next phase because the Clang usage always truncates their result, making llvm.vector.reduce a drop-in replacement.	2025-10-30 10:46:37 +00:00
Juan Manuel Martinez Caamaño	74d77dc2ec	[Clang][NFC] Rename UnqualPtrTy to DefaultPtrTy (#163207 ) `UnqualPtrTy` didn't always match `llvm::PointerType::getUnqual`: sometimes it returned a pointer that is not in address space 0 (notably for SPIRV). Since `UnqualPtrTy` was used as the "generic" or "default" pointer type, this patch renames it to `DefaultPtrTy` to avoid confusion with LLVM's `PointerType::getUnqual`.	2025-10-20 14:34:21 +02:00
Kazu Hirata	4412cfa854	[clang] Use [[fallthrough]] instead of LLVM_FALLTHROUGH (NFC) (#163085 ) [[fallthrough]] is now part of C++17, so we don't need to use LLVM_FALLTHROUGH.	2025-10-12 20:49:11 -07:00
Kerry McLaughlin	ccaeebcd04	[AArch64][SME] Improve codegen for aarch64.sme.cnts* when not in streaming mode (#154761 ) Builtins for reading the streaming vector length are canonicalised to use the aarch64.sme.cntsd intrinsic and a multiply, i.e. - cntsb -> cntsd * 8 - cntsh -> cntsd * 4 - cntsw -> cntsd * 2 This patch also removes the LLVM intrinsics for cnts[b,h,w], and adds patterns to improve codegen when cntsd is multiplied by a constant.	2025-09-12 10:23:57 +01:00
Kajetan Puchalski	b96fa9f3ac	[clang][AArch64] Use .i16.f16 intrinsic formats for vcvth*_[s|u]16_f16 (#156029 ) Use .i16.f16 intrinsic formats for intrinsics like vcvth_s16_f16. Avoids issues with incorrect saturation that arise when using .i32.f16 formats for the same conversions. Fixes https://github.com/llvm/llvm-project/issues/154343. Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>	2025-09-02 11:38:37 +01:00
Nikita Popov	246a64a12e	[Clang] Rename HasLegalHalfType -> HasFastHalfType (NFC) (#153163 ) This option is confusingly named. What it actually controls is whether, under the default of `-ffloat16-excess-precision=standard`, it is beneficial for performance to perform calculations on float (without intermediate rounding) or not. For `-ffloat16-excess-precision=none` the LLVM `half` type will always be used, and all backends are expected to legalize it correctly.	2025-08-18 09:23:48 +02:00
Benjamin Maxwell	af44d87e0d	[clang][SME] Remove folding of `__arm_in_streaming_mode()` (NFC) (#150917 ) This is handled by the instcombine added in #147930; there is no need for any clang-specific folding. NFC as all clang tests for `__arm_in_streaming_mode()` used -O1, which applies the LLVM instcombines.	2025-07-29 10:42:45 +01:00
Alexandros Lamprineas	3ab64c5b29	[NFC][Clang][FMV] Make FMV priority data type future proof. (#150079 ) FMV priority is the returned value of a polymorphic function. On RISC-V and X86 targets a 32-bit value is enough. On AArch64 we currently need 64 bits and we will soon exceed that. APInt seems to be a suitable replacement for uint64_t, presumably with minimal compile time overhead. It allows bit manipulation, comparison and variable bit width.	2025-07-23 10:37:29 +01:00
David Green	9fcea2e465	[ARM] Add neon vector support for roundeven As per #142559, this marks froundeven as legal for Neon and upgrades the existing arm.neon.vrintn intrinsics.	2025-07-04 15:27:33 +01:00
David Green	ec35065789	[ARM] Add neon vector support for rint As per #142559, this marks frint as legal for Neon and upgrades the existing arm.neon.vrintx intrinsics.	2025-07-03 21:27:48 +01:00
David Green	1f8f477bd0	[ARM] Add neon vector support for trunc As per #142559, this marks ftrunc as legal for Neon and upgrades the existing arm.neon.vrintz intrinsics.	2025-07-03 07:41:13 +01:00
Adam Glass	ed27f18e32	__sys builtin support for AArch64 (#146456 ) Adds support for the __sys Clang builtin for AArch64. __sys is a long-existing MSVC intrinsic used to manage caches, TLBs, etc. by writing to system registers: * It takes a macro-generated constant and uses it to form the AArch64 SYS instruction, which is MSR with op0=1. The macro drops op0 and expects the implementation to hardcode it to 1 in the encoding. * Volume use is in systems code (kernels, hypervisors, boot environments, firmware) * Has an unused return value due to MSVC cut/paste error Implementation: * Clang builtin, sharing code with Read/WriteStatusReg * Hardcodes the op0=1 * Explicitly returns 0 * Code-format change from clang-format * Unittests included * Not limited to MSVC-environment as it's generally useful and neutral	2025-07-02 10:17:01 -07:00
David Green	5332534b9c	[ARM] Add neon vector support for ceil As per #142559, this marks fceil as legal for Neon and upgrades the existing arm.neon.vrintp intrinsics.	2025-07-01 15:41:10 +01:00
David Green	6bd9ff04af	[ARM] Add neon vector support for round As per #142559, this marks fround as legal for Neon and upgrades the existing arm.neon.vrinta intrinsics.	2025-06-30 17:15:26 +01:00
David Green	dcc9e36b18	[ARM] Add neon vector support for floor (#142559 ) This marks ffloor as legal providing that armv8 and neon is present (or fullfp16 for the fp16 instructions). The existing arm_neon_vrintm intrinsics are auto-upgraded to llvm.floor. If this is OK I will update the other vrint intrinsics.	2025-06-29 11:37:16 +01:00
Adam Glass	d9a7b16479	InterlockedAdd_, InterlockedAdd64_ support for AArch64 (#145607 ) This PR adds support for InterlockedAdd_{acq, nf, rel}, and InterlockedAdd64_{acq, nf, rel} for AArch64.	2025-06-25 12:09:30 -07:00
Kazu Hirata	ae372bfca8	[CodeGen] Use range-based for loops (NFC) (#145142 )	2025-06-21 08:20:57 -07:00
Paul Walker	f43aaf90df	[NFC][LLVM] Refactor IRBuilder::Create{VScale,ElementCount,TypeSize}. (#142803 ) CreateVScale took a scaling parameter that had a single use outside of IRBuilder with all other callers having to create a redundant ConstantInt. To work around this some code preferred to use CreateIntrinsic directly. This patch simplifies CreateVScale to return a call to the llvm.vscale() intrinsic and nothing more. As well as simplifying the existing call sites I've also migrated the uses of CreateIntrinsic. Whilst IRBuilder used CreateVScale's scaling parameter as part of the implementations of CreateElementCount and CreateTypeSize, I have follow-on work to switch them to the NUW variety and thus they would stop using CreateVScale's scaling as well. To prepare for this I have moved the multiplication and constant folding into the implementations of CreateElementCount and CreateTypeSize. As a final step I have replaced some callers of CreateVScale with CreateElementCount where it's clear from the code they wanted the latter.	2025-06-10 12:35:59 +01:00
Lukacma	6fc0312919	[Clang][AArch64] Add fp8 variants for untyped NEON intrinsics (#128019 ) This patch adds fp8 variants to existing intrinsics, whose operation doesn't depend on arguments being a specific type. It also changes mfloat8 type representation in memory from `i8` to `<1xi8>`	2025-05-15 14:01:41 +01:00
Craig Topper	123758b1f4	[IRBuilder] Add versions of createInsertVector/createExtractVector that take a uint64_t index. (#138324 ) Most callers want a constant index. Instead of making every caller create a ConstantInt, we can do it in IRBuilder. This is similar to createInsertElement/createExtractElement.	2025-05-02 16:10:18 -07:00
Nikita Popov	b384d6d6cc	[CodeGen] Don't include CGDebugInfo.h in CodeGenFunction.h (NFC) (#134100 ) This is an expensive header, only include it where needed. Move some functions out of line to achieve that. This reduces time to build clang by ~0.5% in terms of instructions retired.	2025-04-03 08:04:19 +02:00
Lukacma	6c3adaafe3	[AARCH64][Neon] switch to using bitcasts in arm_neon.h where appropriate (#127043 ) Currently arm_neon.h emits C-style casts to do vector type casts. This relies on implicit conversion between vector types being enabled, which is currently deprecated behaviour and will soon disappear. To ensure NEON code will keep working afterwards, this patch changes all these vector type casts into bitcasts. Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>	2025-04-01 09:45:16 +01:00
Jonathan Thackray	a1a74c9e80	[NFC][clang] Remove superfluous header files after refactor in #132252 (#132495 ) Remove superfluous header files after refactor in #132252	2025-03-26 14:45:00 +00:00
Jonathan Thackray	7f920e2e5f	[NFC][clang] Split clang/lib/CodeGen/CGBuiltin.cpp into target-specific files (#132252 ) clang/lib/CodeGen/CGBuiltin.cpp is over 1MB long (>23k LoC), and can take minutes to recompile (depending on compiler and host system) when modified, and 5 seconds for clangd to update for every edit. Splitting this file was discussed in this thread: https://discourse.llvm.org/t/splitting-clang-s-cgbuiltin-cpp-over-23k-lines-long-takes-1min-to-compile/ and the idea has received a number of +1 votes, hence this change.	2025-03-21 19:09:39 +00:00

43 Commits
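
Note on 2d69ab262b (#181794): the commit message describes dropping the trailing Sema-only type discriminator only when the builtin is not SISD and `HasExtraNeonArgument` reports an extra argument. The following is a minimal standalone sketch of that rule, not the code from ARM.cpp. `BuiltinCall`, its fields and `collectOps` are hypothetical names introduced for illustration, strings stand in for the real `CallExpr` arguments and `llvm::Value` operands, and the empty-argument guard is an addition of the sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a builtin call. The real code inspects a clang
// CallExpr and a builtin ID; here both checks are supplied as plain flags.
struct BuiltinCall {
  bool isSISD;               // assumption: SISD classification of the builtin
  bool hasExtraNeonArgument; // assumption: result of the Sema-derived check
  std::vector<std::string> args;
};

// Collect the operands that would be lowered to IR. The trailing type
// discriminator is skipped only for non-SISD builtins that carry one.
std::vector<std::string> collectOps(const BuiltinCall &call) {
  bool hasExtraArg =
      !call.isSISD && call.hasExtraNeonArgument && !call.args.empty();
  std::size_t numArgs = call.args.size() - (hasExtraArg ? 1 : 0);

  std::vector<std::string> ops;
  for (std::size_t i = 0; i != numArgs; ++i)
    ops.push_back(call.args[i]);
  return ops;
}
```

Under these assumptions, a call modelled as `{false, true, {"a", "b", "38"}}` yields the two operands `a` and `b`, while the same arguments with `isSISD` set to true are all kept.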