This variable attribute is used in HLSL to add Vulkan-specific builtins
in a shader.
The attribute is documented here:
17727e88fd/proposals/0011-inline-spirv.md
These variables, even if marked as `static`, are externally initialized by
the pipeline/driver/GPU. This is handled by moving them to a dedicated
address space, `hlsl_input`, which is also added by this commit.
The design for input variables in Clang can be found here:
355771361e/proposals/0019-spirv-input-builtin.md
Co-authored-by: Justin Bogner <mail@justinbogner.com>
Enable _Float16 for the LoongArch target. Additionally, this change fixes
incorrect ABI lowering of _Float16 for structs containing fp16 that are
eligible for passing via GPR+FPR or FPR+FPR. Finally, it also fixes
int16 -> __fp16 conversion code generation, which now uses generic LLVM
IR rather than the llvm.convert.to.fp16 intrinsics.
The OpenCL translator has a `__spirv` namespace, and defining the
`__spirv__` macro causes issues downstream on the OpenCL side. The
macro is needed to keep compatibility with HLSL/DXC, but can be avoided
for other targets/languages.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
The patch introduces __builtin_spirv_generic_cast_to_ptr_explicit, which
is lowered to the llvm.spv.generic.cast.to.ptr.explicit intrinsic.
The SPIR-V builtins are now split into four files: BuiltinsSPIRVCore.td,
BuiltinsSPIRVVK.td for Vulkan-specific builtins, BuiltinsSPIRVCL.td for
OpenCL-specific builtins, and BuiltinsSPIRVCommon.td for common ones.
The patch also introduces a new header defining the SPIR-V friendly
equivalents (__spirv_GenericCastToPtrExplicit_ToGlobal,
__spirv_GenericCastToPtrExplicit_ToLocal and
__spirv_GenericCastToPtrExplicit_ToPrivate). The functions are declared
as aliases to the new builtin, allowing C-like languages to have a
definition to rely on as well as proper front-end diagnostics.
The motivation for the header is to provide a stable binding for
applications or libraries (such as SYCL), and it allows non-SPIR-V targets
to provide an implementation (via libclc, or similar to how it is done for
gpuintrin.h).
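For illustration, here is a rough usage sketch of one of the header functions; the exact parameter list, the address-space qualifiers (omitted here), and the storage-class constant are assumptions on my part, so consult the new header for the authoritative declarations.
```
/* Hedged sketch: assumes the header function takes a generic pointer plus a
   SPIR-V storage-class value and is aliased to
   __builtin_spirv_generic_cast_to_ptr_explicit. */
void *to_global(void *generic_ptr) {
  /* 5 is the SPIR-V CrossWorkgroup storage class; per
     OpGenericCastToPtrExplicit, the result is a null pointer if the cast
     fails. */
  return __spirv_GenericCastToPtrExplicit_ToGlobal(generic_ptr, 5);
}
```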
This adds support under LoongArch for the target("...") attribute.
The supported formats, illustrated in the sketch after this list, are:
- "arch=<arch>" strings, which specify the architecture features for a
function as per the -march=arch option.
- "tune=<cpu>" strings, which specify the tune CPU for a function as
per -mtune.
- "<feature>" / "no-<feature>", which enables/disables the specific feature.
As best I can see, all NVPTX architectures support the generic
address space.
I note there's a FIXME in the target's address space map about 'generic'
still having to be added to the target, but we haven't observed any
issues with it downstream. The generic address space is mapped to the
same target address space as default/private (0), but this isn't
necessarily a problem for users.
Of the 128 bits of a buffer descriptor, only 48 are address bits, so
following the discussion on https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54,
the logical conclusion is to set the index width to 48 bits instead of
the current value of 128.
Most of the test changes are mechanical datalayout updates, but there
is one actual change: the ptrmask test now uses .i48 instead of .i128
and I had to update SelectionDAGBuilder to correctly extend the mask.
Reviewed By: krzysz00
Pull Request: https://github.com/llvm/llvm-project/pull/139419
This patch adds preprocessor macros when Zicfilp CFI is enabled. To be
specific:
+ `#define __riscv_landing_pad 1` when `-fcf-protection=[full|branch]`
+ `#define __riscv_landing_pad_unlabeled 1` when
`-fcf-protection=[full|branch] -mcf-branch-label-scheme=unlabeled`
The macros are proposed in riscv-non-isa/riscv-c-api-doc#76, and the
CLI flags are from riscv-non-isa/riscv-toolchain-conventions#54.
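For example, code can detect the configuration at preprocessing time (a minimal sketch using only the macros above):
```
#if defined(__riscv_landing_pad_unlabeled)
  /* built with -fcf-protection=full|branch -mcf-branch-label-scheme=unlabeled */
#elif defined(__riscv_landing_pad)
  /* landing pads enabled, but not with the unlabeled scheme */
#endif
```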
The "target-features" function attribute is not currently considered
when adding vscale_range to a function. When +sve/+sme are pushed onto
functions with "#pragma attribute push(+sve/+sme)", the function
potentially misses out on optimizations that rely on vscale_range being
present.
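As a hedged sketch of the scenario (using the full `#pragma clang attribute` spelling; the `+sve` string in the target attribute is an assumption here), functions in such a region should now also receive vscale_range:
```
/* Sketch: functions declared here get "+sve" in "target-features" and,
   with this change, should also get a vscale_range attribute. */
#pragma clang attribute push(__attribute__((target("+sve"))), apply_to = function)
void saxpy(float *dst, const float *src, float a, int n);
#pragma clang attribute pop
```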
The instructions are not supported on either 32-bit ELF (due to the lack
of a red zone) or 32-bit AIX, due to the instructions always using the
full 64-bit width of the register inputs.
Following #137070, this PR adds an initial set of Intel `OffloadArch`
values with corresponding predicates that will be used in SYCL
offloading. More Intel architectures will be added in a future PR.
Based on feedback from https://github.com/llvm/llvm-project/pull/136753,
remove the dummy values for OpenCL and make them match the zero default
AS map.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Since commit 613a077b05b8352a48695be295037306f5fca151, `flang` no longer
builds on Solaris/amd64:
```
flang/lib/Evaluate/intrinsics-library.cpp:225:26: error: address of overloaded function 'acos' does not match required type '__float128 (__float128)'
  225 |   FolderFactory<F, F{std::acos}>::Create("acos"),
      |                        ^~~~~~~~~
```
That patch caused the version of `quadmath.h` deep inside `/usr/gcc/<N>`
to be found, so `HAS_QUADMATHLIB` is defined. However, the `struct
HostRuntimeLibrary<__float128, LibraryVersion::Libm>` template is
guarded by `_POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600`, while
`clang` only predefines `_XOPEN_SOURCE=500`.
This code dates back to commit 0c1941cb055fcf008e17faa6605969673211bea3
from 2012. It is long obsolete: `gcc` has predefined `_XOPEN_SOURCE=600`
instead since GCC 4.6, back in 2011.
This patch follows suit.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
- _Float16 is now accepted by Clang.
- The half IR type is fully handled by the backend.
- These values are passed in FP registers and converted to/from float around
each operation.
- Compiler-rt conversion functions are now built for s390x, including the
previously missing extendhfdf2, which was added.
Fixes #50374
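A small sketch of what now works on s390x (the arithmetic is expanded via float as described above):
```
_Float16 twice(_Float16 x) {
  /* Converted to float, multiplied, and truncated back to _Float16;
     compiler-rt provides the conversion helpers where needed. */
  return x * (_Float16)2.0;
}
```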
The recently announced IBM z17 processor implements the architecture
already supported as "arch15" in LLVM. This patch adds support for "z17"
as an alternate architecture name for arch15.
This patch also adds the scheduler description for the z17 processor,
provided by Jonas Paulsson.
This is an alternative to
https://github.com/llvm/llvm-project/pull/122103
In SPIR-V, private global variables have the Private storage class. This
PR adds a new address space which allows the frontend to emit variables
with this storage class when targeting this backend.
This is covered in this proposal: llvm/wg-hlsl@4c9e11a
This PR will cause addrspacecast to show up in several cases, like class
member functions or assignment. Those will have to be handled in the
backend later on, particularly to fix up pointer storage classes in some
functions.
Before this change, global variables were emitted with the 'Function'
storage class, which was wrong.
SPIR-V has strict address space rules: constant globals cannot be in the
default address space.
The OMPIRBuilder change was required for lit tests to pass; we were
missing an addrspacecast.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
- Fixes #132303
- Moves dot2add from a language builtin to a target builtin.
- Sets up the scaffolding for Sema checks for DX builtins.
- Sets up the DirectX backend as able to have target builtins.
- Adds a DX TargetBuiltins emitter in
`clang/lib/CodeGen/TargetBuiltins/DirectX.cpp`.
This patch introduces the `vmem-to-lds-load-insts` target feature, which
can be used to enable builtins `__builtin_amdgcn_global_load_lds` and
`__builtin_amdgcn_raw_ptr_buffer_load_lds` on platforms which have this
feature.
This feature is only available on gfx9/10.
A limitation of using a common target feature for both builtins is that
`__builtin_amdgcn_raw_ptr_buffer_load_lds` could otherwise have been made
available on gfx6/7/8.
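A hedged sketch of guarding uses of the builtins so code still compiles where they are unavailable; whether `__has_builtin` also reflects the required target feature is an assumption here:
```
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_amdgcn_global_load_lds)
  /* The builtin is known to this compiler/target; uses go here.
     (Arguments omitted on purpose; see the builtin's documentation.) */
#endif
```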
The i6400 and i6500 are high-performance multi-core microprocessors from
MIPS that provide best-in-class power efficiency for use in
system-on-chip (SoC) applications. The i6400 and i6500 implement Release 6
of the MIPS64 Instruction Set Architecture with full hardware
multithreading and hardware virtualization support.
I broke this in
f3cd223838:
I should have added this to the `SPIRV64` subclass, but I accidentally
added it to the base `TargetInfo`.
Using an unsupported target should error in the driver well before this,
though.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This patch makes Clang predefine `_CRT_USE_BUILTIN_OFFSETOF` in
MS-compatible modes. The macro makes the `offsetof` provided by MS
UCRT's `<stddef.h>` select the `__builtin_offsetof` version, so with
it Clang (clang-cl) can directly consume UCRT's `offsetof`.
MSVC has predefined the macro as `1` since at least VS 2017 19.14, but I
think it's also OK to define it in "older" compatible modes.
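Roughly, the selection this macro drives looks like the following sketch (not the verbatim UCRT `<stddef.h>` contents):
```
#ifdef _CRT_USE_BUILTIN_OFFSETOF
  #define offsetof(s, m) __builtin_offsetof(s, m)
#else
  #define offsetof(s, m) ((size_t)&(((s *)0)->m))
#endif
```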
Fixes #59689.
As discussed in [1], introduce BPF instructions with load-acquire and
store-release semantics under -mcpu=v4. Define 2 new flags:
BPF_LOAD_ACQ 0x100
BPF_STORE_REL 0x110
A "load-acquire" is a BPF_STX | BPF_ATOMIC instruction with the 'imm'
field set to BPF_LOAD_ACQ (0x100).
Similarly, a "store-release" is a BPF_STX | BPF_ATOMIC instruction with
the 'imm' field set to BPF_STORE_REL (0x110).
Unlike existing atomic read-modify-write operations that only support
BPF_W (32-bit) and BPF_DW (64-bit) size modifiers, load-acquires and
store-releases also support BPF_B (8-bit) and BPF_H (16-bit). An 8- or
16-bit load-acquire zero-extends the value before writing it to a 32-bit
register, just like ARM64 instruction LDAPRH and friends.
As an example (assuming little-endian):
```
long foo(long *ptr) {
    return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
}
```
foo() can be compiled to:
```
db 10 00 00 00 01 00 00  r0 = load_acquire((u64 *)(r1 + 0x0))
95 00 00 00 00 00 00 00  exit
```
opcode (0xdb): BPF_ATOMIC | BPF_DW | BPF_STX
imm (0x00000100): BPF_LOAD_ACQ
Similarly:
```
void bar(short *ptr, short val) {
    __atomic_store_n(ptr, val, __ATOMIC_RELEASE);
}
```
bar() can be compiled to:
```
cb 21 00 00 10 01 00 00  store_release((u16 *)(r1 + 0x0), w2)
95 00 00 00 00 00 00 00  exit
```
opcode (0xcb): BPF_ATOMIC | BPF_H | BPF_STX
imm (0x00000110): BPF_STORE_REL
Inline assembly is also supported.
Add a pre-defined macro, __BPF_FEATURE_LOAD_ACQ_STORE_REL, to let
developers detect this new feature. It can also be disabled using a new
llc option, -disable-load-acq-store-rel.
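A minimal sketch of gating on the macro, so programs only rely on the new instructions when the compiler advertises them:
```
long read_flag(long *p) {
#ifdef __BPF_FEATURE_LOAD_ACQ_STORE_REL
  return __atomic_load_n(p, __ATOMIC_ACQUIRE);  /* load-acquire */
#else
  return __atomic_load_n(p, __ATOMIC_RELAXED);  /* plain BPF_LDX load */
#endif
}
```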
Using __ATOMIC_RELAXED for __atomic_store{,_n}() will generate a "plain"
store (BPF_MEM | BPF_STX) instruction:
```
void foo(short *ptr, short val) {
    __atomic_store_n(ptr, val, __ATOMIC_RELAXED);
}

6b 21 00 00 00 00 00 00  *(u16 *)(r1 + 0x0) = w2
95 00 00 00 00 00 00 00  exit
```
Similarly, using __ATOMIC_RELAXED for __atomic_load{,_n}() will generate
a zero-extending, "plain" load (BPF_MEM | BPF_LDX) instruction:
```
int foo(char *ptr) {
    return __atomic_load_n(ptr, __ATOMIC_RELAXED);
}

71 11 00 00 00 00 00 00  w1 = *(u8 *)(r1 + 0x0)
bc 10 08 00 00 00 00 00  w0 = (s8)w1
95 00 00 00 00 00 00 00  exit
```
Currently __ATOMIC_CONSUME is an alias for __ATOMIC_ACQUIRE. Using
__ATOMIC_SEQ_CST ("sequentially consistent") is not supported yet and
will cause an error:
```
$ clang --target=bpf -mcpu=v4 -c bar.c > /dev/null
bar.c:1:5: error: sequentially consistent (seq_cst) atomic load/store is not supported
    1 | int foo(int *ptr) { return __atomic_load_n(ptr, __ATOMIC_SEQ_CST); }
      |     ^
  ...
```
Finally, rename those isST*() and isLD*() helper functions in
BPFMISimplifyPatchable.cpp based on what the instructions actually do,
rather than their instruction class.
[1]
https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/
This patch adds a function attribute `riscv_vls_cc` for the RISC-V VLS
calling convention. It takes 0 or 1 argument; the argument is the
`ABI_VLEN`, which is the `VLEN` used for passing fixed-vector arguments.
The fixed-vector argument is wrapped as a scalable vector (VLA) using the
`ABI_VLEN`, and the corresponding mechanism is used to handle it. The
range of `ABI_VLEN` is [32, 65536]; if not specified, the default value
is 128.
Here is an example of VLS argument passing:
Non-VLS call:
```
void original_call(__attribute__((vector_size(16))) int arg) {}
=>
define void @original_call(i128 noundef %arg) {
entry:
...
ret void
}
```
VLS call:
```
void __attribute__((riscv_vls_cc(256))) vls_call(__attribute__((vector_size(16))) int arg) {}
=>
define riscv_vls_cc void @vls_call(<vscale x 1 x i32> %arg) {
entry:
...
ret void
}
```
The first, non-VLS call passes a generic 16-byte vector argument as a
flattened integer.
In contrast, the VLS call uses `ABI_VLEN=256`, which wraps the
vector into <vscale x 1 x i32>, where the number of scalable vector
elements is calculated as `ORIG_ELTS * RVV_BITS_PER_BLOCK / ABI_VLEN`.
Note: ORIG_ELTS = Vector Size / Type Size = 128 / 32 = 4.
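For comparison, a hedged sketch with no attribute argument: the default `ABI_VLEN` of 128 applies, so the same 16-byte argument would be wrapped as <vscale x 2 x i32> (4 * 64 / 128 = 2).
```
/* Sketch: no ABI_VLEN argument, so the default of 128 is used. */
void __attribute__((riscv_vls_cc)) vls_default(__attribute__((vector_size(16))) int arg) {}
```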
PsABI PR: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/418
C-API PR: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/68
Add an option and a statement attribute for controlling the emission of
target-specific metadata on atomicrmw instructions in IR.
The RFC for this attribute and option is
https://discourse.llvm.org/t/rfc-add-clang-atomic-control-options-and-pragmas/80641.
Originally a pragma was proposed; it was then changed to a Clang
attribute.
This attribute allows users to specify one, two, or all three options
and must be applied
to a compound statement. The attribute can also be nested, with inner
attributes
overriding the options specified by outer attributes or the target's
default
options. These options will then determine the target-specific metadata
added to atomic
instructions in the IR.
In addition to the attribute, three new compiler options are introduced:
`-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`, and
`-f[no-]atomic-ignore-denormal-mode`.
These compiler options allow users to override the default options
through the Clang driver and front end. `-m[no-]unsafe-fp-atomics` is
aliased to `-f[no-]atomic-ignore-denormal-mode`.
In terms of implementation, the atomic attribute is represented in the
AST by the
existing AttributedStmt, with minimal changes to AST and Sema.
During code generation in Clang, the CodeGenModule maintains the current
atomic options,
which are used to emit the relevant metadata for atomic instructions.
RAII is used to manage the saving and restoring of atomic options when
entering and exiting nested AttributedStmts.
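A hedged sketch of the statement attribute in use; the attribute spelling `clang::atomic` and the option names inside it are assumptions derived from the option names above (see the RFC for the authoritative spellings):
```
void add_to(float *p, float v) {
  /* Options apply to atomic instructions emitted for this compound
     statement; a nested attributed statement could override them. */
  [[clang::atomic(no_remote_memory, ignore_denormal_mode)]] {
    __atomic_fetch_add(p, v, __ATOMIC_RELAXED);
  }
}
```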