llvm-project

Author	SHA1	Message	Date
Marcos Maronas	ce94d63f0f	Make OpenCL an OSType rather than an EnvironmentType. (#170297 ) OpenCL was added as an `EnvironmentType` in https://github.com/llvm/llvm-project/pull/78655, but there is no explanation as to why it was added as such, even after explicitly asking in the PR (https://github.com/llvm/llvm-project/pull/78655#issuecomment-2743162853). This PR makes it an `OSType` instead, which feels more natural, and updates tests accordingly. --------- Co-authored-by: Marcos Maronas <marcos.maronas@intel.com>	2026-02-10 18:45:50 +00:00
Mirko Brkušanin	4280f0d241	[AMDGPU] Add dot4 fp8/bf8 instructions for gfx1170 (#180516 )	2026-02-10 12:14:49 +01:00
Ruoyu Qiu	da0ad392ff	[llvm-objdump][AVR] Detect AVR architecture from ELF flags for disassembling (#180468 ) Reland #174731, resolve cyclic dependency issue. The use of LLVM_Object in LLVM_Util would cause a cyclic dependency. Fix the cyclic dependency by reimplementing `getFeatureSetFromEFlag()`. Original description: --- This PR updates llvm-objdump to detect the specific AVR architecture from the ELF header flags when no specific CPU is provided. Fixes: https://github.com/llvm/llvm-project/issues/146451 Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>	2026-02-09 21:10:14 +08:00
Mirko Brkušanin	45b037cf7a	[AMDGPU] Add fp8/bf8 conversion instructions for gfx1170 (#180191 )	2026-02-09 13:56:43 +01:00
Ganesh	a362593e0d	[X86] AMD Zen 6 Initial enablement (#179150 ) This patch adds initial support for AMD Zen 6 architecture (znver6): - Added znver6 CPU target recognition in Clang and LLVM - Updated compiler-rt CPU model detection for znver6 - Added znver6 to target parser and host CPU detection - Added znver6 to various optimizer tests znver6 features: FP16, AVXVNNIINT8, AVXNECONVERT, AVXIFMA (without BMM).	2026-02-07 09:38:10 +05:30
Henrik G. Olsson	eff21afae0	Revert "[llvm-objdump][AVR] Detect AVR architecture from ELF flags for disassembling" (#180252 ) Reverts llvm/llvm-project#174731 due to introducing a cyclic dependency when building LLVM with modules enabled: LLVM_Utils -> LLVM_Object -> LLVM_Utils	2026-02-06 19:00:32 +00:00
Mirko Brkušanin	20b5849e17	[AMDGPU] Define new target gfx1170 (#180185 )	2026-02-06 14:38:50 +01:00
Ruoyu Qiu	d005cb2953	[llvm-objdump][AVR] Detect AVR architecture from ELF flags for disassembling (#174731 ) This PR updates llvm-objdump to detect the specific AVR architecture from the ELF header flags when no specific CPU is provided. Fixes: #146451 --------- Signed-off-by: RuoyuQiu <cabbaken@outlook.com> Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com> Co-authored-by: qiuruoyu <qiuruoyu@hygon.cn>	2026-02-06 08:58:12 +08:00
Min-Yih Hsu	6441f1c9d5	[RISCV] Introduce a new syntax for processor-specific tuning feature strings (#175063 ) This patch proposes a new tuning feature string format that helps users to build a performance model by "configuring" an existing tune CPU, along with its scheduling model. For example, this string ``` "sifive-x280:single-element-vec-fp64" ``` takes ``sifive-x280`` as the "base" tune CPU and configures it with ``single-element-vec-fp64``. This gives us a performance model that looks exactly like that of ``sifive-x280``, except some of the 64-bit vector floating point instructions now produce only a single element per cycle due to ``single-element-vec-fp64``. This string could eventually be used in places like ``-mtune`` at the frontend. Right now, this patch only implements the parser part, which is put under the TargetParser library. The grammar for this string is: ``` tune-cpu ::= 'tuning CPU name in lower case' directive ::= "[a-zA-Z0-9_-]+" tune-features ::= directive ["," directive]* ``` A directive can and can only _enable_ or _disable_ a certain tuning feature from the tuning CPU. A positive directive, like the ``single-element-vec-fp64`` we just saw, enables an additional tuning feature in the associated tuning model. A negative directive, on the other hand, removes a certain tuning feature. For example, ``sifive-x390`` already has the ``single-element-vec-fp64`` feature, and we can use "sifive-x390:no-single-element-vec-fp64" to create a new performance model that looks nearly the same as ``sifive-x390`` except ``single-element-vec-fp64`` being cut out. In this case, ``no-single-element-vec-fp64`` is a negative directive. There are additional restrictions on what we can put in the list of directives, please refer to the documentation for more details. Right now, this string only accepts directives that are explicitly supported by the tune CPU. For example, "sifive-x280:prefer-w-inst" is not a valid string as ``prefer-w-inst`` is not supported by ``sifive-x280`` at this moment. Vendors of these processors are expected to maintain the compatibility of their supported directives across different versions. --------- Co-authored-by: Sam Elliott <aelliott@qti.qualcomm.com>	2026-02-05 15:22:07 -08:00
Ian Anderson	639a8d1f1d	[Triple] Make a target triple "os" for firmware (#176272 ) Make a Triple::OSType to support a generic "firmware" OS that isn't bare metal, but isn't tied to a specific hardware platform like macOS or iOS. Hook up support for the new OSType in the Darwin toolchain.	2026-02-04 12:15:25 -08:00
Phoebe Wang	2f3935bcee	[X86][APX] Disable PP2/PPX generation on Windows (#178122 ) The PUSH2/POP2/PPX instructions for APX require updates to the Microsoft Windows OS x64 calling convention documented at https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-170 due to lack of suitable unwinder opcodes that can support APX PUSH2/POP2/PPX. The PR request disables this support by default for code robustness; workloads that choose to explicitly enable this support can change the default behavior by explicitly specifying the flag options that enable this support e.g. for experimentation or code paths that do not need unwinder support.	2026-02-02 18:01:44 +08:00
Mariusz Sikora	6de6f7b46b	[AMDGPU] Define gfx1310 target with ELF number 0x50 (#177355 ) For now this is identical to gfx1250. --------- Co-authored-by: Jay Foad <jay.foad@amd.com>	2026-01-22 17:08:38 +01:00
Nikita Popov	4fe4f23e2f	[TargetParser] Fix fp16 feature name for ARM64 Windows feature detection (#176925 ) The feature is called fullfp16, not fp16, see: `979db00b9a/llvm/lib/Target/AArch64/AArch64Features.td (L142)`	2026-01-22 09:23:10 +01:00
Jonas Paulsson	8eccda10d2	[SystemZ] Add SP alignment to the DataLayout string. (#176041 ) Add '-S64' to the SystemZ datalayout string, to avoid overalignment of stack objects. Fixes #173402	2026-01-20 09:54:47 -06:00
Ricardo Jesus	9458d2a0f4	[AArch64][Driver] Allow runtime detection to override default features. (#176340 ) Currently, most extensions controlled through -march and -mcpu options are handled in a bitset of AArch64::ExtensionSet. However, extensions detected at runtime for native compilation are handled in a separate list of CPU features; once most of the parsing logic has run, the bitset is converted to a feature list, added after the features detected at runtime, and the resulting list is used from there on out. This has the downside that runtime-detected features are unable to override default CPU extensions. For example, if a CPU enables +aes in its processor definition, but aes support is not detected at runtime, the feature currently remains enabled---even though unsupported---because default features are enabled after the runtime logic attempts to disable them. This patch inserts runtime-detected features directly into the extension set such that these options can take precedence over extensions enabled by default. The general parsing order for mcpu=native becomes: 1. CPU defaults; 2. Runtime detection; 3. +featureA+nofeatureB options; 4. Other parsing decisions. This allows features that are found to be unsupported at runtime to be removed from the list of features supported by targets that enable them by default. While at it, this also disables rng if not detected at runtime.	2026-01-20 13:09:17 +00:00
Shilei Tian	39bd4562ba	[Clang][AMDGPU] Handle `wavefrontsize32` and `wavefrontsize64` features more robustly (#176599 ) We should not allow `-wavefrontsize32` and `-wavefrontsize64` to be specified at the same time. We should also not allow `-wavefrontsize32` on a target that only supports `wavefrontsize32`, and vice versa.	2026-01-19 18:16:29 -05:00
hev	0a9d480fad	[clang][LoongArch] Add support for LoongArch32 (#172619 ) This patch adds support for LoongArch32, as introduced in la-toolchain-conventions v1.2. Co-authored-by: Sun Haiyong <sunhaiyong@zdbr.net> Link: https://github.com/loongson/la-toolchain-conventions/releases/tag/releases%2Fv1.2 Link: https://gcc.gnu.org/pipermail/gcc-patches/2025-December/703312.html	2026-01-17 16:27:54 +08:00
Henry Linjamäki	9587892183	[Triple] Add "chipstar" OS components (#170655 ) This new component is for Clang driver for selecting HIPSPV toolchain.	2026-01-16 08:24:58 -06:00
Shoreshen	26624d51d1	[AMDGPU]Add specific instruction feature for multicast load (#175503 )	2026-01-13 09:10:09 +08:00
Philipp Tomsich	43138d6272	[Aarch64] Add support for Ampere1C core (#175442 ) This patch adds initial support for the ARMv9.2+ Ampere1C core.	2026-01-12 09:52:23 +01:00
Dan Gohman	597ffbe09d	Rename wasm32-wasi to wasm32-wasip1. (#165345 ) This adds code to recognize "wasm32-wasip1", "wasm32-wasip2", and "wasm32-wasip3" as explicit targets, and adds a deprecation warning when the "wasm32-wasi" target is used, pointing users to the "wasm32-wasip1" target. Fixes #165344. I'm filing this as a draft PR for now, as I've only just now proposed to make this change in #165344.	2026-01-10 00:09:06 +00:00
Craig Topper	bafbf2d58d	[RISCV] Add rules for Zca+Zcb+Zcmp+Zcmpt implying Zce. (#175041 ) The implication rules need to consider whether F is enabled like was done for C in #172860.	2026-01-08 20:07:02 -08:00
Francesco Petrogalli	75d025124a	[RISCV] Add basic Mach-O triple support. (#141682 ) Based on a patch written by Tim Northover (https://github.com/TNorthover).	2026-01-05 23:18:48 +00:00
Jerry Zhang Jian	fc69c804db	[RISCV] Implement conditional Zca implies C extension rule (#172860 ) This change implements the conditional "Zca implies C" rule to match GCC's behavior (PR119122) and the RISC-V specification for MISA.C. The rule is: - For RV32: - No F and no D: Zca alone implies C - F but no D: Zca + Zcf implies C - F and D: Zca + Zcf + Zcd implies C - For RV64: - No D: Zca alone implies C - D: Zca + Zcd implies C This fixes multilib matching issues where LLVM-generated march strings didn't include the C extension when GCC's multilib configurations expected it. Reference: - GCC PR119122: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119122 - RISC-V Zc spec: https://github.com/riscv/riscv-isa-manual/blob/main/src/zc.adoc Signed-off-by: Jerry Zhang Jian <jerry.zhangjian@sifive.com>	2025-12-20 01:15:03 +08:00
Sudharsan Veeravalli	3bf0a8d6e1	[RISCV] Add Xqci feature flag (#172608 ) This patch adds an experimental Xqci feature flag that covers all the sub-extensions in the Qualcomm uC Extension.	2025-12-18 21:32:49 +05:30
Phoebe Wang	d6c2cd69cb	[X86][APX] Check APXSave before enabling APX features (#172834 ) According to APX spec 3.1.4.2, APX instructions can normally execute only when XCR0[APX_F]=1, where APX_F=19.	2025-12-18 22:22:20 +08:00
Zachary Yedidia	2c05ae4b8f	[LFI] Introduce AArch64 LFI Target (#167061 ) This PR is the first step towards introducing LFI into LLVM as a new sub-architecture backend of AArch64. For details, please see the [RFC](https://discourse.llvm.org/t/rfc-lightweight-fault-isolation-lfi-efficient-native-code-sandboxing-upstream-lfi-target-and-compiler-changes/88380), which has been approved for AArch64. This patch creates the `aarch64_lfi` architecture, and marks the appropriate registers as reserved when it is targeted (`x25`, `x26`, `x27`, `x28`). It also adds a Clang driver toolchain for targeting LFI, and updates the compiler-rt CMake to allow builds for the `aarch64_lfi` target. The patch also includes documentation for LFI and the rewrites that will be implemented in future patches. I am planning to split the relevant modifications for LFI into a series of patches, organized as described below (after this one). Please let me know if you'd like me to split the changes in a different way, or provide one big patch. 1. The next patch will introduce the `MCLFIExpander` mechanism for applying the MC-level rewrites needed by LFI, along with the `.lfi_expand` and `.lfi_no_expand` assembly directives when targeting LFI. A preview can be seen on the `lfi-project` [fork](https://github.com/llvm/llvm-project/compare/main...lfi-project:llvm-project:lfi-patchset/aarch64-pr-2). 2. The following patch will create an `MCLFIExpander` for the AArch64 backend that performs LFI expansions. This patch will contain the majority of the LFI-specific logic. 3. The final patch will add an optimization to the rewriter that can eliminate redundant guard instructions that occur within the same basic block. We plan to introduce x86-64 support after further discussion and once the `MCLFIExpander` infrastructure is in place. Please let me know your feedback, and thank you very much for your help and guidance in the review process.	2025-12-16 12:51:02 -08:00
dcandler	23f967ada0	[AArch64] Add support for C1 CPUs (#171124 ) This patch adds initial support for the Arm v9.3 C1 processors: * C1-Nano * C1-Pro * C1-Premium * C1-Ultra For more information on each, see: https://developer.arm.com/Processors/C1-Nano https://developer.arm.com/Processors/C1-Pro https://developer.arm.com/Processors/C1-Premium https://developer.arm.com/Processors/C1-Ultra Technical Reference Manual for C1-Nano: https://developer.arm.com/documentation/107753/latest/ Technical Reference Manual for C1-Pro: https://developer.arm.com/documentation/107771/latest/ Technical Reference Manual for C1-Premium: https://developer.arm.com/documentation/109416/latest/ Technical Reference Manual for C1-Ultra: https://developer.arm.com/documentation/108014/latest/	2025-12-16 14:54:27 +00:00
Mikołaj Piróg	b6f210b215	[X86] Correct CPUID checks for AVX10 (#172350 ) This corrects a wrong condition for avx10 (AVX10Ver is always set to 0/1) and corrects how CPUID for avx10 is queried: per ISE table 1-3 we should query with EAX = 0x24 and ECX = 0x0 -- previously we omitted the latter. Issue reported by user Seraphimt here https://discourse.llvm.org/t/test-for-sys-gethostcpufeatures/89130	2025-12-16 13:59:50 +01:00
Eli Friedman	1b4a74fcdc	[AArch64] Fix typo in 09e57cfd32b0073b63d568835f07251e0d51affb (#172354 )	2025-12-15 11:15:59 -08:00
Eli Friedman	09e57cfd32	[AArch64] Extend Windows CPU feature detection with more features. (#171930 ) Mostly adding feature flags from the newest SDK. (Note that in addition to the obvious, this also affects the compiler-rt SME ABI routines, which rely on FEAT_SME and FEAT_SME2.)	2025-12-15 10:56:17 -08:00
Nikita Popov	b7c0452a9a	[PowerPC][AIX] Specify correct ABI alignment for double (#144673 ) Add `f64:32:64` to the data layout for AIX, to indicate that doubles have a 32-bit ABI alignment and 64-bit preferred alignment. Clang was already taking this into account, but it was not reflected in LLVM's data layout. A notable effect of this change is that `double` loads/stores with 4 byte alignment are no longer considered "unaligned" and avoid the corresponding unaligned access legalization. I assume that this is correct/desired for AIX. (The codegen previously already relied on this in some places related to the call ABI simply by dint of assuming certain stack locations were 8 byte aligned, even though they were only actually 4 byte aligned.) Fixes https://github.com/llvm/llvm-project/issues/133599.	2025-12-11 08:57:26 +01:00
Mirko Brkušanin	5759a3a779	[AMDGPU] Add s_wakeup_barrier instruction for gfx1250 (#170501 )	2025-12-10 09:45:13 +01:00
Craig Topper	d18cdc99bc	[RISCVInsertVSETVLI] Don't allow getSEW/getLMUL to be called for hasSEWLMULRatioOnly(). NFC (#171554 ) Refactor some logic in transferBefore to handle hasSEWLMULRatioOnly() before calling getSEW/getLMUL. Update adjustIncoming to use getSEWLMULRatio(). Update the interface of RISCVVType::getSameRatioLMUL to take the ratio instead of SEW and LMUL. Update the few other callers to call RISCVVType::getSEWLMULRatio first.	2025-12-09 22:06:15 -08:00
Alexandros Lamprineas	1b82c16fa8	[FMV][AArch64] Allow user to override version priority. (#150267 ) Implements https://github.com/ARM-software/acle/pull/404 This allows the user to specify "featA+featB;priority=[1-255]" where priority=255 means highest priority. If the explicit priority string is omitted then the priority of "featA+featB" is implied, which is lower than priority=1. Internally this gets expanded using special FMV features P0 ... P7 which can encode up to 256-1 priority levels (excluding all zeros). Those do not have corresponding detection bit at pos FEAT_#enum so I made this field optional in FMVInfo. Also they don't affect the codegen or name mangling of versioned functions.	2025-12-09 13:31:10 +00:00
Nikita Popov	9dc3255cb9	[Clang] Use DataLayout from TargetParser (#171135 ) This switches clang to use the data layouts from TargetParser, instead of maintaining its own copy of data layouts, which are required to match the backend data layouts. For now I've kept explicit calls to resetDataLayout(), just with the argument implied by the triple and ABI. Ideally this would happen automatically, but the way these classes are initialized currently doesn't offer a great place to do this. Previously resetDataLayout() also set the UserLabelPrefix. I've separated this out, with a reasonable default so that most targets don't need to worry about it. I've kept the explicit data layouts for TCE and SPIR (without the V). These seem to not correspond to real LLVM targets. I've also fixed the XCore data layout in TargetParser, which was incorrectly set to the same one as Xtensa. It was previously unused.	2025-12-09 07:42:02 +00:00
Mikołaj Piróg	e3044cd552	[X86] Sync multiversion features with libgcc and refactor internal feature tables (#168750 ) Compiler-rt internal feature table is synced with the one in libgcc (common/config/i386/i386-cpuinfo.h). LLVM internal feature table is refactored to include a field ABI_VALUE, so we won't be relying on ordering to keep the values correct. The table is also synced to the one in compiler-rt.	2025-11-27 15:29:16 +01:00
Eli Friedman	590bb3e8e6	[AArch64] Improve host feature detection. (#160410 ) SVE depends on a combination of host support and operating system support. Sometimes those don't line up with detected host CPU name; make sure SVE is disabled when it isn't available. Implement this for both Windows and Linux. (We don't have a codepath for other operating systems. If someone wants to implement this, it should be possible to adapt fmv code from compiler-rt.) While I'm here, also add support for detecting other Windows CPU features. For Windows, declare constants ourselves so the code builds on older SDKs; we also do this in compiler-rt.	2025-11-24 14:08:50 -08:00
Shoreshen	52a58a4193	[AMDGPU] Adding instruction specific features (#167809 )	2025-11-19 11:06:00 +08:00
Kazu Hirata	99bf41cd11	[TargetParser] Use range-based for loops (#168296 ) While I am at it, this patch converts one of the loops to use llvm::is_contained. Identified with modernize-loop-convert.	2025-11-17 07:59:45 -08:00
Mikołaj Piróg	b6fd3c62bb	[X86] Enable APX and AVX10.2 on NVL (#168061 ) Per Intel Architecture Instruction Set Extensions Programming Reference rev. 60 (https://cdrdv2.intel.com/v1/dl/getContent/671368), table 1-2, NVL supports APX and AVX10.2	2025-11-17 15:46:58 +01:00
Kazu Hirata	2394eb1180	[TargetParser] Avoid repeated hash lookups (NFC) (#168216 )	2025-11-16 08:08:39 -08:00
Mikołaj Piróg	8f6c7aa2b1	[X86] Remove vector length (256 vs 512) distinction of AVX10 (#167736 ) As in title. AVX10.x doesn't distinguish between available vector lengths. -mattr=avx10.x-512 and the definition of macros with _512 are kept for compatibility. Bit positions of avx10.1/2 features in compiler-rt and X86TargetParser are synced to match those in GCC.	2025-11-15 15:51:06 +01:00
serge-sans-paille	04b05998b1	Remove unused <array> and <list> inclusion (#167116 )	2025-11-09 15:15:10 +00:00
Walter Lee	0902a6b8de	Add missing #include (fix for #166997 )	2025-11-08 16:37:31 -05:00
Amit Kumar Pandey	36d477850f	[ASan] Skip explicit check of 'xnack' feature for gfx1250 && gfx1251. (#166754 ) Xnack processing is essential and performed at the frontend to enable ASan instrumentation for AMDGPU device code. Certain AMDGPU subtargets like gfx1250 && gfx1251 don't have to enable 'xnack+' explicitly in '--offload-arch=' for device ASan instrumentation.	2025-11-06 21:42:42 +05:30
Jakub Kuderski	4c21d0cb14	[ADT] Prepare to deprecate variadic `StringSwitch::Cases`. NFC. (#166020 ) Update all uses of variadic `.Cases` to use the initializer list overload instead. I plan to mark variadic `.Cases` as deprecated in a followup PR. For more context, see https://github.com/llvm/llvm-project/pull/163117.	2025-11-02 00:12:33 +00:00
Mikołaj Piróg	5322fb6268	[X86] Remove AMX-TRANSPOSE (#165556 ) Per Intel Architecture Instruction Set Extensions Programming Reference rev. 59 (https://cdrdv2.intel.com/v1/dl/getContent/671368), Revision History entry for revision -59, AMX-TRANSPOSE was removed	2025-10-31 12:50:21 +01:00
Jens Reidel	331b3eb489	[PowerPC] Take ABI into account for data layout (#149725 ) Prior to this change, the data layout calculation would not account for explicitly set `-mabi=elfv2` on `powerpc64-unknown-linux-gnu`, a target that defaults to `elfv1`. This is loosely inspired by the equivalent ARM / RISC-V code. `make check-llvm` passes fine for me, though AFAICT all the tests specify the data layout manually so there isn't really a test for this and I am not really sure what the best way to go about adding one would be. Signed-off-by: Jens Reidel <adrian@travitia.xyz>	2025-10-31 10:30:53 +01:00
Kazu Hirata	817aff6960	[llvm] Use nullptr instead of 0 or NULL (NFC) (#165396 ) Identified with modernize-use-nullptr.	2025-10-28 16:15:01 -07:00
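
The tuning feature string grammar described in commit 6441f1c9d5 can be illustrated with a small standalone sketch. This is not the TargetParser implementation: the function name `parse_tune_string` is hypothetical, the `no-` prefix for negative directives is inferred from the commit's `no-single-element-vec-fp64` example, and the sketch does not check whether a directive is actually supported by the given tune CPU (which the real parser is said to do).

```python
import re

# Directive pattern taken from the grammar in the commit message.
DIRECTIVE_RE = re.compile(r"[a-zA-Z0-9_-]+")


def parse_tune_string(text):
    """Split '<tune-cpu>:<directive>[,<directive>]*' into its parts.

    Returns (cpu, enabled, disabled). Hypothetical helper for illustration;
    it validates only the syntax, not per-CPU directive support.
    """
    cpu, sep, features = text.partition(":")
    if not sep or not cpu or cpu != cpu.lower():
        raise ValueError(f"expected '<lower-case cpu>:<directives>', got {text!r}")

    enabled, disabled = [], []
    for directive in features.split(","):
        if not DIRECTIVE_RE.fullmatch(directive):
            raise ValueError(f"malformed directive {directive!r}")
        # Assumption: a leading 'no-' marks a negative directive.
        if directive.startswith("no-"):
            disabled.append(directive[len("no-"):])
        else:
            enabled.append(directive)
    return cpu, enabled, disabled


if __name__ == "__main__":
    print(parse_tune_string("sifive-x280:single-element-vec-fp64"))
    print(parse_tune_string("sifive-x390:no-single-element-vec-fp64"))
```

Separating positive and negative directives up front keeps the later step (applying them to the base CPU's tuning features) a simple set union and difference.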


565 Commits
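
The conditional "Zca implies C" rule listed in commit fc69c804db can be written as a short predicate. This is a sketch of the rule as stated in the commit message only, not the LLVM code: the function name `zca_implies_c` is hypothetical, extension names are assumed to be lower-case strings, and the input set is assumed to already contain any implied extensions (for example, F whenever D is present).

```python
def zca_implies_c(xlen, extensions):
    """Return True if the enabled Zc* extensions together imply C.

    xlen is 32 or 64; extensions is an iterable of lower-case extension
    names. Hypothetical helper mirroring the rule table in the commit.
    """
    exts = set(extensions)
    if "zca" not in exts:
        return False
    # With D enabled, Zcd is required on both RV32 and RV64.
    if "d" in exts and "zcd" not in exts:
        return False
    # With F enabled, Zcf is required, but only on RV32.
    if xlen == 32 and "f" in exts and "zcf" not in exts:
        return False
    return True


if __name__ == "__main__":
    print(zca_implies_c(32, {"zca", "f", "zcf"}))
    print(zca_implies_c(64, {"zca", "f", "d"}))
```

Expressing the table as "each floating-point extension demands its compressed counterpart" covers all five cases without enumerating them one by one.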