129 Commits

Author SHA1 Message Date
Phoebe Wang
3fadb15afa
[X86][APX] Combine MOVABS+JMP to JMPABS when in no-PIC large code model (#186402) 2026-03-16 16:16:27 +08:00
Ganesh
a362593e0d
[X86] AMD Zen 6 Initial enablement (#179150)
This patch adds initial support for AMD Zen 6 architecture (znver6):

- Added znver6 CPU target recognition in Clang and LLVM
- Updated compiler-rt CPU model detection for znver6
- Added znver6 to target parser and host CPU detection
- Added znver6 to various optimizer tests

znver6 features: FP16, AVXVNNIINT8, AVXNECONVERT, AVXIFMA (without BMM).
2026-02-07 09:38:10 +05:30
Phoebe Wang
2f3935bcee
[X86][APX] Disable PP2/PPX generation on Windows (#178122)
The PUSH2/POP2/PPX instructions for APX require updates to the Microsoft
Windows OS x64 calling convention documented at
https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-170
due to lack of suitable unwinder opcodes that can support APX
PUSH2/POP2/PPX.

The PR request disables this support by default for code robustness;
workloads that choose to explicitly enable this support can change the
default behavior by explicitly specifying the flag options that enable
this support e.g. for experimentation or code paths that do not need
unwinder support.
2026-02-02 18:01:44 +08:00
Nikita Popov
4fe4f23e2f
[TargetParser] Fix fp16 feature name for ARM64 Windows feature detection (#176925)
The feature is called fullfp16, not fp16, see:
979db00b9a/llvm/lib/Target/AArch64/AArch64Features.td (L142)
2026-01-22 09:23:10 +01:00
Ricardo Jesus
9458d2a0f4
[AArch64][Driver] Allow runtime detection to override default features. (#176340)
Currently, most extensions controlled through -march and -mcpu options
are handled in a bitset of AArch64::ExtensionSet. However, extensions
detected at runtime for native compilation are handled in a separate
list of CPU features; once most of the parsing logic has run, the bitset
is converted to a feature list, added after the features detected at
runtime, and the resulting list is used from there on out.

This has the downside that runtime-detected features are unable to
override default CPU extensions. For example, if a CPU enables +aes in
its processor definition, but aes support is not detected at runtime,
the feature currently remains enabled---even though
unsupported---because default features are enabled after the runtime
logic attempts to disable them.

This patch inserts runtime-detected features directly into the extension
set such that these options can take precedence over extensions enabled
by default. The general parsing order for mcpu=native becomes:
1. CPU defaults;
2. Runtime detection;
3. +featureA+nofeatureB options;
4. Other parsing decisions.

This allows features that are found to be unsupported at runtime to be
removed from the list of features supported by targets that enable them
by default.

While at it, this also disables rng if not detected at runtime.
2026-01-20 13:09:17 +00:00
Philipp Tomsich
43138d6272
[Aarch64] Add support for Ampere1C core (#175442)
This patch adds initial support for the ARMv9.2+ Ampere1C core.
2026-01-12 09:52:23 +01:00
Phoebe Wang
d6c2cd69cb
[X86][APX] Check APXSave before enabling APX features (#172834)
According to APX spec 3.1.4.2, APX instructions can normally execute
only when XCR0[APX_F]=1, where APX_F=19.
2025-12-18 22:22:20 +08:00
dcandler
23f967ada0
[AArch64] Add support for C1 CPUs (#171124)
This patch adds initial support for the Arm v9.3 C1 processors:
* C1-Nano
* C1-Pro
* C1-Premium
* C1-Ultra

For more information on each, see:
https://developer.arm.com/Processors/C1-Nano
https://developer.arm.com/Processors/C1-Pro
https://developer.arm.com/Processors/C1-Premium
https://developer.arm.com/Processors/C1-Ultra

Technical Reference Manual for C1-Nano:
https://developer.arm.com/documentation/107753/latest/

Technical Reference Manual for C1-Pro:
https://developer.arm.com/documentation/107771/latest/

Technical Reference Manual for C1-Premium:
https://developer.arm.com/documentation/109416/latest/

Technical Reference Manual for C1-Ultra:
https://developer.arm.com/documentation/108014/latest/
2025-12-16 14:54:27 +00:00
Mikołaj Piróg
b6f210b215
[X86] Correct CPUID checks for AVX10 (#172350)
This corrects a wrong condition for avx10 (AVX10Ver is always set to
0/1) and corrects how CPUID for avx10 is queried: per ISE table 1-3 we
should query with EAX = 0x24 and ECX = 0x0 -- previously we omitted the
latter.

Issue reported by user Seraphimt here
https://discourse.llvm.org/t/test-for-sys-gethostcpufeatures/89130
2025-12-16 13:59:50 +01:00
Eli Friedman
1b4a74fcdc
[AArch64] Fix typo in 09e57cfd32b0073b63d568835f07251e0d51affb (#172354) 2025-12-15 11:15:59 -08:00
Eli Friedman
09e57cfd32
[AArch64] Extend Windows CPU feature detection with more features. (#171930)
Mostly adding feature flags from the newest SDK.

(Note that in addition to the obvious, this also affects the compiler-rt
SME ABI routines, which rely on FEAT_SME and FEAT_SME2.)
2025-12-15 10:56:17 -08:00
Eli Friedman
590bb3e8e6
[AArch64] Improve host feature detection. (#160410)
SVE depends on a combination of host support and operating system
support. Sometimes those don't line up with detected host CPU name; make
sure SVE is disabled when it isn't available. Implement this for both
Windows and Linux. (We don't have a codepath for other operating
systems. If someone wants to implement this, it should be possible to
adapt fmv code from compiler-rt.)

While I'm here, also add support for detecting other Windows CPU
features.

For Windows, declare constants ourselves so the code builds on older
SDKs; we also do this in compiler-rt.
2025-11-24 14:08:50 -08:00
Kazu Hirata
99bf41cd11
[TargetParser] Use range-based for loops (#168296)
While I am at it, this patch converts one of the loops to use
llvm::is_contained.

Identified with modernize-loop-convert.
2025-11-17 07:59:45 -08:00
Mikołaj Piróg
5322fb6268
[X86] Remove AMX-TRANSPOSE (#165556)
Per Intel Architecture Instruction Set Extensions Programming Reference
rev. 59 (https://cdrdv2.intel.com/v1/dl/getContent/671368), Revision
History entry for revision -59, AMX-TRANSPOSE was removed
2025-10-31 12:50:21 +01:00
Kazu Hirata
817aff6960
[llvm] Use nullptr instead of 0 or NULL (NFC) (#165396)
Identified with modernize-use-nullptr.
2025-10-28 16:15:01 -07:00
Albert Huang
aa550cdc5f
[ARM] [AArch32] Add support for Arm China STAR-MC3 CPU (#163709)
STAR-MC3 is an Armv8.1m CPU.
Technical specificationa available at:
https://www.armchina.com/download/Documents/TRM?infoId=240
2025-10-27 08:55:28 +00:00
Mikołaj Piróg
22a2a82054
[X86] Add support for Nova Lake (#163552)
Add support for Nova Lake, per Intel Architecture Instruction Set
Extensions Programming Reference rev. 59
(https://cdrdv2.intel.com/v1/dl/getContent/671368)
2025-10-16 14:58:23 +02:00
Kazu Hirata
f2306b6304
[llvm] Replace LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]] (NFC) (#163507)
This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]].  Note
that this patch adjusts the placement of [[maybe_unused]] to comply
with the C++17 language.
2025-10-15 06:54:14 -07:00
Mikołaj Piróg
0e6557d71c
[X86] Add support for Wildcat Lake (#163214)
Add support for Wildcat Lake, per Intel Architecture Instruction Set
Extensions Programming Reference rev. 59
(https://cdrdv2.intel.com/v1/dl/getContent/671368)
2025-10-15 10:36:20 +02:00
Umesh Kalvakuntla
b3a1c77782
[X86] Fixes for AMD znver5 enablement (#159237)
- cpuid bit for prefetchi is different from Intel
(https://docs.amd.com/v/u/en-US/24594_3.37)
 - Fix cpu family model numbers
2025-09-17 13:25:45 +08:00
Phoebe Wang
94b164c218
[X86][AVX10] Remove EVEX512 and AVX10-256 implementations (#157034)
The 256-bit maximum vector register size control was removed from AVX10
whitepaper, ref: https://cdrdv2.intel.com/v1/dl/getContent/784343

We have warned these options in LLVM21 through #132542. This patch
removes underlying implementations in LLVM22.
2025-09-05 14:08:59 +00:00
Pawan Nirpal
a5ba6067d6
[Clang][NFC] Use Hex Encoding for Intel CPU CPUID family (#153004)
Use Hex Encoding for CPUID family to match number format with Intel ISE
rev.58:
https://cdrdv2.intel.com/v1/dl/getContent/671368
2025-08-14 18:36:34 +02:00
Daniel Paoliello
7694856fdd
Fix TargetParserTests for big-endian hosts (#152407)
The new `sys::detail::getHostCPUNameForARM` for Windows (#151596) was
implemented using a C++ bit-field, which caused the associated unit
tests to fail on big-endian machines as it assumed a little-endian
layout.

This change switches from the C++ bit-field to LLVM's `BitField` type
instead.
2025-08-06 16:50:28 -07:00
Daniel Paoliello
a418fa7cdc
[win][aarch64] Add support for detecting the Host CPU on Arm64 Windows (#151596)
Uses the `CP 4000` registry keys under
`HKLM\HARDWARE\DESCRIPTION\System\CentralProcessor\*` to get the
Implementer and Part, which is then provided to a modified form of
`getHostCPUNameForARM` to map to a CPU.

On my local Surface Pro 11 `llc --version` reports:
```
> .\build\bin\llc.exe --version
LLVM (http://llvm.org/):
  LLVM version 22.0.0git
  Optimized build with assertions.
  Default target: aarch64-pc-windows-msvc
  Host CPU: oryon-1
```
2025-08-06 11:39:41 -07:00
Daniel Paoliello
4adce336f4
[win][arm64ec] Fixes to unblock building LLVM and Clang as Arm64EC (#150068)
These changes allow LLVM and Clang to be built with Clang targeting
Arm64EC using the MSVC linker.

Built with these options:
```
-DLLVM_ENABLE_PROJECTS="clang"
-DLLVM_HOST_TRIPLE=arm64ec-pc-windows-msvc
-DCMAKE_C_COMPILER=clang-cl.exe
-DCMAKE_C_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_CXX_COMPILER=clang-cl.exe
-DCMAKE_CXX_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_LINKER_TYPE=MSVC
```
2025-07-31 09:30:05 -07:00
Kazu Hirata
96bde11e30
[TargetParser] Remove const from a return type (NFC) (#149255)
getHostCPUFeatures constructs and returns a temporary instance of
StringMap<bool>.  We don't need const on the return type.
2025-07-17 07:22:51 -07:00
Elvina Yakubova
69835d8f6d
[clang][AArch64] Parse more features in getHostCPUFeatures (#146323)
Add parsing of some crypto features to display them properly when
-mcpu=native is used
2025-07-09 11:43:08 +01:00
Ricardo Jesus
84e54515bc
[AArch64] Add support for -mcpu=gb10. (#146515)
This patch adds support for -mcpu=gb10 (NVIDIA GB10). This is a
big.LITTLE cluster of Cortex-X925 and Cortex-A725 cores. The appropriate
MIDR numbers are added to detect them in -mcpu=native.

We did not add an -mcpu=cortex-x925.cortex-a725 option because GB10 does
include the crypto instructions which we want on by default, and the
current convention is to not enable such extensions for Arm Cortex cores
in -mcpu where they are optional in the IP.

Relevant GCC patch:
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687005.html
2025-07-07 11:14:26 +01:00
Kazu Hirata
29b2b2263f
[TargetParser] Use StringRef::consume_front (NFC) (#147202)
While we are at it, this patch switches to a range-based for loop.
2025-07-06 19:05:45 -07:00
Pengcheng Wang
ce621041c2
[RISCV] Get host CPU name via hwprobe (#142745)
We can get the `mvendorid/marchid/mimpid` via hwprobe and then we
can compare these IDs with those defined in processors to find the
CPU name.

With this change, `-mcpu/-mtune=native` can set the proper name.
2025-06-12 16:39:57 +08:00
Ties Stuij
269f5fe91e
[AARCH64] Add support for Cortex-A320 (#139055)
This patch adds initial support for the recently announced Armv9
Cortex-A320 processor.

For more information, including the Technical Reference Manual, see:
https://developer.arm.com/Processors/Cortex-A320

---------

Co-authored-by: Oliver Stannard <oliver.stannard@arm.com>
2025-05-09 16:24:48 +01:00
Ulrich Weigand
80267f8148
Support z17 processor name and scheduler description (#135254)
The recently announced IBM z17 processor implements the architecture
already supported as "arch15" in LLVM. This patch adds support for "z17"
as an alternate architecture name for arch15.

This patch also add the scheduler description for the z17 processor,
provided by Jonas Paulsson.
2025-04-11 00:20:58 +02:00
Ricardo Jesus
847e46ca01
[AArch64] Add initial support for -mcpu=olympus. (#132368)
This patch adds support for the NVIDIA Olympus core.

This does not add any special tuning decisions, and those may come
later.
2025-03-25 08:09:04 +00:00
Craig Topper
5dc815503f
[RISCV] Add ESWIN EIC770X (SiFive P550) to getHostCPUNameForRISCV. (#125277)
This enables -mcpu=native for the HiFive Premier P550 board.
2025-01-31 17:12:34 -08:00
tangaac
19834b4623
[LoongArch] Support sc.q instruction for 128bit cmpxchg operation (#116771)
Two options for clang
  -mno-scq:                Disable sc.q instruction.
  -mscq:                   Enable sc.q instruction.
The default is -mno-scq.
2025-01-23 12:11:07 +08:00
Ulrich Weigand
8424bf207e [SystemZ] Add support for new cpu architecture - arch15
This patch adds support for the next-generation arch15
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Detection of arch15 as host processor.
- Assembler/disassembler support for new instructions.
- Exploitation of new instructions for code generation.
- New vector (signed|unsigned|bool) __int128 data types.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining  __VEC__ == 10305.

Note: No currently available Z system supports the arch15
architecture.  Once new systems become available, the
official system name will be added as supported -march name.
2025-01-20 19:30:21 +01:00
Mads Marquart
a082cc145f
Add Apple M4 host detection (#117530)
Add Apple M4 host detection, which fixes
https://github.com/rust-lang/rust/issues/133414.

Also add support for older ARM families (this is likely never going to
get used, since only macOS is officially supported as host OS, but nice
to have for completeness sake). Error handling (checking
`CPUFAMILY_UNKNOWN`) is also included here.

Finally, add links to extra documentation to make it easier for others
to update this in the future.

NOTE: These values are taken from `mach/machine.h` the Xcode 16.2 SDK,
and has been confirmed on an M4 Max in
https://github.com/rust-lang/rust/issues/133414#issuecomment-2499123337.
2025-01-16 08:15:12 -08:00
Craig Topper
fd38a95586 [TargetParser] Use StringRef::split that takes a char separator instead of StringRef separator. NFC 2025-01-04 12:31:46 -08:00
pcc
38eaea73ca
TargetParser: AArch64: Add part numbers for Apple CPUs.
Part numbers taken from:
https://github.com/AsahiLinux/m1n1/blob/main/src/chickens.c

Reviewers: ahmedbougacha, jroelofs

Reviewed By: jroelofs

Pull Request: https://github.com/llvm/llvm-project/pull/119777
2024-12-12 16:52:58 -08:00
Kinoshita Kotaro
a1197a2ca8
[AArch64] Add initial support for FUJITSU-MONAKA (#118432)
This patch adds initial support for FUJITSU-MONAKA CPU (-mcpu=fujitsu-monaka).

The scheduling model will be corrected in the future.
2024-12-09 09:56:02 +09:00
Phoebe Wang
a63931292b
[X86] Fix typo of gracemont (#118486) 2024-12-03 20:56:52 +08:00
Phoebe Wang
3348b4688f
[X86][compiler-rt] Split CPU names even they have the same subtype (#118237)
Fixes: #118205
2024-12-02 18:51:19 +08:00
tangaac
427be07675
[LoongArch] Support amcas[_db].{b/h/w/d} instructions. (#114189)
Two options for clang: -mlamcas & -mno-lamcas.
Enable or disable amcas[_db].{b/h} instructions.
The default is -mno-lamcas.
Only works on LoongArch64.
2024-11-27 17:36:13 +08:00
tangaac
f4379db496
[LoongArch] Support LA V1.1 feature that div.w[u] and mod.w[u] instructions with inputs not signed-extended. (#116764)
Two options for clang
-mdiv32: Use div.w[u] and mod.w[u] instructions with input not
sign-extended.
-mno-div32: Do not use div.w[u] and mod.w[u] instructions with input not
sign-extended.
The default is -mno-div32.
2024-11-26 21:57:29 +08:00
Weining Lu
e70f9e2096 [LoongArch] Remove the added in #116762 2024-11-25 09:33:55 +08:00
tangaac
1d4602070f
[LoongArch] Support LA V1.1 feature ld-seq-sa that don't generate dbar 0x700. (#116762)
Two options for clang
-mld-seq-sa: Do not generate load-load barrier instructions (dbar 0x700)
-mno-ld-seq-sa: Generate load-load barrier instructions (dbar 0x700)
The default is -mno-ld-seq-sa
2024-11-22 17:34:15 +08:00
Freddy Ye
97836bed63
Reland "[X86] Support -march=diamondrapids (#113881)" (#116564)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-11-18 10:40:32 +08:00
Freddy Ye
90e92239bd
Revert "[X86] Support -march=diamondrapids (#113881)" (#116563)
This reverts commit 826b845c9e97448395431be3e4e5da585bd98c5e.
2024-11-18 08:45:28 +08:00
Freddy Ye
826b845c9e
[X86] Support -march=diamondrapids (#113881)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-11-18 08:31:17 +08:00
tangaac
2283d50447
[LoongArch] add la v1.1 features for sys::getHostCPUFeatures (#115832)
Two features (i.e. `frecipe` and `lam-bh`) are added to
`sys.getHostCPUFeatures`. More features will be added in future.

In addition, this patch adds the features returned by
`sys.getHostCPUFeature` when `-march=native`.
2024-11-14 11:25:32 +08:00