580 Commits

Author SHA1 Message Date
Mirko Brkušanin
5d9eb0c76a
[AMDGPU] Define new targets gfx1171 and gfx1172 (#187735) 2026-04-01 18:16:11 +02:00
Matt Arsenault
d5d32d3052
Triple: Expose parseArch as a public method (#189648)
Clang has some code which is doing a direct arch name
string compare which should really be recognizing anything
usable as a triple architecture. It makes more sense to
directly parse the architecture than to construct a temporary
triple just to see what the parsed arch is.

For some reason the existing public parsing method is
getArchTypeForLLVMName. I'm not fully sure what the difference 
between the two is supposed to be. My current guess is
getArchTypeForLLVMName is only supposed to handle the 
canonical architecture name.
2026-03-31 13:26:38 +00:00
Chinmay Deshpande
e044c4ad81
[AMDGPU] Add target features for SWMMAC instructions (#185785)
Introduce `swmmac-gfx1200-insts` and `swmmac-gfx1250-insts`
2026-03-18 13:52:34 -07:00
Phoebe Wang
3fadb15afa
[X86][APX] Combine MOVABS+JMP to JMPABS when in no-PIC large code model (#186402) 2026-03-16 16:16:27 +08:00
Jay Foad
2e614f3538
[TargetParser] Introduce AMDGPUTargetParser.def. NFCI. (#186137)
Define AMDGPU GPUs in a separate .def file similar to other targets, so
they are listed in just one place instead of three.
2026-03-12 16:10:41 +00:00
Jay Foad
1f704b871a
[TargetParser] Simplify getArchFamilyNameAMDGCN. NFC. (#186122) 2026-03-12 15:35:22 +00:00
Justin Bogner
6e93c4a19d
[DirectX] Specify element-aligned vectors (#180622)
Use the new "ve" Data Layout specifier to indicate that vectors are
element-aligned for the target.

Part of #123968
2026-03-11 15:49:48 -07:00
Joseph Huber
5a88dffc40
[Clang] Only define wchar_size module flag if non-standard (#184668)
Summary:
This PR simply changes the behavior of the `wchar_size` flag. Currently,
we emit this in all cases for all targets. This causes problems during
LLVM-IR linking, specifically because this would vary between Linux and
Windows in unintuitive ways. Now we have an llvm::Triple helper to
determine the size from the known values. The module flag will only be
emitted if these do not match (indicating a non-standard environment).

In addition to fixing AMDGCN bitcode linking, this also means we don't
need to bloat *every* IR module compiled by clang with this flag. The
changed tests reflect this: one less unnecessary piece of metadata.
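
As a rough illustration of the rule described above, the sketch below emits a `wchar_size` value only when it differs from the expected size for the OS. The table and function names are hypothetical and are not the actual llvm::Triple helper; the two table entries (2 bytes on Windows, 4 on Linux) are an assumed subset.

```python
# Hypothetical model of "emit the flag only if non-standard".
# Not the llvm::Triple API; names and table contents are assumptions.
DEFAULT_WCHAR_SIZE = {"windows": 2, "linux": 4}  # bytes, assumed subset


def wchar_size_flag(os_name, actual_size):
    """Return the flag value to emit, or None when the size is standard."""
    expected = DEFAULT_WCHAR_SIZE.get(os_name)
    if expected is not None and expected == actual_size:
        return None  # standard environment: nothing to emit
    # Non-standard size, or an OS this sketch does not know about.
    return actual_size
```

In this model an unknown OS always gets the flag, which is a conservative choice made for the sketch only.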
2026-03-04 16:13:48 -06:00
Ankit Aggarwal
5e6f0c45a8
[Clang][Hexagon] Add QURT as recognized OS in target triple (#183622)
Add support for QURT as a recognized OS type in the LLVM triple
system, and define the __qurt__ predefined macro when targeting it.
2026-02-26 16:24:13 -08:00
Stanislav Mekhanoshin
33fd75f55d
[AMDGPU] Add gfx12-5-generic subtarget (#183381)
This is functionally equivalent to gfx1250.
2026-02-25 13:34:48 -08:00
Mirko Brkušanin
790bef9d46
[AMDGPU] Remove V_DOT2ACC_F32_F16 from gfx1170 (#182088) 2026-02-18 20:16:22 +01:00
Mirko Brkušanin
829afc4c91
[AMDGPU] Add WMMA and SWMMAC instructions for gfx1170 (#180731)
Introduce two new subtarget features:

- WMMA256bInsts for GFX11 WMMA instructions and
- WMMA128bInsts for GFX1170 and GFX12 WMMA and SWMMAC instructions

Some WMMA instructions have changed from GFX 11.0 to GFX 11.7 so new
Real versions were added with "_gfx1170" suffix. For consistency all
WMMA and SWMMAC GFX11.7 instructions use this suffix.

To resolve decoding issues between different formats for some WMMA
instructions between GFX 11 and GFX 11.7, new decoding tables were
added.
2026-02-18 19:17:48 +01:00
Craig Topper
a2e14e41cf
[RISCV] Add Xsfmm32a shorthand extension. (#181957)
This extension is shorthand for Xsfmm32a8i, Xsfmm32a16f, and
Xsfmm32a32f.

It was mistakenly left out of an earlier version of the public
specification, but is now present. See
https://www.sifive.com/document-file/xsfmm-matrix-extensions-specification
2026-02-17 21:18:52 -08:00
Sam Elliott
e640d38b0d
[RISCV] Simplify Extension Predicates, Compatibility (#181255)
This pushes some of our simplifications to extension dependencies into
other parts of RISCVISAInfo and into the tablegen predicates.

The key affected pieces are:
- Error messages around Zcd incompatibilities now reference only `zcd`.
- We now have a big list of extensions that are rv32-only.
2026-02-13 18:28:41 -08:00
Sam Elliott
96c7a1148d
[RISCV] Combine Xqci Extensions in Arch Strings (#181033)
There are no instructions in the Xqci extension itself; it is just an
alias of a group. If we have all the items in the group, then we should
add `xqci` to the list of extensions we have.

This helps with multilib matching.
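
A minimal sketch of the combining rule, assuming a deliberately shortened member list: `XQCI_GROUP` below is an illustrative subset, not the full set of Xqci sub-extensions, and `combine_xqci` is a hypothetical name rather than a RISCVISAInfo function.

```python
# Illustrative subset only; the real Xqci group has more members.
XQCI_GROUP = frozenset({"xqcia", "xqciac", "xqcibi"})


def combine_xqci(extensions):
    """Add the `xqci` alias when every member of the group is present."""
    exts = set(extensions)
    if XQCI_GROUP <= exts:
        exts.add("xqci")
    return sorted(exts)
```

With the full group present the alias appears in the resulting list; with any member missing the list is returned unchanged apart from sorting.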
2026-02-12 13:50:57 -08:00
Marcos Maronas
ce94d63f0f
Make OpenCL an OSType rather than an EnvironmentType. (#170297)
OpenCL was added as an `EnvironmentType` in
https://github.com/llvm/llvm-project/pull/78655, but there is no
explanation as to why it was added as such, even after explicitly asking
in the PR
(https://github.com/llvm/llvm-project/pull/78655#issuecomment-2743162853).
This PR makes it an `OSType` instead, which feels more natural, and
updates tests accordingly.

---------

Co-authored-by: Marcos Maronas <marcos.maronas@intel.com>
2026-02-10 18:45:50 +00:00
Mirko Brkušanin
4280f0d241
[AMDGPU] Add dot4 fp8/bf8 instructions for gfx1170 (#180516) 2026-02-10 12:14:49 +01:00
Ruoyu Qiu
da0ad392ff
[llvm-objdump][AVR] Detect AVR architecture from ELF flags for disassembling (#180468)
Reland #174731, resolve cyclic dependency issue.

The use of LLVM_Object in LLVM_Util would cause a cyclic dependency.
Fix the cyclic dependency by reimplementing `getFeatureSetFromEFlag()`.

Original description:

---

This PR updates llvm-objdump to detect the specific AVR architecture
from the ELF header flags when no specific CPU is provided.

Fixes: https://github.com/llvm/llvm-project/issues/146451

Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
2026-02-09 21:10:14 +08:00
Mirko Brkušanin
45b037cf7a
[AMDGPU] Add fp8/bf8 conversion instructions for gfx1170 (#180191) 2026-02-09 13:56:43 +01:00
Ganesh
a362593e0d
[X86] AMD Zen 6 Initial enablement (#179150)
This patch adds initial support for AMD Zen 6 architecture (znver6):

- Added znver6 CPU target recognition in Clang and LLVM
- Updated compiler-rt CPU model detection for znver6
- Added znver6 to target parser and host CPU detection
- Added znver6 to various optimizer tests

znver6 features: FP16, AVXVNNIINT8, AVXNECONVERT, AVXIFMA (without BMM).
2026-02-07 09:38:10 +05:30
Henrik G. Olsson
eff21afae0
Revert "[llvm-objdump][AVR] Detect AVR architecture from ELF flags for disassembling" (#180252)
Reverts llvm/llvm-project#174731 due to introducing a cyclic dependency
when building LLVM with modules enabled: LLVM_Utils -> LLVM_Object ->
LLVM_Utils
2026-02-06 19:00:32 +00:00
Mirko Brkušanin
20b5849e17
[AMDGPU] Define new target gfx1170 (#180185) 2026-02-06 14:38:50 +01:00
Ruoyu Qiu
d005cb2953
[llvm-objdump][AVR] Detect AVR architecture from ELF flags for disassembling (#174731)
This PR updates llvm-objdump to detect the specific AVR architecture
from the ELF header flags when no specific CPU is provided.

Fixes: #146451

---------

Signed-off-by: RuoyuQiu <cabbaken@outlook.com>
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Co-authored-by: qiuruoyu <qiuruoyu@hygon.cn>
2026-02-06 08:58:12 +08:00
Min-Yih Hsu
6441f1c9d5
[RISCV] Introduce a new syntax for processor-specific tuning feature strings (#175063)
This patch proposes a new tuning feature string format that helps users
to build a performance model by "configuring" an existing tune CPU,
along with its scheduling model. For example, this string
```
"sifive-x280:single-element-vec-fp64"
```
takes ``sifive-x280`` as the "base" tune CPU and configures it with
``single-element-vec-fp64``. This gives us a performance model that
looks exactly like that of ``sifive-x280``, except some of the 64-bit
vector floating point instructions now produce only a single element per
cycle due to ``single-element-vec-fp64``.

This string could eventually be used in places like ``-mtune`` at the
frontend. Right now, this patch only implements the parser part, which
is put under the TargetParser library.

The grammar for this string is:
```
    tune-cpu      ::= 'tuning CPU name in lower case'
    directive     ::= "[a-zA-Z0-9_-]+"
    tune-features ::= directive ["," directive]*
```
A *directive* can only _enable_ or _disable_ a certain tuning
feature from the tuning CPU. A **positive directive**, like the
``single-element-vec-fp64`` we just saw, enables an additional tuning
feature in the associated tuning model.

A **negative directive**, on the other hand, removes a certain tuning
feature. For example, ``sifive-x390`` already has the
``single-element-vec-fp64`` feature, and we can use
"sifive-x390:no-single-element-vec-fp64" to create a new performance
model that looks nearly the same as ``sifive-x390`` except that
``single-element-vec-fp64`` is removed. In this case,
``no-single-element-vec-fp64`` is a negative directive.

There are additional restrictions on what we can put in the list of
directives; please refer to the documentation for more details.

Right now, this string only accepts directives that are explicitly
supported by the tune CPU. For example, "sifive-x280:prefer-w-inst" is
not a valid string as ``prefer-w-inst`` is not supported by
``sifive-x280`` at this moment. Vendors of these processors are expected
to maintain the compatibility of their supported directives across
different versions.
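
The string format described above can be modeled with a small parser. This is only a sketch under stated assumptions: the `SUPPORTED` table and `parse_tune_string` are hypothetical names, the table contents are taken from the examples in this message, and the additional restrictions mentioned above are not modeled.

```python
import re

# Matches the `directive` production from the grammar above.
DIRECTIVE_RE = re.compile(r"[a-zA-Z0-9_-]+")

# Assumed table of directives each tune CPU accepts (illustrative only).
SUPPORTED = {
    "sifive-x280": {"single-element-vec-fp64"},
    "sifive-x390": {"single-element-vec-fp64"},
}


def parse_tune_string(text):
    """Split "cpu:dir1,dir2" into (cpu, enabled, disabled)."""
    cpu, sep, features = text.partition(":")
    if cpu not in SUPPORTED:
        raise ValueError(f"unknown tune CPU: {cpu!r}")
    enabled, disabled = [], []
    if not sep:
        return cpu, enabled, disabled  # bare CPU name, no directives
    for directive in features.split(","):
        if not DIRECTIVE_RE.fullmatch(directive):
            raise ValueError(f"malformed directive: {directive!r}")
        negative = directive.startswith("no-")
        name = directive[3:] if negative else directive
        if name not in SUPPORTED[cpu]:
            raise ValueError(f"{name!r} is not supported by {cpu}")
        (disabled if negative else enabled).append(name)
    return cpu, enabled, disabled
```

Under these assumptions, "sifive-x390:no-single-element-vec-fp64" yields one disabled feature, while "sifive-x280:prefer-w-inst" is rejected.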

---------

Co-authored-by: Sam Elliott <aelliott@qti.qualcomm.com>
2026-02-05 15:22:07 -08:00
Ian Anderson
639a8d1f1d
[Triple] Make a target triple "os" for firmware (#176272)
Make a Triple::OSType to support a generic "firmware" OS that isn't bare
metal, but isn't tied to a specific hardware platform like macOS or iOS.
Hook up support for the new OSType in the Darwin toolchain.
2026-02-04 12:15:25 -08:00
Phoebe Wang
2f3935bcee
[X86][APX] Disable PP2/PPX generation on Windows (#178122)
The PUSH2/POP2/PPX instructions for APX require updates to the Microsoft
Windows OS x64 calling convention documented at
https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-170
due to lack of suitable unwinder opcodes that can support APX
PUSH2/POP2/PPX.

This PR disables this support by default for code robustness;
workloads that choose to explicitly enable this support can change the
default behavior by explicitly specifying the flag options that enable
this support e.g. for experimentation or code paths that do not need
unwinder support.
2026-02-02 18:01:44 +08:00
Mariusz Sikora
6de6f7b46b
[AMDGPU] Define gfx1310 target with ELF number 0x50 (#177355)
For now this is identical to gfx1250.

---------

Co-authored-by: Jay Foad <jay.foad@amd.com>
2026-01-22 17:08:38 +01:00
Nikita Popov
4fe4f23e2f
[TargetParser] Fix fp16 feature name for ARM64 Windows feature detection (#176925)
The feature is called fullfp16, not fp16, see:
979db00b9a/llvm/lib/Target/AArch64/AArch64Features.td (L142)
2026-01-22 09:23:10 +01:00
Jonas Paulsson
8eccda10d2
[SystemZ] Add SP alignment to the DataLayout string. (#176041)
Add '-S64' to the SystemZ datalayout string, to avoid overalignment of
stack objects.

Fixes #173402
2026-01-20 09:54:47 -06:00
Ricardo Jesus
9458d2a0f4
[AArch64][Driver] Allow runtime detection to override default features. (#176340)
Currently, most extensions controlled through -march and -mcpu options
are handled in a bitset of AArch64::ExtensionSet. However, extensions
detected at runtime for native compilation are handled in a separate
list of CPU features; once most of the parsing logic has run, the bitset
is converted to a feature list, added after the features detected at
runtime, and the resulting list is used from there on out.

This has the downside that runtime-detected features are unable to
override default CPU extensions. For example, if a CPU enables +aes in
its processor definition, but aes support is not detected at runtime,
the feature currently remains enabled---even though
unsupported---because default features are enabled after the runtime
logic attempts to disable them.

This patch inserts runtime-detected features directly into the extension
set such that these options can take precedence over extensions enabled
by default. The general parsing order for mcpu=native becomes:
1. CPU defaults;
2. Runtime detection;
3. +featureA+nofeatureB options;
4. Other parsing decisions.

This allows features that are found to be unsupported at runtime to be
removed from the list of features supported by targets that enable them
by default.
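
The precedence order listed above can be sketched as plain set operations. This is a simplified model, not the AArch64::ExtensionSet implementation: `resolve_native_features` and its parameters are hypothetical, and the "no" prefix handling assumes no feature name itself begins with "no".

```python
def resolve_native_features(cpu_defaults, detected, undetected, modifiers=""):
    """Apply the documented order: defaults, runtime detection, +feat/+nofeat."""
    enabled = set(cpu_defaults)   # 1. CPU defaults
    enabled |= set(detected)      # 2. runtime detection can add features...
    enabled -= set(undetected)    #    ...or drop defaults the host lacks
    for mod in filter(None, modifiers.split("+")):  # 3. explicit options
        if mod.startswith("no"):
            enabled.discard(mod[2:])
        else:
            enabled.add(mod)
    return enabled
```

In this model a CPU that enables `aes` by default loses it when `aes` is passed in `undetected`, and an explicit `+aes` modifier still wins because it is applied last.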

While at it, this also disables rng if not detected at runtime.
2026-01-20 13:09:17 +00:00
Shilei Tian
39bd4562ba
[Clang][AMDGPU] Handle wavefrontsize32 and wavefrontsize64 features more robustly (#176599)
We should not allow `-wavefrontsize32` and `-wavefrontsize64` to be
specified at the same time. We should also not allow `-wavefrontsize32`
on a target that only supports `wavefrontsize32`, and vice versa.
2026-01-19 18:16:29 -05:00
hev
0a9d480fad
[clang][LoongArch] Add support for LoongArch32 (#172619)
This patch adds support for LoongArch32, as introduced in
la-toolchain-conventions v1.2.

Co-authored-by: Sun Haiyong <sunhaiyong@zdbr.net>
Link:
https://github.com/loongson/la-toolchain-conventions/releases/tag/releases%2Fv1.2
Link:
https://gcc.gnu.org/pipermail/gcc-patches/2025-December/703312.html
2026-01-17 16:27:54 +08:00
Henry Linjamäki
9587892183
[Triple] Add "chipstar" OS components (#170655)
This new component lets the Clang driver select the HIPSPV toolchain.
2026-01-16 08:24:58 -06:00
Shoreshen
26624d51d1
[AMDGPU]Add specific instruction feature for multicast load (#175503) 2026-01-13 09:10:09 +08:00
Philipp Tomsich
43138d6272
[Aarch64] Add support for Ampere1C core (#175442)
This patch adds initial support for the ARMv9.2+ Ampere1C core.
2026-01-12 09:52:23 +01:00
Dan Gohman
597ffbe09d
Rename wasm32-wasi to wasm32-wasip1. (#165345)
This adds code to recognize "wasm32-wasip1", "wasm32-wasip2", and
"wasm32-wasip3" as explicit targets, and adds a deprecation warning when
the "wasm32-wasi" target is used, pointing users to the "wasm32-wasip1"
target.

Fixes #165344.

I'm filing this as a draft PR for now, as I've only just now proposed to
make this change in #165344.
2026-01-10 00:09:06 +00:00
Craig Topper
bafbf2d58d
[RISCV] Add rules for Zca+Zcb+Zcmp+Zcmpt implying Zce. (#175041)
The implication rules need to consider whether F is enabled, as was
done for C in #172860.
2026-01-08 20:07:02 -08:00
Francesco Petrogalli
75d025124a
[RISCV] Add basic Mach-O triple support. (#141682)
Based on a patch written by Tim Northover (https://github.com/TNorthover).
2026-01-05 23:18:48 +00:00
Jerry Zhang Jian
fc69c804db
[RISCV] Implement conditional Zca implies C extension rule (#172860)
This change implements the conditional "Zca implies C" rule to match
GCC's behavior (PR119122) and the RISC-V specification for MISA.C.

The rule is:
  - For RV32:
    - No F and no D: Zca alone implies C
    - F but no D: Zca + Zcf implies C
    - F and D: Zca + Zcf + Zcd implies C
  - For RV64:
    - No D: Zca alone implies C
    - D: Zca + Zcd implies C

This fixes multilib matching issues where LLVM-generated march strings
didn't include the C extension when GCC's multilib configurations
expected it.
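
The rule table above can be written as a single predicate. This is a sketch of the stated rule only, not the RISCVISAInfo code; `zca_implies_c` is a hypothetical name, and the case of D without F is not treated specially because D implies F.

```python
def zca_implies_c(xlen, extensions):
    """Return True when the given extension set should also report C."""
    if xlen not in (32, 64):
        raise ValueError("xlen must be 32 or 64")
    exts = set(extensions)
    required = {"zca"}
    if xlen == 32 and "f" in exts:
        required.add("zcf")  # Zcf only exists on RV32
    if "d" in exts:
        required.add("zcd")
    return required <= exts
```

For example, on RV32 with F and D the predicate holds only when Zca, Zcf, and Zcd are all present, while on RV64 with D it needs just Zca and Zcd.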

Reference:
  - GCC PR119122: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119122
  - RISC-V Zc spec: https://github.com/riscv/riscv-isa-manual/blob/main/src/zc.adoc

Signed-off-by: Jerry Zhang Jian <jerry.zhangjian@sifive.com>
2025-12-20 01:15:03 +08:00
Sudharsan Veeravalli
3bf0a8d6e1
[RISCV] Add Xqci feature flag (#172608)
This patch adds an experimental Xqci feature flag that covers all the
sub-extensions in the Qualcomm uC Extension.
2025-12-18 21:32:49 +05:30
Phoebe Wang
d6c2cd69cb
[X86][APX] Check APXSave before enabling APX features (#172834)
According to APX spec 3.1.4.2, APX instructions can normally execute
only when XCR0[APX_F]=1, where APX_F=19.
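
The check amounts to testing bit 19 of XCR0 before trusting the CPUID feature bit. The sketch below takes the XCR0 value as a plain integer; reading the register itself (via XGETBV) is outside its scope, and the function names are hypothetical.

```python
APX_F_BIT = 19  # position of APX_F in XCR0, as stated above


def apx_state_enabled(xcr0):
    """True when the OS has enabled APX state, i.e. XCR0[APX_F] == 1."""
    return bool((xcr0 >> APX_F_BIT) & 1)


def apx_usable(cpuid_reports_apx, xcr0):
    """Enable APX features only if CPUID reports them and XCR0 allows them."""
    return bool(cpuid_reports_apx) and apx_state_enabled(xcr0)
```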
2025-12-18 22:22:20 +08:00
Zachary Yedidia
2c05ae4b8f
[LFI] Introduce AArch64 LFI Target (#167061)
This PR is the first step towards introducing LFI into LLVM as a new
sub-architecture backend of AArch64. For details, please see the
[RFC](https://discourse.llvm.org/t/rfc-lightweight-fault-isolation-lfi-efficient-native-code-sandboxing-upstream-lfi-target-and-compiler-changes/88380),
which has been approved for AArch64.

This patch creates the `aarch64_lfi` architecture, and marks the
appropriate registers as reserved when it is targeted (`x25`, `x26`,
`x27`, `x28`). It also adds a Clang driver toolchain for targeting LFI,
and updates the compiler-rt CMake to allow builds for the `aarch64_lfi`
target. The patch also includes documentation for LFI and the rewrites
that will be implemented in future patches.

I am planning to split the relevant modifications for LFI into a series
of patches, organized as described below (after this one). Please let me
know if you'd like me to split the changes in a different way, or
provide one big patch.

1. The next patch will introduce the `MCLFIExpander` mechanism for
applying the MC-level rewrites needed by LFI, along with the
`.lfi_expand` and `.lfi_no_expand` assembly directives when targeting
LFI. A preview can be seen on the `lfi-project`
[fork](https://github.com/llvm/llvm-project/compare/main...lfi-project:llvm-project:lfi-patchset/aarch64-pr-2).

2. The following patch will create an `MCLFIExpander` for the AArch64
backend that performs LFI expansions. This patch will contain the
majority of the LFI-specific logic.

3. The final patch will add an optimization to the rewriter that can
eliminate redundant guard instructions that occur within the same basic
block.

We plan to introduce x86-64 support after further discussion and once
the `MCLFIExpander` infrastructure is in place.

Please let me know your feedback, and thank you very much for your help
and guidance in the review process.
2025-12-16 12:51:02 -08:00
dcandler
23f967ada0
[AArch64] Add support for C1 CPUs (#171124)
This patch adds initial support for the Arm v9.3 C1 processors:
* C1-Nano
* C1-Pro
* C1-Premium
* C1-Ultra

For more information on each, see:
https://developer.arm.com/Processors/C1-Nano
https://developer.arm.com/Processors/C1-Pro
https://developer.arm.com/Processors/C1-Premium
https://developer.arm.com/Processors/C1-Ultra

Technical Reference Manual for C1-Nano:
https://developer.arm.com/documentation/107753/latest/

Technical Reference Manual for C1-Pro:
https://developer.arm.com/documentation/107771/latest/

Technical Reference Manual for C1-Premium:
https://developer.arm.com/documentation/109416/latest/

Technical Reference Manual for C1-Ultra:
https://developer.arm.com/documentation/108014/latest/
2025-12-16 14:54:27 +00:00
Mikołaj Piróg
b6f210b215
[X86] Correct CPUID checks for AVX10 (#172350)
This corrects a wrong condition for avx10 (AVX10Ver is always set to
0/1) and corrects how CPUID for avx10 is queried: per ISE table 1-3 we
should query with EAX = 0x24 and ECX = 0x0 -- previously we omitted the
latter.

Issue reported by user Seraphimt here
https://discourse.llvm.org/t/test-for-sys-gethostcpufeatures/89130
2025-12-16 13:59:50 +01:00
Eli Friedman
1b4a74fcdc
[AArch64] Fix typo in 09e57cfd32b0073b63d568835f07251e0d51affb (#172354) 2025-12-15 11:15:59 -08:00
Eli Friedman
09e57cfd32
[AArch64] Extend Windows CPU feature detection with more features. (#171930)
Mostly adding feature flags from the newest SDK.

(Note that in addition to the obvious, this also affects the compiler-rt
SME ABI routines, which rely on FEAT_SME and FEAT_SME2.)
2025-12-15 10:56:17 -08:00
Nikita Popov
b7c0452a9a
[PowerPC][AIX] Specify correct ABI alignment for double (#144673)
Add `f64:32:64` to the data layout for AIX, to indicate that doubles
have a 32-bit ABI alignment and 64-bit preferred alignment.

Clang was already taking this into account, but it was not reflected in
LLVM's data layout.

A notable effect of this change is that `double` loads/stores with 4
byte alignment are no longer considered "unaligned" and avoid the
corresponding unaligned access legalization. I assume that this is
correct/desired for AIX. (The codegen previously already relied on this
in some places related to the call ABI simply by dint of assuming
certain stack locations were 8 byte aligned, even though they were only
actually 4 byte aligned.)
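
To make the `f64:32:64` entry concrete, the sketch below parses a floating-point alignment specifier and checks an address against the ABI alignment. It models only the `f<size>:<abi>[:<pref>]` form with values in bits; the helper names are hypothetical and this is not LLVM's DataLayout parser.

```python
def parse_float_align(spec):
    """Parse "f<size>:<abi>[:<pref>]" into (size_bits, abi_bytes, pref_bytes)."""
    head, abi, *pref = spec.split(":")
    if not head.startswith("f"):
        raise ValueError(f"not a floating-point specifier: {spec!r}")
    size_bits = int(head[1:])
    abi_bits = int(abi)
    # The preferred alignment defaults to the ABI alignment when omitted.
    pref_bits = int(pref[0]) if pref else abi_bits
    return size_bits, abi_bits // 8, pref_bits // 8


def is_abi_aligned(address, spec):
    """True when `address` satisfies the ABI alignment in `spec`."""
    _, abi_bytes, _ = parse_float_align(spec)
    return address % abi_bytes == 0
```

With `f64:32:64`, an address that is a multiple of 4 but not of 8 counts as ABI-aligned, which matches the effect on `double` loads and stores described above.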

Fixes https://github.com/llvm/llvm-project/issues/133599.
2025-12-11 08:57:26 +01:00
Mirko Brkušanin
5759a3a779
[AMDGPU] Add s_wakeup_barrier instruction for gfx1250 (#170501) 2025-12-10 09:45:13 +01:00
Craig Topper
d18cdc99bc
[RISCVInsertVSETVLI] Don't allow getSEW/getLMUL to be called for hasSEWLMULRatioOnly(). NFC (#171554)
Refactor some logic in transferBefore to handle hasSEWLMULRatioOnly()
before calling getSEW/getLMUL.

Update adjustIncoming to use getSEWLMULRatio(). Update the interface of
RISCVVType::getSameRatioLMUL to take the ratio instead of SEW and LMUL.
Update the few other callers to call RISCVVType::getSEWLMULRatio first.
2025-12-09 22:06:15 -08:00
Alexandros Lamprineas
1b82c16fa8
[FMV][AArch64] Allow user to override version priority. (#150267)
Implements https://github.com/ARM-software/acle/pull/404

This allows the user to specify "featA+featB;priority=[1-255]" where
priority=255 means highest priority. If the explicit priority string is
omitted then the priority of "featA+featB" is implied, which is lower
than priority=1.

Internally this gets expanded using special FMV features P0 ... P7 which
can encode up to 256-1 priority levels (excluding all zeros). Those do
not have a corresponding detection bit at pos FEAT_#enum so I made this
field optional in FMVInfo. Also they don't affect the codegen or name
mangling of versioned functions.
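
A rough model of the priority syntax and its 8-bit expansion follows. It is a sketch under assumptions: `parse_version` and `priority_features` are hypothetical names, using 0 to stand for "no explicit priority" is a choice made here, and mapping bit i of the priority to feature Pi is assumed rather than taken from the patch.

```python
def parse_version(text):
    """Parse "featA+featB;priority=N" into (features, priority)."""
    feats, sep, prio = text.partition(";")
    features = feats.split("+")
    priority = 0  # assumed stand-in for "no explicit priority"
    if sep:
        key, _, value = prio.partition("=")
        if key != "priority" or not value.isdecimal():
            raise ValueError(f"malformed priority: {prio!r}")
        priority = int(value)
        if not 1 <= priority <= 255:
            raise ValueError("priority must be in [1, 255]")
    return features, priority


def priority_features(priority):
    """Expand a priority into the P0..P7 features for its set bits."""
    return [f"P{i}" for i in range(8) if priority & (1 << i)]
```

Eight bits with the all-zero pattern excluded give the 255 explicit levels mentioned above.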
2025-12-09 13:31:10 +00:00