373 Commits

Author SHA1 Message Date
quic_hchandel
171d3edd05
[RISCV] Add Qualcomm uC Xqciint (Interrupts) extension (#122256)
This extension adds eleven instructions to accelerate interrupt
servicing.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.

---------

Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>
2025-01-13 16:36:05 +05:30
Steven Perron
d723587686
[HLSL] Explicitly set the SPIR-V version with spv-target-env (#121961)
In DXC, setting the vulkan version automatically sets the target spir-v
version to the maximum spir-v version that the vulkan version must
support. So for Vulkan 1.2, we set the spir-v version to spirv 1.5
because every implementation of Vulkan 1.2 must support spirv 1.5, but
not spir-v 1.6.
2025-01-09 09:39:56 -05:00
Alexandros Lamprineas
8e65940161
[FMV][AArch64] Simplify version selection according to ACLE. (#121921)
Currently, the more features a version has, the higher its priority is.
We are changing ACLE https://github.com/ARM-software/acle/pull/370 as
follows:

"Among any two versions, the higher priority version is determined by
 identifying the highest priority feature that is specified in exactly
 one of the versions, and selecting that version."
2025-01-08 18:59:07 +00:00
quic_hchandel
737d6ca44d
[RISCV] Add Qualcomm uC Xqcicm (Conditional Move) extension (#121752)
The Qualcomm uC Xqcicm extension adds 13 conditional move instructions.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.

---------

Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>
2025-01-07 08:25:00 +05:30
Craig Topper
fd38a95586 [TargetParser] Use StringRef::split that takes a char separator instead of StringRef separator. NFC 2025-01-04 12:31:46 -08:00
Sudharsan Veeravalli
532a2691bc
[RISCV] Add Qualcomm uC Xqcicli (Conditional Load Immediate) extension (#121292)
This extension adds 12 instructions that conditionally load an immediate
value.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.
2025-01-03 06:33:27 +05:30
quic_hchandel
1557eeda73
[RISCV] Add Qualcomm uC Xqciac (Load-Store Adress calculation) extension (#121037)
This extension adds 3 instructions that perform load-store address
calculation.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.

---------

Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
2024-12-29 11:14:12 +05:30
Nick Sarnie
1c16807d0d
[LLVM] Add Intel vendor in Triple (#120250)
We plan to make use of this in SPIR-V-based OpenMP offloading, for which
there is already an initial patch in review.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2024-12-17 12:30:21 -06:00
Phoebe Wang
90968794e2
[X86] Add missing feature USERMSR to DiamondRapids (#120061)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-12-16 20:29:26 +08:00
Sudharsan Veeravalli
668d9688ac
[RISCV] Add Qualcomm uC Xqcilsm (Load Store Multiple) extension (#119823)
This extension adds 6 instructions that can do multi-word load/store.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.
2024-12-14 00:06:58 +05:30
pcc
38eaea73ca
TargetParser: AArch64: Add part numbers for Apple CPUs.
Part numbers taken from:
https://github.com/AsahiLinux/m1n1/blob/main/src/chickens.c

Reviewers: ahmedbougacha, jroelofs

Reviewed By: jroelofs

Pull Request: https://github.com/llvm/llvm-project/pull/119777
2024-12-12 16:52:58 -08:00
quic_hchandel
0614c601b4
[RISCV] Add Qualcomm uC Xqcics(Conditional Select) extension (#119504)
The Qualcomm uC Xqcics extension adds 8 conditional select instructions.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.

---------

Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>
2024-12-12 11:12:09 +05:30
Alexandros Lamprineas
6f013dbced
[AArch64][FMV] Add missing feature dependencies and detect at runtime. (#119231)
i8mm -> simd
fp16fml -> simd
frintts -> fp
bf16 -> simd
sme -> fp16

Approved in ACLE as https://github.com/ARM-software/acle/pull/368
2024-12-11 22:11:32 +00:00
Kinoshita Kotaro
a1197a2ca8
[AArch64] Add initial support for FUJITSU-MONAKA (#118432)
This patch adds initial support for FUJITSU-MONAKA CPU (-mcpu=fujitsu-monaka).

The scheduling model will be corrected in the future.
2024-12-09 09:56:02 +09:00
Phoebe Wang
a63931292b
[X86] Fix typo of gracemont (#118486) 2024-12-03 20:56:52 +08:00
Phoebe Wang
3348b4688f
[X86][compiler-rt] Split CPU names even they have the same subtype (#118237)
Fixes: #118205
2024-12-02 18:51:19 +08:00
Sudharsan Veeravalli
6881c6d2a6
[RISCV] Add Qualcomm uC Xqcia (Arithmetic) extension (#118113)
This extension adds 11 instructions that perform integer arithmetic.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.
2024-12-01 17:06:22 +05:30
Sudharsan Veeravalli
8fcbba82d6
[RISCV] Add Qualcomm uC Xqcisls (Scaled Load Store) extension (#117987)
This extension adds 8 load/store instructions with a scaled index
addressing mode.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.
2024-11-29 10:26:00 +05:30
Alexandros Lamprineas
88c2af80fa
[NFC][clang][FMV][TargetInfo] Refactor API for FMV feature priority. (#116257)
Currently we have code with target hooks in CodeGenModule shared between
X86 and AArch64 for sorting MultiVersionResolverOptions. Those are used
when generating IFunc resolvers for FMV. The RISCV target has different
criteria for sorting, therefore it repeats sorting after calling
CodeGenFunction::EmitMultiVersionResolver.

I am moving the FMV priority logic in TargetInfo, so that it can be
implemented by the TargetParser which then makes it possible to query it
from llvm. Here is an example why this is handy:
https://github.com/llvm/llvm-project/pull/87939
2024-11-28 09:22:05 +00:00
Sudharsan Veeravalli
c4645ffeda
[RISCV] Add Qualcomm uC Xqcicsr (CSR) extension (#117169)
The Qualcomm uC Xqcicsr extension adds 2 instructions that can read and
write CSRs.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.
2024-11-28 12:46:15 +05:30
tangaac
427be07675
[LoongArch] Support amcas[_db].{b/h/w/d} instructions. (#114189)
Two options for clang: -mlamcas & -mno-lamcas.
Enable or disable amcas[_db].{b/h} instructions.
The default is -mno-lamcas.
Only works on LoongArch64.
2024-11-27 17:36:13 +08:00
Matt Arsenault
5615657209
AMDGPU: Builtin & CodeGen support for v_cvt_sr_{bf16|f16}_f32 instructions (#117824)
Co-authored-by: Shilei Tian <shilei.tian@amd.com>
2024-11-26 23:37:05 -05:00
Matt Arsenault
62dc8f3069
AMDGPU: Add builtins & codegen support for bitop3_b{16|32} of gfx950. (#117823)
Co-authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
2024-11-26 23:33:07 -05:00
Matt Arsenault
0f4fcca546
AMDGPU: Builtin & CodeGen support for v_cvt_scalef32_pk32_f32_[fp|bf]6 for gfx950 (#117745)
Co-authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
2024-11-26 19:26:07 -05:00
Matt Arsenault
2b9e947d43
AMDGPU: Builtins & Codegen support for v_cvt_scale_fp4<->f32 for gfx950 (#117743)
OPSEL ASM Syntax for v_cvt_scalef32_pk_f32_fp4 : opsel:[x,y,z]
where, x & y i.e. OPSEL[1 : 0] selects which src_byte to read.

OPSEL ASM Syntax for v_cvt_scalef32_pk_fp4_f32 : opsel:[a,b,c,d]
where, c & d i.e. OPSEL[3 : 2] selects which dst_byte  to write.

Co-authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
2024-11-26 19:20:09 -05:00
Matt Arsenault
815069c701
AMDGPU: Builtins & Codegen support for: v_cvt_scalef32_[f16|f32]_[bf8|fp8] (#117739)
OPSEL[1:0] collectively decide which byte to read
from src input.

Builtin takes additional imm argument which
represents index (with valid values:[0:3]) of src
byte read. Out of bounds checks will added in next
patch.

OPSEL ASM Syntax: opsel:[x,y,z]
where,
    opsel[x] = Inst{11} = src0_modifier{2}
    opsel[y] = Inst{12} = src1_modifier{2}
    opsel[z] = Inst{14} = src0_modifier{3}

Note: Inst{13} i.e. OPSEL[2] is ignored in
asm syntax and opsel[z] is meaningless
for v_cvt_scalef32_f32_{fp|bf}8

Co-authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
2024-11-26 14:54:10 -05:00
tangaac
f4379db496
[LoongArch] Support LA V1.1 feature that div.w[u] and mod.w[u] instructions with inputs not signed-extended. (#116764)
Two options for clang
-mdiv32: Use div.w[u] and mod.w[u] instructions with input not
sign-extended.
-mno-div32: Do not use div.w[u] and mod.w[u] instructions with input not
sign-extended.
The default is -mno-div32.
2024-11-26 21:57:29 +08:00
Matt Arsenault
7fc71f7909
AMDGPU: Support buffer_atomic_pk_add_bf16 for gfx950 (#117599)
Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>
2024-11-25 19:54:50 -08:00
Matt Arsenault
716364ebd6
AMDGPU: Add support for v_dot2c_f32_bf16 instruction for gfx950 (#117598)
The encoding of v_dot2c_f32_bf16 opcode is same as v_mac_f32 in gfx90a,
both from gfx9 series. This required a new decoderNameSpace GFX950_DOT.

Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>
2024-11-25 19:51:01 -08:00
Matt Arsenault
aa7eb5723c
AMDGPU: Add support for v_dot2_f32_bf16 instruction for gfx950 (#117597)
v_dot2_f32_bf16 was added in gfx11 along with v_dot2_f16_f16 and v_dot2_bf16_bf16.
All three instructions were part of Dot9 instructions in the compiler.

This patch will split existing dot9 (v_dot2_f16_f16, v_dot2_bf16_bf16, v_dot2_f32_bf16)
into new dot9 (v_dot2_f16_f16 and v_dot2_bf16_bf16), and dot12 (v_dot2_f32_bf16).

All necessary changes to gfx11 and gfx12 are updated to reflect this change.

Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>
2024-11-25 19:47:48 -08:00
Matt Arsenault
5d650a62a3
AMDGPU: Add support for v_ashr_pk_i8/u8_i32 instructions for gfx950 (#117596)
This patch adds assembly and builtin support for v_ashr_pk_i8/u8_i32
instructions.

Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>
2024-11-25 19:44:47 -08:00
Matt Arsenault
22503a9df1
AMDGPU: Support v_cvt_scalef32_pk32_{bf|f}6_{bf|fp}16 for gfx950 (#117592)
Co-authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
2024-11-25 19:27:01 -08:00
Weining Lu
e70f9e2096 [LoongArch] Remove the added in #116762 2024-11-25 09:33:55 +08:00
Matt Arsenault
d1cca3133a
AMDGPU: Add v_permlane16_swap_b32 and v_permlane32_swap_b32 for gfx950 (#117260)
This was a bit annoying because these introduce a new special case
encoding usage. op_sel is repurposed as a subset of dpp controls,
and is eligible for VOP3->VOP1 shrinking. For some reason fi also
uses an enum value, so we need to convert the raw boolean to 1 instead
of -1.

The 2 registers are swapped, so this has 2 defs. Ideally the builtin
would return a pair, but that's difficult so return a vector instead.
This would make a hypothetical builtin that supports v2f16 directly
uglier.
2024-11-22 20:12:50 -08:00
Pengcheng Wang
875b10f7d0 [RISCV] Support __builtin_cpu_is
We have defined `__riscv_cpu_model` variable in #101449. It contains
`mvendorid`, `marchid` and `mimpid` fields which are read via system
call `sys_riscv_hwprobe`.

We can support `__builtin_cpu_is` via comparing values in compiler's
CPU definitions and `__riscv_cpu_model`.

This depends on #116202.

Reviewers: lenary, BeMg, kito-cheng, preames, lukel97

Reviewed By: lenary

Pull Request: https://github.com/llvm/llvm-project/pull/116231
2024-11-22 22:58:54 +08:00
Pengcheng Wang
4da960b898 [RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get these information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
2024-11-22 22:58:54 +08:00
Mikhail Goncharov
d1dae1e861 Revert "[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)" chain
This reverts commit b36fcf4f493ad9d30455e178076d91be99f3a7d8.
This reverts commit c11b6b1b8af7454b35eef342162dc2cddf54b4de.
This reverts commit 775148f2367600f90d28684549865ee9ea2f11be.

multiple bot build breakages, e.g. https://lab.llvm.org/buildbot/#/builders/3/builds/8076
2024-11-22 14:09:13 +01:00
Wang Pengcheng
b36fcf4f49 [RISCV] Rename variable CPUModel to Model
The variable name can't be the same as the struct name or we will
have "error: declaration of ‘llvm::RISCV::CPUModel llvm::RISCV::CPUInfo::CPUModel’
changes meaning of ‘CPUModel’ [-fpermissive]".
2024-11-22 20:12:28 +08:00
Pengcheng Wang
c11b6b1b8a
[RISCV] Support __builtin_cpu_is
We have defined `__riscv_cpu_model` variable in #101449. It contains
`mvendorid`, `marchid` and `mimpid` fields which are read via system
call `sys_riscv_hwprobe`.

We can support `__builtin_cpu_is` via comparing values in compiler's
CPU definitions and `__riscv_cpu_model`.

This depends on #116202.

Reviewers: lenary, BeMg, kito-cheng, preames, lukel97

Reviewed By: lenary

Pull Request: https://github.com/llvm/llvm-project/pull/116231
2024-11-22 20:04:57 +08:00
Pengcheng Wang
775148f236
[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get these information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
2024-11-22 19:54:45 +08:00
tangaac
1d4602070f
[LoongArch] Support LA V1.1 feature ld-seq-sa that don't generate dbar 0x700. (#116762)
Two options for clang
-mld-seq-sa: Do not generate load-load barrier instructions (dbar 0x700)
-mno-ld-seq-sa: Generate load-load barrier instructions (dbar 0x700)
The default is -mno-ld-seq-sa
2024-11-22 17:34:15 +08:00
Joseph Huber
7672216ed7
[LLVM] Add environment triple for 'llvm' (#117218)
Summary:
The LLVM C library is an in-development environment for running
executables on various systems. Similarly how we have `-gnu` to indicate
that we are using a GNU toolchain we should support `-llvm` to indicate
the LLVM C library. This patch only adds the basic support for the
triple and does not do any necessary clang changes to handle compiling
with it.

Fixes https://github.com/llvm/llvm-project/issues/117251
2024-11-21 17:30:18 -06:00
Kazu Hirata
4d6d56315d
[TargetParser] Remove unused includes (NFC) (#116929)
Identified with misc-include-cleaner.
2024-11-20 06:52:45 -08:00
Matt Arsenault
ca1b35a6c8
AMDGPU: Add v_prng_b32 instruction for gfx950 (#116310)
Rand num instruction for stochastic rounding.
2024-11-18 10:54:54 -08:00
Matt Arsenault
a6fc489bb7
AMDGPU: Add gfx950 subtarget definitions (#116307)
Mostly a stub, but adds some baseline tests and
tests for removed instructions.
2024-11-18 10:41:14 -08:00
Freddy Ye
97836bed63
Reland "[X86] Support -march=diamondrapids (#113881)" (#116564)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-11-18 10:40:32 +08:00
Freddy Ye
90e92239bd
Revert "[X86] Support -march=diamondrapids (#113881)" (#116563)
This reverts commit 826b845c9e97448395431be3e4e5da585bd98c5e.
2024-11-18 08:45:28 +08:00
Freddy Ye
826b845c9e
[X86] Support -march=diamondrapids (#113881)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-11-18 08:31:17 +08:00
SpencerAbson
748b028540
[AArch64] Make +sve2-aes an alias of +sve2+sve-aes (#116026)
This patch essentially re-lands
https://github.com/llvm/llvm-project/pull/114293 with the following
fixups

- `nosve2-aes` should disable the backend feature `FeatureSVEAES` such
that the set of existing instructions that this removes is unchanged.
- FMV dependencies now use the autogenerated `ExtensionDepencies`
structure (since https://github.com/llvm/llvm-project/pull/113281) so we
do not require the change to `AArch64FMV.td`.
2024-11-14 11:04:04 +00:00
tangaac
2283d50447
[LoongArch] add la v1.1 features for sys::getHostCPUFeatures (#115832)
Two features (i.e. `frecipe` and `lam-bh`) are added to
`sys.getHostCPUFeatures`. More features will be added in future.

In addition, this patch adds the features returned by
`sys.getHostCPUFeature` when `-march=native`.
2024-11-14 11:25:32 +08:00