2332 Commits

Author SHA1 Message Date
Brox Chen
9830156f62
[AMDGPU][True16][MC] add true16 and fake16 test file for vop3 instructions (#109695)
Duplicate the MC tests and update the flags to create true16 and fake16
test files for VOP3 instructions. This prepares for the upcoming VOP3
true16 changes.
2024-09-24 15:01:15 -04:00
Scott Egerton
396f677514
[AMDGPU] Remove unused VGPRSingleUseHintInsts feature (#109769) 2024-09-24 10:58:00 +01:00
Phoebe Wang
70529b24a3
[X86][APX] Do not emit {evex} prefix for memory variant (#109759)
This was mistakenly changed by #109579; the change doesn't match other
EVEX decoding.
2024-09-24 16:46:56 +08:00
Jun Wang
f6a8eb98b1
[AMDGPU][MC] Disallow null as saddr in flat instructions (#101730)
Some flat instructions have an saddr operand. When 'null' is provided as
saddr, it may have the same encoding as another instruction. For
example, the instructions 'global_atomic_add v1, v2, null' and
'global_atomic_add v[1:2], v2, off' have the same encoding. This patch
disallows having null as saddr.
2024-09-24 11:08:41 +04:00
Phoebe Wang
0d334d83a4
[X86][APX] Fix wrong encoding of promoted KMOV instructions due to missing NoCD8 (#109579)
Promoted KMOV* was encoded with CD8 incorrectly, see
https://godbolt.org/z/cax513hG1
2024-09-23 09:41:43 +08:00
Brox Chen
0570ba6b05
[AMDGPU][True16][MC] true16 for more VOP1 instructions (#108412)
Support true16 and fake16 format for more VOP1 instructions in MC

This patch updates the true16 and fake16 vop_profile for the following
instructions and updates the asm/dasm tests:
V_CVT_F16_U16
V_CVT_F16_I16
V_CVT_U16_F16
V_CVT_I16_F16
V_CVT_NORM_U16_F16
V_CVT_NORM_I16_F16
V_FREXP_EXP_I16_F16


Since this patch introduces fake16 instructions for V_CVT_F16_U16, it
addresses an issue in the fix-sgprs-copy-f16 test, which was brought up here:
https://github.com/llvm/llvm-project/pull/104510#discussion_r1742499668
2024-09-20 11:11:28 -04:00
Mahesh-Attarde
311e4e3245
[X86][AVX10.2] Support AVX10.2 MOVZXC new Instructions. (#108537)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

Chapter 14 INTEL® AVX10 ZERO-EXTENDING PARTIAL VECTOR COPY INSTRUCTIONS

---------

Co-authored-by: mattarde <mattarde@intel.com>
2024-09-18 21:01:51 +08:00
Mahesh-Attarde
f5ad9e1ca5
[X86][AVX10.2] Support AVX10.2-COMEF new instructions. (#108063)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

Chapter 8  AVX10 COMPARE SCALAR FP WITH ENHANCED EFLAGS INSTRUCTIONS

---------

Co-authored-by: mattarde <mattarde@intel.com>
2024-09-18 17:55:29 +08:00
Heejin Ahn
97ae505753
[WebAssembly] Support disassembler for try_table (#108800)
This adds disassembler support for the new `try_table` instruction.
It also adds tests for `throw` and `throw_ref`.

Currently, tag expressions are not supported for the `throw` or `try_table`
instructions when instructions are parsed from the disassembler. Not sure
whether there is a way to support them. (This is not new with the new EH
proposal; they were not supported for the legacy EH either.)
2024-09-16 20:08:37 -07:00
Malay Sanghi
a409ebc1fc
[X86][AVX10.2] Support AVX10.2-SATCVT-DS new instructions. (#102592)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-09-12 22:45:20 +08:00
Brox Chen
35e27c0ee5
[AMDGPU][True16][MC] 16bit vsrc and vdst support in MC (#104510)
This large patch includes the MC-level support for V_CVT_F16_F32,
V_CVT_F32_F16 and V_LDEXP_F16 in true16 format.

This patch includes the asm/disasm changes to encode/decode the 16-bit
vsrc, vdst and src modifiers for the vop and dpp formats. It is a
dependency for many 16-bit instructions, but only three instructions
are updated to make it easier to review.

Another patch will support these three instructions at the codegen
level; this patch just replaces these two instructions with their
fake16 format.
2024-09-11 10:48:11 -04:00
Freddy Ye
83ad644afa
[X86][AVX10.2] Support AVX10.2-BF16 new instructions. (#101603)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-09-04 08:13:24 +08:00
Brox Chen
74938ab84d
[AMDGPU][True16][MC] add true16/fake16 flag to gfx12 dasm tests (#106469)
Add the true16/fake16 flags to the gfx12 dasm tests, including vop1, vop1_dpp,
vop3_from_vop1 and vop3_from_vop1_dpp. This is a test-only change.
2024-08-29 15:21:25 -04:00
Freddy Ye
36b7c30b29
[X86, MC] Recognize OSIZE=64b when EVEX.W = 1, EVEX.pp = 01 (#103816)
In the legacy space, if both the 66 prefix and REX.W=1 are present, the
REX.W=1 takes precedence and makes OSIZE=64b. EVEX map 4 inherits this
convention, with EVEX.pp=01 and EVEX.W playing the roles of the 66
prefix and REX.W. So if EVEX.pp=00, the OSIZE can only be 64b or 32b,
depending on whether EVEX.W=1 or not. But if EVEX.pp=01, then OSIZE is
either 64b or 16b depending on whether EVEX.W=1 or not.
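
The rule described above can be sketched as a small decision function. This is an illustrative sketch only, not the LLVM disassembler code: the name `operand_size_bits` and its parameters are hypothetical, and it covers only the EVEX.pp=00 and EVEX.pp=01 cases mentioned in the message.

```python
def operand_size_bits(evex_w, evex_pp):
    """Return OSIZE in bits for an EVEX map 4 instruction (hypothetical helper)."""
    if evex_w not in (0, 1):
        raise ValueError("EVEX.W must be 0 or 1")
    if evex_pp not in (0b00, 0b01):
        raise ValueError("only EVEX.pp=00 and EVEX.pp=01 are covered here")
    # EVEX.W=1 takes precedence, just as REX.W=1 wins over the 66 prefix.
    if evex_w == 1:
        return 64
    # Without EVEX.W, pp=01 (the 66-prefix role) selects 16 bits, else 32 bits.
    return 16 if evex_pp == 0b01 else 32
```

With this sketch, `operand_size_bits(1, 0b01)` gives 64, which is the case the commit title calls out.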
2024-08-29 18:22:26 +08:00
Tomas Matheson
68e21e16d2
[AArch64] Add support for ACTLR_EL12 system register (#105497)
Documentation can be found here:

https://developer.arm.com/documentation/ddi0601/2024-06/AArch64-Registers/ACTLR-EL1--Auxiliary-Control-Register--EL1-
2024-08-21 15:15:49 +01:00
Freddy Ye
7c4cadfc43
[X86][AVX10.2] Support AVX10.2-CONVERT new instructions. (#101600)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-21 15:44:06 +08:00
Tomas Matheson
362142c4bb
[AArch64] Add a check for invalid default features (#104435)
This adds a check that all ExtensionWithMArch which are marked as
implied features for an architecture are also present in the list of
default features. It doesn't make sense to have something mandatory but
not on by default.

There were a number of existing cases that violated this rule, and some
changes to which features are mandatory (indicated by the Implies
field).

This resulted in a bug where if a feature was marked as `Implies` but
was not added to `DefaultExt`, then for `-march=base_arch+nofeat` the
Driver would consider `feat` to have never been added and therefore
would do nothing to disable it (no `-target-feature -feat` would be
added, but the backend would enable the feature by default because of
`Implies`). See
clang/test/Driver/aarch64-negative-modifiers-for-default-features.c.

Note that the processor definitions do not respect the architecture
DefaultExts. These apply only when specifying `-march=<some architecture
version>`. So when a feature is moved from `Implies` to `DefaultExts` on
the Architecture definition, the feature needs to be added to all
processor definitions (that are based on that architecture) in order to
preserve the existing behaviour. I have checked the TRMs for many cases
(see specific commit messages) but in other cases I have just kept the
current behaviour and not tried to fix it.
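
The kind of check described above amounts to a subset test between two feature sets. The following is a minimal sketch under stated assumptions, not the actual LLVM check: the function name and the feature names `feat_a` and `feat_b` are hypothetical placeholders.

```python
def implied_but_not_default(implies, default_exts):
    """Return implied features missing from the defaults, sorted by name.

    Both arguments are plain sets of feature-name strings (an assumption
    made for this sketch). An empty result means the rule is satisfied.
    """
    return sorted(set(implies) - set(default_exts))


# Hypothetical architecture definition: feat_b is mandatory but not on by default.
violations = implied_but_not_default({"feat_a", "feat_b"}, {"feat_a"})
```

Here `violations` contains `feat_b`, the situation in which `-march=base_arch+nofeat` would fail to disable the feature as described in the message.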
2024-08-17 13:36:40 +01:00
Freddy Ye
372842b30f
[X86][MC] Remove CMPCCXADD's CondCode flavor. (#103898)
To align with gas's latest changes.
Related gas patch:
https://sourceware.org/pipermail/binutils/2024-May/134360.html
2024-08-15 14:18:59 +08:00
Brox Chen
afd42fb303
[AMDGPU][True16][CodeGen] Support AND/OR/XOR and LDEXP True16 format (#102620)
Support AND/OR/XOR true16 and LDEXP true/fake16 format.

These instructions were previously implemented with the fake16 profile.
This fixes the implementation.

Added an RA hint so that, when a 16-bit register is used in a 32-bit
instruction, the register is used directly without an extra 16-bit
move.

---------

Co-authored-by: guochen2 <guochen2@amd.com>
2024-08-13 12:23:39 -04:00
Freddy Ye
bfbd4cc88c
[X86,MC,test] Add enc/dec tests for ccmpbe (#102883)
This is also a pre-commit test for #102284.
2024-08-13 08:05:48 +08:00
Jonathan Thackray
a918ffefb1
[AArch64] Implement TRBMPAM_EL1 system register (#102485)
Implement TRBMPAM_EL1 system register, which was noticed to be missing
2024-08-09 11:59:40 +01:00
Freddy Ye
80721e0d6c
[X86][AVX10.2] Support AVX10.2-SATCVT new instructions. (#101599)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-06 19:37:49 +08:00
Phoebe Wang
b0329206db
[X86][AVX10.2] Support AVX10.2 VNNI FP16/INT8/INT16 new instructions (#101783)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-05 18:57:42 +08:00
Freddy Ye
3d5cc7e1e6
[X86][AVX10.2] Support AVX10.2-MINMAX new instructions. (#101598)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-05 11:06:02 +08:00
Phoebe Wang
0dba5381d8
[X86][AVX10.2] Support YMM rounding new instructions (#101825)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-04 21:05:45 +08:00
Phoebe Wang
259ca9ee9c
Reland "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452)" (#101616)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-03 09:26:07 +08:00
Phoebe Wang
2e0588d5e1
Revert "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions" (#101612)
Reverts llvm/llvm-project#101452

Several buildbots failed. Revert first.
2024-08-02 13:04:10 +08:00
Phoebe Wang
10bad2c8d7
[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-02 12:10:50 +08:00
Brox Chen
ab91371653
[AMDGPU][True16][MC] Support v_swap_b16. (#100442)
Support the V_SWAP_B16 true16 encoding in asm/disasm for GFX11/12.

Co-authored-by: guochen2 <guochen2@amd.com>
2024-08-01 22:08:45 +04:00
Ivan Kosarev
e1052faaf8
[AMDGPU][MC][NFC] Drop remaining -wavesize32/64 attributes in tests. (#100339)
Those are not needed now that
<https://github.com/llvm/llvm-project/pull/98400> is submitted.
2024-07-24 13:41:00 +01:00
Ivan Kosarev
09fec46882
[AMDGPU][RFC] Combine asm and disasm tests. (#92895)
Eliminates the need to replicate the same instructions in MC and
MC/Disassembler tests and synchronize changes in them. Also highlights
differences between disassembled, reassembled and original instructions.
2024-07-19 17:05:26 +01:00
Stanislav Mekhanoshin
bccd1190c9
[AMDGPU] Get rid of the +wavefrontsizeXX,-wavefrontsizeXX in MC tests (#99001)
Turning the feature off and on is no longer needed in most cases. This is
NFC, tests only.
2024-07-16 12:44:43 -07:00
Stanislav Mekhanoshin
b132dd41eb
[AMDGPU] Remove wavefrontsize feature from GFX10+ (#98400)
A processor definition should not include a default feature which may be
switched off by a different wave size. This removes the need to write
-mattr=-wavefrontsize32,+wavefrontsize64 in tests.
2024-07-16 01:02:25 -07:00
Dominik Steenken
9f4a25e2a7
Add extended mnemonics (#97571)
This PR adds a number of thus-far missing extended mnemonics to the
assembler and disassembler for SystemZ.

The following mnemonics have been added and are supported for the
assembler and disassembler:

- `NOP(R)?`
- `LFI`
- `RISBG(N)?Z`

The following mnemonics have been added and are supported for the
assembler only:

- `JC(TH)?`
- `LLG(F|H)I`
- `NOT(G)?R`
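
Reading the patterns in the two lists above as regular expressions (an assumption, though the notation suggests it), the concrete mnemonics they cover can be checked with a short sketch. The list names and the helper `matches_any` are hypothetical and not part of the SystemZ backend.

```python
import re

# Patterns copied from the commit message; grouping them this way is illustrative.
ASM_AND_DISASM = [r"NOP(R)?", r"LFI", r"RISBG(N)?Z"]
ASM_ONLY = [r"JC(TH)?", r"LLG(F|H)I", r"NOT(G)?R"]


def matches_any(mnemonic, patterns):
    """Return True if the whole mnemonic matches at least one pattern."""
    return any(re.fullmatch(p, mnemonic) for p in patterns)
```

For example, `matches_any("RISBGNZ", ASM_AND_DISASM)` and `matches_any("LLGHI", ASM_ONLY)` are both true, while `matches_any("LLGI", ASM_ONLY)` is false because the `(F|H)` group is not optional.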
2024-07-15 10:39:23 +02:00
Jon Roelofs
c66e1d6f34
[llvm][AArch64] apple-m4 is armv9.2-a (#98267)
But since SVE and friends have been added to the default extensions
list, and every CPU was opted into those extensions by default, we
couldn't correctly announce its architectural version to the backend.
Additionally, we move FEAT_MEC from llvm's "required" list for v9.0 to the
optional list for v9.2, as the spec considers it optional, and M4 does
not implement it. Similarly, this fixes up several bugs w.r.t. FEAT_RME.

As a drive-by, I noticed that saphira did not have an
AArch64CPUTestParams entry, and thus added one.
2024-07-11 07:46:51 -07:00
Jack Styles
7868033d2e
[AArch64] Update AUTIxSPPC and RETAxSPPC instructions for registers (#98303)
As of the 2024.06 Arm Architecture release, the register variants of the
AUTIxSPPC and RETAxSPPC instructions have been updated to be explicitly
different to the immediate variant. The instructions now follow the
format AUTIxSPPCR and RETAxSPPCR for the register variants, with the
immediate variants keeping their current form.

The specs can be found at the following locations:
AUTIASPPCR:
https://developer.arm.com/documentation/ddi0602/2024-06/Base-Instructions/AUTIASPPCR--Authenticate-return-address-using-key-A--using-a-register-?lang=en
AUTIBSPPCR:
https://developer.arm.com/documentation/ddi0602/2024-06/Base-Instructions/AUTIBSPPCR--Authenticate-return-address-using-key-B--using-a-register-?lang=en
RETAASPPCR and RETABSPPCR:
https://developer.arm.com/documentation/ddi0602/2024-06/Base-Instructions/RETAASPPCR--RETABSPPCR--Return-from-subroutine--with-enhanced-pointer-authentication-return-using-a-register-?lang=en
2024-07-11 07:58:20 +01:00
Ivan Kosarev
2b6e3f3f90
[AMDGPU] Fix MC/Disassembler/AMDGPU/decode-err.txt. (#96621)
It fails downstream now that
https://github.com/llvm/llvm-project/pull/95237 removed flushing the
output stream on printing every instruction.
2024-06-27 15:32:24 +01:00
Mariusz Sikora
fbf0ca6418
[AMDGPU][GFX12] Add support for new block ls instructions (#96273)
Add MC layer support for new instructions:

GLOBAL_LOAD_BLOCK
GLOBAL_STORE_BLOCK
SCRATCH_LOAD_BLOCK
SCRATCH_STORE_BLOCK

Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>
2024-06-21 20:12:18 +02:00
Jay Foad
70748dcbe0 [AMDGPU] Fix GFX90A/GFX940 check prefix typos 2024-06-20 10:14:42 +01:00
Jay Foad
81e8f01b55 [AMDGPU] Fix typo "GXF" in check prefix 2024-06-20 10:01:13 +01:00
Koakuma
edd2d7c558
[NFC][SPARC] Fix typos and style mismatches
Fix style errors accidentally introduced in PRs #87259 and #94245.

Reviewers: rorth, jrtc27, brad0, s-barannikov

Reviewed By: s-barannikov

Pull Request: https://github.com/llvm/llvm-project/pull/96019
2024-06-19 21:44:48 +07:00
Ivan Kosarev
162386693f
[AMDGPU][MC] Support UC_VERSION_* constants. (#95618)
Our other tools support them, so we want them in LLVM
assembler/disassembler too.
2024-06-18 15:44:14 +01:00
Shengchen Kan
91a55cf5ad [X86][MC] Not decode 0xf3 as rep prefix if it's right before REX2
This fixes https://github.com/llvm/llvm-project/issues/95412
2024-06-13 23:26:33 +08:00
Shengchen Kan
3d35b94e3a [X86][test] Pre-commit tests for https://github.com/llvm/llvm-project/issues/95412 2024-06-13 22:54:22 +08:00
Ivan Kosarev
9890f94343
[AMDGPU][GFX12] Support disassembling MUBUF instructions with arbitrary FORMAT values. (#95243)
Some tools generate such instructions with the FORMAT field set to 0,
which corresponds to buf_fmt_invalid, but that should not prevent them
from being recognised on decoding.
2024-06-13 08:16:06 +01:00
Koakuma
41f2ea0b0f
[SPARC][IAS] Add support for prefetcha instruction
This adds support for `prefetcha` instruction for prefetching from
alternate address spaces.

Reviewers: jrtc27, brad0, rorth, s-barannikov

Reviewed By: s-barannikov

Pull Request: https://github.com/llvm/llvm-project/pull/94250
2024-06-09 22:13:31 +07:00
Koakuma
2388129d48
[SPARC][IAS] Add named prefetch tag constants
This adds named tag constants (such as `#one_write` and `#one_read`)
for the prefetch instruction.

Reviewers: jrtc27, rorth, brad0, s-barannikov

Reviewed By: s-barannikov

Pull Request: https://github.com/llvm/llvm-project/pull/94249
2024-06-09 22:09:14 +07:00
Jonathan Thackray
917afa8832
[ARM] Add support for Cortex-R52+ (#94633)
Cortex-R52+ is an Armv8-R AArch32 CPU.

Technical Reference Manual for Cortex-R52+:
   https://developer.arm.com/documentation/102199/latest/
2024-06-07 11:03:32 +01:00
Jun Wang
6e7b45c55b
[AMDGPU][MC] Support tfe operand in image_atomic instructions (#92469)
Currently, if an image_atomic instruction has the 'tfe' operand, the
llvm-mc assembler generally rejects it. The only exception is when
dmask is 0x1 and the instruction is not image_atomic_cmpswap (e.g.,
image_atomic_add v[5:6], v252, s[8:15] dmask:0x1 tfe). This patch fixes
this problem and allows tfe to be specified in image_atomic
instructions.

---------

Co-authored-by: Jun Wang <jun.wang7@amd.com>
2024-05-29 15:55:58 -07:00
Jay Foad
fbe98da623 [AMDGPU] Fix filecheck annotation typos
Co-authored-by: klensy <nightouser@gmail.com>
2024-05-29 15:01:48 +01:00