llvm-project

Author	SHA1	Message	Date
Freddy Ye	f4509cf284	[X86][MC] Support enc/dec for SETZUCC and promoted SETCC. (#86473 ) apx-spec: https://cdrdv2.intel.com/v1/dl/getContent/784266 apx-syntax-recommendation: https://cdrdv2.intel.com/v1/dl/getContent/817241	2024-04-11 10:18:29 +08:00
Peter Lafreniere	614a578034	[M68k] Add support for bitwise NOT instruction (#88049 ) Currently the bitwise NOT instruction is not recognized. Add support for using NOT on data registers. This is a partial implementation that puts NOT at the same level of support as NEG currently enjoys. Using not rather than eori cuts the length of the encoded instruction to a half or a third, leading to a reduction of 4-10 cycles per instruction on the original 68000. This change includes tests for both bitwise and arithmetic negation.	2024-04-09 09:07:26 -07:00
Joe Nash	e29228efae	[AMDGPU][MC] Allow VOP3C dpp src1 to be imm or SGPR (#87418 ) Allows src1 of VOP3 encoded VOPC to be an SGPR or inline immediate on GFX1150Plus. The w32 and w64 _e64_dpp assembler-only real instructions were unused, and were erroneously constructed in a way that broke parsing of the new instructions. They are removed. This patch is a follow-up to PR https://github.com/llvm/llvm-project/pull/87382	2024-04-03 14:51:27 -04:00
Joe Nash	6a13bbf92f	[AMDGPU][MC] Enables sgpr or imm src1 for float VOP3 DPP, but excluding VOPC. (#87382 ) Fixes support on GFX1150 and GFX12 where src1 of e64_dpp instructions should allow sgpr and imm operands. PR #67461 added support for this with int operands, but it was missing a piece for float. Changing VOPC e64_dpp will be in a different patch because there is a bug preventing that change.	2024-04-03 11:34:12 -04:00
Cinhi Young	8b859c6e4a	[MIPS] Fix the opcode of max.fmt and mina.fmt (#85609 ) - The opcodes of mina.fmt and max.fmt are documented incorrectly; object code compiled from the same assembly with LLVM behaves differently from code compiled with GCC and Binutils. - Modify the opcodes to match Binutils. The actual opcodes are as follows, for func bits {5,3} = 010: bits {2,0} = 100 is min, 101 is mina, 110 is max, 111 is maxa.	2024-04-03 10:14:02 +08:00
Freddy Ye	db7d243978	[X86][MC] Support enc/dec for IMULZU. (#86653 ) apx-spec: https://cdrdv2.intel.com/v1/dl/getContent/784266 apx-syntax-recommendation: https://cdrdv2.intel.com/v1/dl/getContent/817241	2024-03-29 15:52:41 +08:00
Joe Nash	44278f2326	[AMDGPU][MC] Fix GFX12 check line typo and move test NFC. Fix CHECK lines that seem to have a copy paste error. Move the test that was formerly in gfx12_dasm_vinterp.txt (see #85949).	2024-03-21 10:46:07 -04:00
Joe Nash	d1f182c895	[AMDGPU][MC][True16] Rename and combine VINTERP MC tests (#85949 ) NFC. gfx11_asm_vinterp.s already contained GFX12 run lines. Rename the assembler and disassembler tests to be sorted based on real16 or fake16 instead of gfxip. Note, both GFX11 and GFX12 currently only have fake16 (fake16 in encoding, but not by name) upstream, so that is why the test files have a -fake16 suffix. One test input is changed, and that is the disassembler test for unsupported bits in the instruction. It is now an input that is valid on both GFX11 and GFX12. This was necessary because the size of the opcode field changed.	2024-03-21 10:42:39 -04:00
XinWang10	7b766a6f50	[X86] Support APX CMOV/CFCMOV instructions (#82592 ) This patch supports ND CMOV instructions and CFCMOV instructions. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-03-17 20:18:56 +08:00
Stanislav Mekhanoshin	0b0e52836d	[AMDGPU] Fix GFX11 sendmsg codes (#85299 ) The code MSG_RTN_GET_TBA_TO_PC was missing, and the next code is off by 1 as a result.	2024-03-15 09:46:58 -07:00
Jay Foad	36dece0013	[AMDGPU] Add missing GFX10 buffer format d16 hi instructions (#84809 )	2024-03-12 08:20:08 +00:00
Shengchen Kan	71590e7d1e	[X86][test] Add missing enc/dec tests for CTEST These tests were accidentally missed in #83863	2024-03-12 13:11:16 +08:00
Jay Foad	212604698c	[AMDGPU] Add missing tests for GFX10 (t)buffer format d16 instructions (#84789 )	2024-03-11 18:25:49 +00:00
Sivan Shani	5e688f0dbd	[llvm][arm] add T1 and T2 assembly options for vlldm and vlstm Re-land 634b0243b8f7acc85af4f16b70e91d86ded4dc83. T1 allows for an optional register list, which must be {d0-d15}. T2 defines a mandatory register list, which must be {d0-d31}. The requirements for T1/T2 are as follows. Require: v8-M.Main and secure state (T1); v8.1-M.Main and secure state (T2). 16 D Regs: valid (T1), valid (T2). 32 D Regs: UNDEFINED (T1), valid (T2). No D Regs: NOP (T1), NOP (T2).	2024-03-11 14:27:28 +00:00
Shengchen Kan	1ca8092e87	[X86][MC] Support encoding/decoding for APX CCMP/CTEST (#83863 ) APX assembly syntax recommendations: https://cdrdv2.intel.com/v1/dl/getContent/817241 NOTE: The change in llvm/tools/llvm-exegesis/lib/X86/Target.cpp is for the test LLVM :: tools/llvm-exegesis/X86/latency/latency-SETCCr-cond-codes-sweep.s. For `SETcc`, llvm-exegesis randomly chooses one other instruction to test with `SETcc`. After selecting the instruction, llvm-exegesis checks whether the operand is initialized and valid; if not, `randomizeTargetMCOperand` chooses a value for the invalid operand. It lacked support for condition code operands, which caused the flaky failure once `CCMP` was supported. llvm-exegesis can choose `CCMP` without the ccmp feature being specified because it uses `MCSubtarget` and only 16/32/64 bit is considered. llvm-exegesis doesn't choose other instructions because of the requirement in `hasAliasingRegistersThrough`: the instruction should use a GPR (defined by `SETcc`) and define `EFLAGS` (used by `SETcc`).	2024-03-08 20:54:33 +08:00
Joe Nash	3b1512c477	[AMDGPU] Make gfx11 vop2 disassembler tests use strict-whitespace NFC. Adds -strict-whitespace to RUN lines and adjusts CHECK line space padding accordingly. See also (#84078)	2024-03-06 16:07:30 -05:00
Joe Nash	f448b8ec03	[AMDGPU] Make gfx11 vop1 disassembler tests use strict-whitespace (#84078 ) NFC. The whitespace needs to be consistently formatted in some manner. Might as well use -strict-whitespace as the standard. Adds -strict-whitespace to RUN lines and adjust CHECK line space padding accordingly. Also test REAL16 and FAKE16 CHECK lines with wave64.	2024-03-06 10:11:10 -05:00
Ivan Kosarev	a888f5e4d7	[AMDGPU][NFC] Update tests to use -triple= instead of -arch=. (#84153 )	2024-03-06 12:44:19 +00:00
Tomas Matheson	03420f570e	Revert "[llvm][arm] add T1 and T2 assembly options for vlldm and vlstm (#83116 )" This reverts commit 634b0243b8f7acc85af4f16b70e91d86ded4dc83. Failing EXPENSIVE_CHECKS builds with "undefined physical register".	2024-02-29 09:48:29 +00:00
XinWang10	ffa48f0c94	[X86][MC] Teach disassembler to recognize apx instructions which ignores W bit (#82747 ) Extended VMX instructions and 8 bit apx extended instructions don't need W bit, they are marked as W ignored in spec. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-02-29 11:44:41 +08:00
SivanShani-Arm	634b0243b8	[llvm][arm] add T1 and T2 assembly options for vlldm and vlstm (#83116 ) T1 allows for an optional register list, which must be {d0-d15}. T2 defines a mandatory register list, which must be {d0-d31}. The requirements for T1/T2 are as follows. Require: v8-M.Main and secure state (T1); v8.1-M.Main and secure state (T2). 16 D Regs: valid (T1), valid (T2). 32 D Regs: UNDEFINED (T1), valid (T2). No D Regs: NOP (T1), NOP (T2).	2024-02-28 17:02:51 +00:00
Timothy Herchen	ae91a427ac	[X86][MC] Reject out-of-range control and debug registers encoded with APX (#82584 ) Fixes #82557. APX specification states that the high bits found in REX2 used to encode GPRs can also be used to encode control and debug registers, although all of them will #UD. Therefore, when disassembling we reject attempts to create control or debug registers with a value of 16 or more. See page 22 of the [specification](https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html): > Note that the R, X and B register identifiers can also address non-GPR register types, such as vector registers, control registers and debug registers. When any of them does, the highest-order bits REX2.R4, REX2.X4 or REX2.B4 are generally ignored, except when the register being addressed is a control or debug register. [...] The exception is that REX2.R4 and REX2.R3 [sic] are not ignored when the R register identifier addresses a control or debug register. Furthermore, if any attempt is made to access a non-existent control register (CR) or debug register (DR) using the REX2 prefix and one of the following instructions: “MOV CR, r64”, “MOV r64, CR”, “MOV DR, r64”, “MOV r64, DR”. #UD is raised. The invalid encodings are 64-bit only because `0xd5` is a valid instruction in 32-bit mode.	2024-02-23 12:49:05 -08:00
Stanislav Mekhanoshin	3dfca24dda	[AMDGPU] Fix encoding of VOP3P dpp on GFX11 and GFX12 (#82710 ) The bug affects dpp forms of v_dot2_f32_f16. The encoding does not match SP3 and does not set op_sel_hi bits properly.	2024-02-23 03:50:00 -08:00
John Brawn	48101edc8d	[AArch64] Fix syntax of gcsstr and gcssttr instructions (#82385 ) The address register should be surrounded by square brackets, like in all the other str instructions. Fixes https://github.com/llvm/llvm-project/issues/81846	2024-02-21 10:05:50 +00:00
Stanislav Mekhanoshin	98db8d0cb7	[AMDGPU] Fix v_dot2_f16_f16/v_dot2_bf16_bf16 operands (#82423 ) src0 and src1 are packed f16/bf16, we are printing literals like 0x40002000, but we cannot parse it.	2024-02-20 16:34:40 -08:00
Shilei Tian	2ad43fa467	[AMDGPU] Fix operand types for `V_DOT2_F32_BF16` (#82044 )	2024-02-20 08:25:01 -05:00
Stanislav Mekhanoshin	13e64958a0	[AMDGPU] Fix decoder for BF16 inline constants (#82276 ) Fix #82039.	2024-02-19 13:45:23 -08:00
Ivan Kosarev	0ec524b120	[AMDGPU][MC][True16] Support V_RCP/SQRT/RSQ/LOG/EXP_F16. (#81131 ) Also add missing v_ceil/floor_f16 tests. Includes https://github.com/llvm/llvm-project/pull/80892.	2024-02-19 15:50:48 +00:00
Shilei Tian	46734aa1e5	[AMDGPU] Use `bf16` instead of `i16` for bfloat (#80908 ) Currently we generally use `i16` to represent `bf16` in those tablegen files. This patch is trying to use `bf16` directly. Fix #79369.	2024-02-16 15:58:30 -05:00
Jay Foad	cb8f910035	[AMDGPU] Do not test both wave sizes for DSDIR disassembly (#81719 ) There is nothing in these instruction definitions that depends on wave size so testing both seems like overkill. The corresponding assembler tests do not do it.	2024-02-14 10:15:06 +00:00
Konstantin Zhuravlyov	cf55e61dd9	AMDGPU: Don't allow s_barrier on gfx12 (#81317 ) - s_barrier is not present on gfx12	2024-02-12 11:32:46 -05:00
Philipp Tomsich	fbba818a78	[AArch64] Add the Ampere1B core (#81297 ) The Ampere1B is Ampere's third-generation core implementing a superscalar, out-of-order microarchitecture with nested virtualization, speculative side-channel mitigation and architectural support for defense against ROP/JOP style software attacks. Ampere1B is an ARMv8.7+ implementation, adding support for the FEAT WFxT, FEAT CSSC, FEAT PAN3 and FEAT AFP extensions. It also includes all features of the second-generation Ampere1A, such as the Memory Tagging Extension and SM3/SM4 cryptography instructions.	2024-02-09 15:22:09 -08:00
Ivan Kosarev	7d19dc50de	[AMDGPU][True16] Support VOP3 source DPP operands. (#80892 )	2024-02-08 16:23:00 +00:00
XinWang10	d9e875dcc1	[X86][MC] Support encoding/decoding for APX variant LZCNT/TZCNT/POPCNT instructions (#79954 ) Two variants: promoted legacy, NF (no flags update). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2024-01-31 21:10:02 +08:00
Rageking8	5f4c89edd0	Fix unsigned typos (#76670 )	2024-01-27 22:20:08 -08:00
Shengchen Kan	8a4cb7b607	[X86][test] Add MRM7r/MRM7m entries in evex format enc/dec tests	2024-01-26 17:22:07 +08:00
XinWang10	02d56801ee	[X86] Support APX promoted RAO-INT and MOVBE instructions (#77431 ) R16-R31 were added to the GPRs in https://github.com/llvm/llvm-project/pull/70958. This patch supports the promoted RAO-INT and MOVBE instructions in EVEX space. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-01-26 14:33:45 +08:00
XinWang10	6d0080b5de	[X86] Support promoted ENQCMD, KEYLOCKER and USERMSR (#77293 ) R16-R31 were added to the GPRs in https://github.com/llvm/llvm-project/pull/70958. This patch supports the promoted ENQCMD, KEYLOCKER and USER-MSR instructions in EVEX space. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-01-26 14:24:43 +08:00
XinWang10	816cc9d24b	[X86][MC] Support Enc/Dec for NF BMI instructions (#76709 ) Promoted BMI instructions were supported in #73899	2024-01-25 10:33:14 +08:00
ostannard	5469010ba7	[AArch64] FP/SIMD is not mandatory for v8-R (#79004 ) The FP/SIMD instructions are optional for v8-R, so they should not be marked as a dependency of HasV8_0rOps. This had the effect of disabling some v8R-specific system registers when any of these features was disabled. I've moved these features to be enabled by default for Cortex-R82 (currently the only v8-R AArch64 core), matching the previous behavior, and clang's default. Based on a patch by Simi Pallipurath <simi.pallipurath@arm.com>	2024-01-24 13:12:03 +00:00
Mirko Brkušanin	7fdf608cef	[AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795 ) Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com> Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>	2024-01-24 13:43:07 +01:00
Mariusz Sikora	cfddb59be2	[AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/bf8 instructions (#78414 ) Add VOP1, VOP1_DPP8, VOP1_DPP16, VOP3, VOP3_DPP8, VOP3_DPP16 instructions that were supported on GFX940 (MI300): - V_CVT_F32_FP8 - V_CVT_F32_BF8 - V_CVT_PK_F32_FP8 - V_CVT_PK_F32_BF8 - V_CVT_PK_FP8_F32 - V_CVT_PK_BF8_F32 - V_CVT_SR_FP8_F32 - V_CVT_SR_BF8_F32 --------- Co-authored-by: Mateja Marjanovic <mateja.marjanovic@amd.com> Co-authored-by: Mirko Brkušanin <Mirko.Brkusanin@amd.com>	2024-01-24 12:21:15 +01:00
Ivan Kosarev	5a458767dd	[AMDGPU][True16] Support source DPP operands. (#79025 )	2024-01-23 09:52:49 +00:00
Shengchen Kan	5c68c6d70f	[X86] Support encoding/decoding and lowering for APX variant SHL/SHR/SAR/ROL/ROR/RCL/RCR/SHLD/SHRD (#78853 ) Four variants: promoted legacy, ND (new data destination), NF (no flags update) and NF_ND (NF + ND). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2024-01-23 10:23:27 +08:00
Stanislav Mekhanoshin	1000cefc04	[AMDGPU] Remove s_set_inst_prefetch_distance support from GFX12 (#78786 ) This instruction is not supported by GFX12.	2024-01-22 14:31:17 -08:00
XinWang10	d3cd1ce6ab	[X86] Add lowering tests for promoted CMPCCXADD and update CC representation (#78685 ) https://github.com/llvm/llvm-project/pull/76125 supported the enc/dec for CMPCCXADD instructions. This patch: 1. Adds a lowering test for promoted CMPCCXADD. 2. Updates the representation of the condition code for promoted CMPCCXADD to align with the existing one.	2024-01-22 11:32:03 +08:00
Mariusz Sikora	2c78f3b860	[AMDGPU][GFX12] Add tests for flat_atomic_pk (#78683 )	2024-01-19 12:08:17 +01:00
XinWang10	d124b02242	[X86][MC] Fix wrong encoding of promoted BMI instructions due to missing NoCD8 (#78386 ) Address review comments in #76709. Add `NoCD8` to class `ITy`, and rewrite the promoted instructions with `ITy` to avoid unexpected incorrect encodings caused by a missing `NoCD8`.	2024-01-19 00:27:16 +08:00
Piotr Sobczak	57f6a3f7ea	[AMDGPU] Add global_load_tr for GFX12 (#77772 ) Support new amdgcn_global_load_tr instructions for load with transpose. * MC layer support for GLOBAL_LOAD_TR_B64/GLOBAL_LOAD_TR_B128 * Intrinsic int_amdgcn_global_load_tr * Clang builtins amdgcn_global_load_tr*	2024-01-18 15:14:42 +01:00
Mariusz Sikora	3e6589f21c	[AMDGPU][GFX12] Add 16 bit atomic fadd instructions (#75917 ) - image_atomic_pk_add_f16 - image_atomic_pk_add_bf16 - ds_pk_add_bf16 - ds_pk_add_f16 - ds_pk_add_rtn_bf16 - ds_pk_add_rtn_f16 - flat_atomic_pk_add_f16 - flat_atomic_pk_add_bf16 - global_atomic_pk_add_f16 - global_atomic_pk_add_bf16 - buffer_atomic_pk_add_f16 - buffer_atomic_pk_add_bf16	2024-01-18 14:01:09 +01:00


2274 Commits