llvm-project

Author	SHA1	Message	Date
Dmitry Preobrazhensky	e7a306310b	[AMDGPU][GFX11] Correct tied src2 of v_fmac_f16_e64 src2 was incorrectly defined as VSrc_f16 but it is tied to dst which is VGPR_32. As a result, disassembler failed to decode src2. Differential Revision: https://reviews.llvm.org/D140299	2022-12-30 16:42:15 +03:00
Dmitry Preobrazhensky	9f40d9ffd1	[AMDGPU][MC][GFX11] Correct encoding of neg modifier for v_dot2_f32_bf16 Fix a bug with neg_lo:[0,1,0] and neg_hi:[0,1,0] modifiers - they are accepted but not encoded. Differential Revision: https://reviews.llvm.org/D140470	2022-12-30 16:25:22 +03:00
Lei Huang	7a7e9109a2	[PowerPC] Implement P10 Byte Reverse Instructions Generate brh, brw and brd instructions for byte-swap operations on P10, and generate a single instruction for a 32-bit swap followed by a 16-bit right shift. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D140414	2022-12-21 09:15:57 -06:00
Lucas Prates	d73cfc6710	[AArch64] Add missing v8.8a Non-maskable Interrupts feature This adds support for the missing Non-maskable Interrupts (FEAT_NMI) feature from armv8.8-A, which consists of the `ALLINT` pstate register. This is a second iteration of the patch from D131389, building on top of the D139925 changes that enable better support for `msr (immediate)` instructions that take 1-bit immediates. Contributors: * David Candler * Tomas Matheson * Sam Elliott Reviewed By: lenary, tmatheson Differential Revision: https://reviews.llvm.org/D140216	2022-12-19 15:08:30 +00:00
Lucas Prates	f516e91715	[AArch64] Add new v9.4-A PM pstate system register This adds support for the new PM pstate system register introduced by the v9.4-A Exception-based Event Profiling extension (FEAT_EBEP). The new PM pstate register takes a 1-bit immediate and requires different values to be specified for the higher bits of the Crm field. To enable that, this patch creates an explicit separation between the pstate system registers that take 4-bit and 1-bit immediate operands, allowing each entry to specify the value for the 3 high bits of Crm. This also updates other pstate registers to correctly accept 4-bit immediates, matching their decoding specification from the Arm ARM. These include: `PAN`, `UAO`, `DIT` and `SSBS`. More information about this extension and the new register can be found at: * https://developer.arm.com/documentation/ddi0601/2022-09/AArch64-Registers/PM--PMU-Exception-Mask Contributors: * Lucas Prates * Sam Elliott Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D139925	2022-12-19 15:07:52 +00:00
Archibald Elliott	947d4fb373	[AArch64] RASv2 Assembly Support This feature adds upstream support for FEAT_RASv2 and FEAT_PFAR. Both are system-register-only, but FEAT_RAS is behind the command-line extension "+ras", so FEAT_RASv2 is behind "+rasv2". This patch includes support for ID_AA64MMFR4_EL1. This is an ID system register so it is not behind any feature flags. Differential Revision: https://reviews.llvm.org/D139936	2022-12-16 14:37:35 +00:00
Petar Avramovic	cc6b10d1ee	AMDGPU: Check if operand RC contains register used when printing Disassembler can successfully decode sgpr register when only vgpr registers are valid for the operand (e.g. VReg_* and VISrc_* operands). In InstPrinter, detect when operand register class does not contain register that is being printed. Does not result in an error. Intended use is for disassembler tests. Differential Revision: https://reviews.llvm.org/D139646	2022-12-09 17:55:57 +01:00
Petar Avramovic	a1ceacd050	AMDGPU: Precommit wmma tests for D139646	2022-12-09 17:55:56 +01:00
Lucas Prates	2050e7ebe1	[Arm][AArch64] Add support for v8.9-A/v9.4-A base extensions This implements the base extensions that are part of the v8.9-A and v9.4-A architecture versions, including: * The Clear BHB Instruction (FEAT_CLRBHB) * The Speculation Restriction Instruction (FEAT_SPECRES2) * The SLC target for the PRFM instruction * New system registers: * ID_AA64PFR2_EL1 * ID_AA64MMFR3_EL1 * HFGITR2_EL2 * SCTLR2_EL3 More information on the new extensions can be found on: * https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022 * https://developer.arm.com/downloads/-/exploration-tools Contributors: Sam Elliott, Tomas Matheson and Son Tuan Vu. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D139424	2022-12-08 10:15:29 +00:00
James Y Knight	099001979f	[SPARC] Simplify instruction decoder. After https://reviews.llvm.org/D137653 named sub-operands can be used in the auto-generated instruction decoders. This allows the auto-generated decoders to work properly, so all the hand-coded decoders in the sparc target can be removed. In some instances, a manually-written decoder had not been implemented for an instruction, and thus that instruction was not decoded properly. These have been fixed (and tests added). Differential Revision: https://reviews.llvm.org/D137727	2022-12-07 14:37:08 -05:00
Joe Nash	bbfbec94b1	[AMDGPU] Enable OMod on more VOP3 instructions OMod was disabled if OpSel was enabled, but that restriction is more specific than necessary. Any VOP3 with float operands can use OMod. On GFX11, FMAC_F16_e64 can use op_sel. Previously, SIFoldOperands and convertToThreeAddress were accidentally correct when they reinterpreted the zero OMod operand on V_FMAC_F16_e64 as the OpSel operand on V_FMA_F16_gfx9_e64. Now we explicitly add op_sel if required. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D139469	2022-12-07 13:30:33 -05:00
Jay Foad	b9f3977b26	[AMDGPU] Add MC tests for s_endpgm's optional immediate operand Differential Revision: https://reviews.llvm.org/D139438	2022-12-06 17:04:50 +00:00
Jay Foad	c634e1a28a	[AMDGPU] Remove FIXME that was addressed by D99413	2022-12-06 14:50:47 +00:00
Kristina Bessonova	4e958b4d7c	[llvm-objdump] Avoid using mapping symbols as branch target labels The main motivation for this change is to avoid ambiguity: mapping symbol names may not be unique across a binary, so they do not uniquely identify a target address. As a result, mapping symbols used as branch target labels make llvm-objdump output less readable. Another point is that mapping symbols sometimes appear in non-allocatable sections, like debug info sections, which makes objdump output even more confusing. For example, a small AArch64 executable may contain plenty of `$d[.*]` symbols and none of them would be useful as a label for resolving a branch or a memory operand target address: ``` 0000000000000254 l .note.ABI-tag 0000000000000000 $d 00000000000008d4 l .eh_frame 0000000000000000 $d 0000000000000868 l .rodata 0000000000000000 $d 0000000000011028 l .data 0000000000000000 $d 0000000000010db8 l .fini_array 0000000000000000 $d 0000000000010db0 l .init_array 0000000000000000 $d 00000000000008e8 l .eh_frame 0000000000000000 $d 0000000000011034 l .bss 0000000000000000 $d ``` Note that GNU objdump doesn't use mapping symbols as branch target labels for all targets that support such symbols (ARM, AArch64, CSKY). Differential Revision: https://reviews.llvm.org/D139131	2022-12-06 12:19:12 +02:00
Maryam Moghadas	c19f905fed	[PowerPC] Implement xscmpeqqp, xscmpgeqp, xscmpgtqp instructions This patch adds 3 Power10 VSX Scalar compare for quad precision instructions including xscmpeqqp, xscmpgeqp, xscmpgtqp Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D138592	2022-12-01 15:01:49 -06:00
Tomas Matheson	7fea6f2e0e	[AArch64] Assembly support for VMSA Virtual Memory System Architecture (VMSA) This is part of the 2022 A-Profile Architecture extensions and adds support for the following: - Translation Hardening Extension (FEAT_THE) - 128-bit Page Table Descriptors (FEAT_D128) - 56-bit Virtual Address (FEAT_LVA3) - Support for 128-bit System Registers (FEAT_SYSREG128) - System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128) - 128-bit Atomic Instructions (FEAT_LSE128) - Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE) - Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE) - Memory Attribute Index Enhancement (FEAT_AIE) New instructions added: - FEAT_SYSREG128 adds MRRS and MSRR. - FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases. - FEAT_LSE128 adds LDCLRP, LDSETP, and SWPP instructions. - FEAT_THE adds the set of RCW* instructions. Specs for individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ Contributors: Keith Walker Lucas Prates Sam Elliott Son Tuan Vu Tomas Matheson Differential Revision: https://reviews.llvm.org/D138920	2022-11-30 13:37:02 +00:00
Mateja Marjanovic	595a08847a	[AMDGPU] Add support for new LLVM vector types Add VReg, AReg and SReg on AMDGPU for bit widths: 288, 320, 352 and 384. Differential Revision: https://reviews.llvm.org/D138205	2022-11-29 17:02:04 +01:00
Dmitry Preobrazhensky	9b8eb5fa8e	[AMDGPU][MC][GFX11] Correct op_sel handling for permlane*16 Differential Revision: https://reviews.llvm.org/D137969	2022-11-29 18:45:22 +03:00
Dmitry Preobrazhensky	869fc7eabd	[AMDGPU][MC][MI100+] Enable VOP3 variants of dot2c/dot4c/dot8c opcodes Differential Revision: https://reviews.llvm.org/D138494	2022-11-29 17:38:18 +03:00
Simon Tatham	e45cbf9923	[ARM,MVE] Update MVE_VMLA_qr for architecture change. In revision B.q and before of the Armv8-M architecture reference manual, the vector/scalar forms of the `vmla` and `vmlas` instructions came in signed and unsigned integer forms, such as `vmla.s8 q0,q1,r2` or `vmlas.u32 q3,q4,r5`. Revision B.r has changed this. There are no longer signed and unsigned versions of these instructions, since they were functionally identical anyway. Now there is just `vmla.i8` (or `i16` or `i32`, and similarly for `vmlas`). Bit 28 of the instruction encoding, which was previously 0 for signed or 1 for unsigned, is now expected to be 0 always. This change updates LLVM to the new version of the architecture. The obsoleted encodings for unsigned integers are now decoding errors, and only the still-valid encoding is ever emitted. This shouldn't break any existing assembly code, because the old signed and unsigned versions of the mnemonic are still accepted by the assembler (which is standard practice anyway for all signedness-agnostic MVE integer instructions). Reviewed By: dmgreen, lenary Differential Revision: https://reviews.llvm.org/D138827	2022-11-29 08:47:00 +00:00
Tomas Matheson	a6aaa969f7	[AArch64] Assembly support for FEAT_LRCPC3 This patch implements assembly support for the 2022 A-Profile Architecture extension FEAT_LRCPC3. FEAT_LRCPC3 is AArch64 only and introduces new variants of load/store instructions with release consistency ordering. Specs for individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ This feature is optionally available from v8.2a and therefore not enabled by default. Contributors: Lucas Prates Sam Elliott Son Tuan Vu Tomas Matheson Differential Revision: https://reviews.llvm.org/D138579	2022-11-25 18:59:07 +00:00
Archibald Elliott	605e22b7d4	[AArch64] Add Missing System Registers The following system registers have been missing upstream: - ID_DFR1_EL1 - AMCG1IDR_EL0 (present when FEAT_AMUv1p1 implemented - optional from v8.6-a) - HAFGRTR_EL2 (present when FEAT_AMUv1 and FEAT_FGT are implemented) With regards to HAFGRTR_EL2, this is only present when you have both extensions. As FEAT_FGT is part of a later architecture, we group it with those registers. In all honesty, this is a good example of the kinds of place where just enabling all system registers all the time would be easiest. Differential Revision: https://reviews.llvm.org/D138553	2022-11-24 17:47:48 +00:00
Lucas Prates	b0d4045dab	[AArch64] Add support for v8.9-A/v9.4-A Debug and PMU extensions This adds support for the 2022 Debug and PMU extensions that are part of the v8.9-A and v9.4-A architecture versions. This includes: * New architecture extension for the v9.4-A Instrumentation Extension (FEAT_ITE), including 'trcit' instruction and system registers * New system registers for: * 2022 Debug features (FEAT_Debugv8p9) * 2022 Performance Monitors Extension features (FEAT_PMUv3p9) * PMU Snapshot extension (FEAT_PMUv3_SS) * PMU Fixed-function instruction counter (FEAT_PMUv3_ICNTR) * System Performance Monitors Extension (FEAT_SPMU) * Synchronous-exception-based event profiling (FEAT_SEBEP) * Fine Grained Traps Extension (FEAT_FGT2) * SPE Data Source filtering (FEAT_SPE_FDS) More information on the new extensions can be found on: * https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022 * https://developer.arm.com/downloads/-/exploration-tools Changes by Son Tuan Vu, Sam Elliott and me. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D138556	2022-11-24 16:15:11 +00:00
Ties Stuij	cb261e30fb	[AArch64][clang] implement 2022 General Data-Processing instructions This patch implements the 2022 Architecture General Data-Processing Instructions They include: Common Short Sequence Compression (CSSC) instructions - scalar comparison instructions SMAX, SMIN, UMAX, UMIN (32/64 bits) with or without immediate - ABS (absolute), CNT (count non-zero bits), CTZ (count trailing zeroes) - command-line options for CSSC Associated with these instructions in the documentation is the Range Prefetch Memory (RPRFM) instruction, which signals to the memory system that data memory accesses from a specified range of addresses are likely to occur in the near future. The instruction lies in hint space, and is made unconditional. Specs for the individual instructions can be found here: https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/ contributors to this patch: - Cullen Rhodes - Son Tuan Vu - Mark Murray - Tomas Matheson - Sam Elliott - Ties Stuij Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D138488	2022-11-22 14:23:12 +00:00
Piotr Sobczak	de767db633	[AMDGPU] Add encoding tests for SALU_CYCLE_2/3 Add missing assembler/disassembler tests for INSTID_SALU_CYCLE_2 and INSTID_SALU_CYCLE_3 which are possible arguments in S_DELAY_ALU. Differential Revision: https://reviews.llvm.org/D138482	2022-11-22 11:41:59 +01:00
Maryam Moghadas	bd68070481	[PowerPC] Add new load/store with length instructions to Future CPU. This patch adds 8 new load and store with length instructions including lxvrl, lxvrll, stxvrl, stxvrll, lxvprl, lxvprll, stxvprl, stxvprll. Reviewed By: stefanp, amyk, saghir Differential Revision: https://reviews.llvm.org/D136992	2022-11-21 13:22:27 -06:00
Joe Nash	38f47d90db	[AMDGPU][MC][NFC] Rename VOP3 VOPC test files D136149 and D136148 renamed the MC test files for VOP3 promoted from VOP1 and VOP2 in a consistent way. Do the same for VOP3 coming from VOPC. Reviewed By: dp Differential Revision: https://reviews.llvm.org/D137950	2022-11-14 13:27:38 -05:00
Stefan Pintilie	1ef2a92d66	[PowerPC] Add the SUBFUS instruction to Future CPU. Add a new instruction called SUBFUS that does saturating subtract. This instruction is only for Future CPU. Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D137643	2022-11-10 08:32:29 -06:00
Keith Walker	00d98e6572	[AArch64] RME MEC instructions and system registers This patch adds assembler/disassembler support for RME MEC (Memory Encryption Contexts). Cache maintenance instructions added: - DC CIPAPA - DC CIGDPAPA System registers added: - MECIDR_EL2 - MECID_P0_EL2 - MECID_A0_EL2 - MECID_P1_EL2 - MECID_A1_EL2 - VMECID_P_EL2 - VMECID_A_EL2 - MECID_RL_A_EL3 Differential Revision: https://reviews.llvm.org/D137431	2022-11-10 14:05:12 +00:00
Sheng	e086b24d15	[M68k] Add support for atomic instructions This adds support for atomic_load, atomic_store, atomic_cmpxchg and atomic_rmw Fixes #48236 Reviewed by: myhsu, efriedma Differential Revision: https://reviews.llvm.org/D136525	2022-11-09 18:37:03 +08:00
Dmitry Preobrazhensky	6e279f5bb6	[AMDGPU][MC][GFX10+] Enable literal operands with permlane16/permlanex16 Differential Revision: https://reviews.llvm.org/D137332	2022-11-07 15:49:21 +03:00
Stefan Pintilie	9df924a634	[PowerPC] Add new DMR register classes to Future CPU. A new register class as well as a number of related subregisters are being added to Future CPU. These registers are Dense Math Registers (DMR) and are 1024 bits long. These registers can also be used in consecutive pairs, which leads to a register that is 2048 bits. This patch also adds 7 new instructions that use these registers. More instructions will be added in future patches. Reviewed By: amyk, saghir Differential Revision: https://reviews.llvm.org/D136366	2022-11-03 08:29:55 -05:00
Mirko Brkusanin	093200fd00	[AMDGPU][NFC] Split MC tests into promoted from VOP1 to VOP3 and only VOP3 Differential Revision: https://reviews.llvm.org/D136149	2022-11-02 12:30:23 +01:00
Mirko Brkusanin	7e1963b191	[AMDGPU][NFC] Split MC tests into promoted from VOP2 to VOP3 and only VOP3 Differential Revision: https://reviews.llvm.org/D136148	2022-11-02 12:30:23 +01:00
Freddy Ye	aee2a35ac4	[X86] Add AVX-NE-CONVERT instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D135930	2022-10-31 23:39:38 +08:00
David Green	5d67b051e2	[AArch64] Include SME2 in +all The sme2 predicate was an AssemblerPredicate, not AssemblerPredicateWithAll like all the other features, meaning it wasn't included in +all. This fixes that inconsistency, allowing the instructions to be decoded by default. Differential Revision: https://reviews.llvm.org/D137016	2022-10-31 13:04:32 +00:00
Daniel Thornburgh	75cdab6dc2	[llvm-objdump] Add --no-print-imm-hex to tests depending on it. This prepares for an upcoming change to make --print-imm-hex the default behavior of llvm-objdump. These tests were updated in a semi-automatic fashion. See D136972 for details.	2022-10-29 15:40:26 -07:00
Freddy Ye	23f02693ec	[X86] Add AVX-VNNI-INT8 instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D135938	2022-10-28 10:39:54 +08:00
Freddy Ye	0e720e6ada	[X86] Add AVX-IFMA instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D135932	2022-10-28 09:42:30 +08:00
Phoebe Wang	b51b90d6e2	[X86][1/2] SUPPORT RAO-INT For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Initially authored by Liu Chen (@LiuChen3) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135951	2022-10-27 17:20:07 +08:00
James Y Knight	26fdad031c	[MIPS] Fix useDeprecatedPositionallyEncodedOperands errors. This is a follow-on to https://reviews.llvm.org/D134073. The number of MIPS16 changes here is a bit surprising. Many of the fields with mismatched names were NOT previously choosing the correct argument positionally, but instead doing something completely wrong (e.g. it would encode a register where an immediate was expected). But, machine-code generation for MIPS16 has never actually functioned. It's also fully untested; thus, the MIPS16 changes, despite changing behavior, break (and fix) zero tests. This change does not fix MIPS16 output, but it ought to be at least incrementally less broken. Outside MIPS16, I believe the only functional change is to the 'ginvi' instruction: it was previously encoding garbage into a field which was specified to be '00'. Fortunately, it was covered by tests -- and the tests were testing the incorrect behavior. So, fixed. Differential Revision: https://reviews.llvm.org/D134220	2022-10-26 14:06:08 -04:00
Ulrich Weigand	96482ee434	[SystemZInstPrinter] Introduce markup tags emission SystemZ assembly syntax emission now leverages markup tags, if enabled. Author: Antonio Frighetto Differential Revision: https://reviews.llvm.org/D129868	2022-10-25 18:59:50 +02:00
Freddy Ye	fdac4c4e92	[X86] Add CMPCCXADD instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D135933	2022-10-25 14:33:39 +08:00
Xiang1 Zhang	661881d436	[X86] Add AMX-FP16 instructions. Differential Revision: https://reviews.llvm.org/D135941	2022-10-22 08:05:22 +08:00
Phoebe Wang	62ca79102c	[X86][1/2] Support PREFETCHI instructions For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136040	2022-10-20 08:46:01 +08:00
Freddy Ye	3ee58e2f35	[X86] Add WRMSRNS instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D135935	2022-10-19 13:04:11 +08:00
Freddy Ye	e3df4ba9d2	[X86] Add MSRLIST instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: skan, RKSimon Differential Revision: https://reviews.llvm.org/D135934	2022-10-19 10:35:42 +08:00
Dmitry Preobrazhensky	bf96703fb3	[AMDGPU][MC][GFX8+] Correct v_cndmask modifiers Correct v_cndmask_b32 to support abs/neg modifiers in dpp/sdwa/e64 variants. Correct v_cndmask_b16 for proper disassembly of abs/neg modifiers in e64_dpp variants. Differential Revision: https://reviews.llvm.org/D135900	2022-10-14 19:37:27 +03:00
Dmitry Preobrazhensky	4e62d02db9	[AMDGPU][MC] Correct image_gather4h Correct encoding of image_gather4h for GFX9; disable this instruction for SI, CI and VI. Differential Revision: https://reviews.llvm.org/D135605	2022-10-11 14:41:27 +03:00
Dmitry Preobrazhensky	8f8e4e3b38	[AMDGPU][MC][GFX11] Correct v_fmac_.*_e64_dpp Differential Revision: https://reviews.llvm.org/D134961	2022-10-07 16:21:55 +03:00


2073 Commits