llvm-project

Author	SHA1	Message	Date
Jay Foad	d748c81218	[AMDGPU] Change the immediate operand of s_waitcnt_depctr / s_wait_alu (#169378) The 16-bit immediate operand of s_waitcnt_depctr / s_wait_alu has some unused bits. Previously codegen would set these bits to 1, but setting them to 0 matches the SP3 assembler behaviour better, which in turn means that we can print them using the human readable SP3 syntax: s_wait_alu 0xfffd ; unused bits set to 1 s_wait_alu 0xff9d ; unused bits set to 0 s_wait_alu depctr_va_vcc(0) ; unused bits set to 0, human readable Note that the set of unused bits changed between GFX10.1 and GFX10.3.	2025-11-25 11:55:26 +00:00
Christudasan Devadasan	a2dc4e02e7	[AMDGPU] Enable multi-group xnack replay in hardware (GFX1250) (#169016) This patch enables the multi-group xnack replay mode by configuring the hardware MODE register at kernel entry. This aligns the hardware behavior with the compiler's existing multi-group s_wait_xcnt insertion logic.	2025-11-21 19:42:17 +05:30
Ivan Kosarev	9e55d81c68	[AMDGPU][AsmParser] Introduce MC representation for lit() and lit64(). (#160316) And rework the lit64() support to use it. The rules for when to add lit64() can be simplified and improved. In this change, however, we just follow the existing conventions on the assembler and disassembler sides. In codegen we do not (and normally should not need to) add explicit lit() and lit64() modifiers, so the codegen tests lose them. The change is an NFCI otherwise. Simplifies printing operands.	2025-09-24 12:35:50 +01:00
Stanislav Mekhanoshin	2346968807	[AMDGPU] Add V_ADD|SUB|MUL_U64 gfx1250 opcodes (#150291)	2025-07-23 13:17:56 -07:00
Stanislav Mekhanoshin	a32040e483	[AMDGPU] Use 64-bit literals in codegen on gfx1250 (#148727)	2025-07-14 15:47:24 -07:00
Jay Foad	4e70720139	[AMDGPU] Add some gfx1200 test coverage	2024-06-27 14:53:59 +01:00
Fangrui Song	9e9907f1cf	[AMDGPU,test] Change llc -march= to -mtriple= (#75982) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outright. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ```	2024-01-16 21:54:58 -08:00
Jay Foad	0d40831765	[AMDGPU] Allow folding to FMAAK with SGPR and immediate operand on GFX10+ (#72266) Allow foldImmediate to create instructions like: v_fmaak_f32 v0, s0, v0, 0x42000000 This instruction has two "scalar values": s0 and 0x42000000. On GFX10+ this is allowed. This fold was originally implemented before the compiler supported GFX10, when all ASICs were limited to one scalar value.	2023-11-28 14:36:37 +00:00
Mirko Brkusanin	a657deb42e	[AMDGPU] Update RUN line in test (NFC)	2023-09-22 12:41:54 +02:00
Mirko Brkušanin	ecfdc23dd2	[AMDGPU] Select gfx1150 SALU Float instructions (#66885)	2023-09-21 12:22:55 +02:00
Mirko Brkusanin	1e5359c6ba	[AMDGPU] Treat KIMM32 and KIMM16 operand types as noninlinable. While they represent 32/16-bit immediate values, they are already included in the encoding of the instructions that use them and are not true literals. FMAMK and FMAAK instructions that use them are marked with fixed size so getInstSizeInBytes will not increase the size for these operands. We also add tests whose logic relies on KIMM16 and KIMM32 being considered not inlinable. Differential Revision: https://reviews.llvm.org/D157624	2023-08-11 18:46:39 +02:00
Matt Arsenault	4b1702e87a	AMDGPU: Fix counting source modifiers as literal constants. This fixes overestimating code size. This was broken by 79f52af4cd9a76485dd50bcdbb5d393eb7a70103. https://reviews.llvm.org/D157103	2023-08-07 18:40:16 -04:00

12 Commits
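
Commit d748c81218 in the log above quotes two encodings of the same s_wait_alu immediate: 0xfffd with the unused bits set to 1 and 0xff9d with them set to 0. The sketch below only illustrates that arithmetic. The mask is inferred from those two quoted values and is an assumption, not the definition used in LLVM; the commit itself notes that the set of unused bits differs between GFX10.1 and GFX10.3. The function names are hypothetical.

```python
# Encodings quoted in the commit message for d748c81218.
OLD_ENCODING = 0xFFFD  # unused bits set to 1
NEW_ENCODING = 0xFF9D  # unused bits set to 0


def infer_unused_mask(old: int, new: int) -> int:
    """Return the bits that are 1 in `old` and 0 in `new` (16-bit operand)."""
    return old & ~new & 0xFFFF


def clear_unused_bits(imm: int, unused_mask: int) -> int:
    """Zero the unused bits of a 16-bit immediate; other bits are untouched."""
    return imm & ~unused_mask & 0xFFFF


# ASSUMPTION: this mask is derived only from the two example values above.
# It is not taken from the LLVM sources and may not hold for every target.
UNUSED_MASK = infer_unused_mask(OLD_ENCODING, NEW_ENCODING)

if __name__ == "__main__":
    print(f"inferred unused mask: {UNUSED_MASK:#06x}")
    print(f"re-encoded immediate: {clear_unused_bits(OLD_ENCODING, UNUSED_MASK):#06x}")
```

Clearing the inferred mask from the old encoding reproduces the new one, which is consistent with the commit's statement that only the unused bits changed.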