llvm-project

Author	SHA1	Message	Date
Shilei Tian	e963d0740e	[AMDGPU] Replace `isInlinableLiteral16` with specific version (#84402) The current implementation of `isInlinableLiteral16` assumes that a 16-bit inlinable literal is either an `i16` or an `fp16`. This is not always true because of `bf16`. However, we can't tell `fp16` and `bf16` apart by just looking at the value. This patch splits `isInlinableLiteral16` into three versions, for `i16`, `fp16`, and `bf16` respectively, and calls the corresponding version.	2024-03-08 14:49:52 -05:00
Shilei Tian	e9c1dbb408	Revert "[AMDGPU] Replace `isInlinableLiteral16` with specific version (#81345)" This reverts commit 530f0e64ec11327879c44f2fd55c7c28efdbaa2d because it breaks downstream.	2024-03-06 08:42:54 -05:00
Shilei Tian	530f0e64ec	[AMDGPU] Replace `isInlinableLiteral16` with specific version (#81345)	2024-03-04 08:40:42 -05:00
Fangrui Song	9e9907f1cf	[AMDGPU,test] Change llc -march= to -mtriple= (#75982) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outright. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ```	2024-01-16 21:54:58 -08:00
Nikita Popov	bdf2fbba9c	[AMDGPU] Convert some tests to opaque pointers (NFC)	2022-12-19 12:41:13 +01:00
Stanislav Mekhanoshin	381ded345b	[AMDGPU] Add S_MOV_B64_IMM_PSEUDO for wide constants This is to allow 64-bit constant rematerialization. If a constant is split into two separate moves initializing sub0 and sub1, as it is now, RA cannot rematerialize a 64-bit register. This gives a 10-20% uplift in a set of huge apps heavily using double-precision math. Fixes: SWDEV-292645 Differential Revision: https://reviews.llvm.org/D104874	2021-06-30 11:45:38 -07:00
Dmitry Preobrazhensky	1c9d681092	[AMDGPU][CODEGEN] Added support of new inline assembler constraints Added support for constraints 'I', 'J', 'B', 'C', 'DA', 'DB'. See https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D81651	2020-07-02 17:20:15 +03:00
Dmitry Preobrazhensky	b087b91c91	[AMDGPU][CODEGEN] Added 'A' constraint for inline assembler Summary: 'A' constraint requires an immediate int or fp constant that can be inlined in an instruction encoding. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D78494	2020-05-25 14:23:34 +03:00
Matt Arsenault	f9a42ed0a7	AMDGPU: Relax 32-bit SGPR register class Mostly use SReg_32 instead of SReg_32_XM0 for arbitrary values. This will allow the register coalescer to do a better job eliminating copies to m0. For GlobalISel, as a terrible hack, use SGPR_32 for things that should use SCC until booleans are solved. llvm-svn: 375267	2019-10-18 18:26:37 +00:00
Matt Arsenault	bb0a610599	AMDGPU: Fix capitalized register names in asm constraints This was a workaround a long time ago, but the canonical lower case names work now. llvm-svn: 363459	2019-06-14 21:16:06 +00:00
Daniil Fukalov	c9a098b314	[AMDGPU] Inline asm - added i16, half and i128 types support The AMDGPU inline assembler supports i16, half and i128 typed variables in constraints, but they were reported as errors. Needed to fix https://github.com/RadeonOpenCompute/ROCm/issues/341, e.g. to be able to load with global_load_dwordx4 to a 128-bit integer variable Differential Revision: https://reviews.llvm.org/D44920 llvm-svn: 334301	2018-06-08 16:29:04 +00:00
Matt Arsenault	3dbeefa978	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444	2017-03-21 21:39:51 +00:00
Marek Olsak	79c05871a2	AMDGPU/SI: Add back reverted SGPR spilling code, but disable it Suggested as a better solution by Matt. llvm-svn: 287942	2016-11-25 17:37:09 +00:00
Marek Olsak	a45dae458d	Revert "AMDGPU: Make m0 unallocatable" This reverts commit 124ad83dae04514f943902446520c859adee0e96. llvm-svn: 287932	2016-11-25 16:03:15 +00:00
Matt Arsenault	9e5c7b1031	AMDGPU: Make m0 unallocatable m0 may need to be written for spill code, so we don't want general code uses relying on the value stored in it. This introduces a few code quality regressions where copies from m0 are not coalesced into copies of a copy of m0. llvm-svn: 287841	2016-11-24 00:26:40 +00:00
Matt Arsenault	3d463193a9	AMDGPU: Default to using scalar mov to materialize immediate This is the conservatively correct way because it's easy to move or replace a scalar immediate. This was incorrect in the case when the register class wasn't known from the static instruction definition, but still needed to be an SGPR. The main example of this is when inlineasm has an SGPR constraint. Also start verifying the register classes of inlineasm operands. llvm-svn: 285762	2016-11-01 22:55:07 +00:00
Matt Arsenault	a609e2d5ce	AMDGPU: Relax SGPR asm constraint register class s should be SReg_32 to be as general as possible. This can avoid a copy from m0. llvm-svn: 280154	2016-08-30 20:50:08 +00:00
Tom Stellard	b3c3bda512	AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs. Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15342 llvm-svn: 255203	2015-12-10 02:12:53 +00:00

18 Commits
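
The most recent entry in the log (e963d0740e) says that `fp16` and `bf16` cannot be told apart by looking at the 16-bit value alone. The following is a minimal Python sketch of that point, not LLVM's implementation: the helper names are hypothetical, and the set of inlinable float values (0.0, ±0.5, ±1.0, ±2.0, ±4.0) is a simplifying assumption that leaves out the integer range and any subtarget-specific constants.

```python
import struct

# Assumption for this sketch: a simplified set of float values treated as
# inlinable. Integers and subtarget-specific constants are left out.
INLINE_FLOATS = [0.0, 0.5, -0.5, 1.0, -1.0, 2.0, -2.0, 4.0, -4.0]

def fp16_bits(x):
    """Bit pattern of x encoded as IEEE half precision (fp16)."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def bf16_bits(x):
    """Bit pattern of x as bf16: the upper 16 bits of its float32 encoding.
    Every value in INLINE_FLOATS is exactly representable, so plain
    truncation is exact here."""
    return struct.unpack("<I", struct.pack("<f", x))[0] >> 16

FP16_INLINE = {fp16_bits(v) for v in INLINE_FLOATS}
BF16_INLINE = {bf16_bits(v) for v in INLINE_FLOATS}

def is_inlinable_fp16(bits):
    # Hypothetical helper: the 16-bit pattern is interpreted as fp16.
    return bits in FP16_INLINE

def is_inlinable_bf16(bits):
    # Hypothetical helper: the same pattern is interpreted as bf16.
    return bits in BF16_INLINE

if __name__ == "__main__":
    for bits in (fp16_bits(1.0), bf16_bits(1.0)):
        print(hex(bits), is_inlinable_fp16(bits), is_inlinable_bf16(bits))
```

The same 16-bit pattern gets a different answer depending on the type it is assumed to have, which is why a single type-agnostic check cannot be correct for both formats.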