llvm-project

Author	SHA1	Message	Date
Diana Picus	bc6955f18c	[AMDGPU] Don't fix the scavenge slot at offset 0 (#79136 ) At the moment, the emergency spill slot is a fixed object for entry functions and chain functions, and a regular stack object otherwise. This patch adopts the latter behaviour for entry/chain functions too. It seems this was always the intention [1] and it will also save us a bit of stack space in cases where the first stack object has a large alignment. [1] `34c8b835b1`	2024-02-09 09:20:25 +01:00
Fangrui Song	9e9907f1cf	[AMDGPU,test] Change llc -march= to -mtriple= (#75982 ) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ```	2024-01-16 21:54:58 -08:00
pvanhout	89e91e4c0c	[AMDGPU] Remove post-PromoteAlloca SROA run PromoteAlloca now uses SSAUpdater, it doesn't need SROA to clean-up after it anymore. Internal testing shows no noticeable performance impact. Reviewed By: #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D156398	2023-08-11 08:29:21 +02:00
Nikita Popov	bdf2fbba9c	[AMDGPU] Convert some tests to opaque pointers (NFC)	2022-12-19 12:41:13 +01:00
Tony	2f499b9aff	[AMDGPU] Add volatile support to SIMemoryLegalizer Treat a non-atomic volatile load and store as a relaxed atomic at system scope for the address spaces accessed. This will ensure all relevant caches will be bypassed. A volatile atomic is not changed and still only bypasses caches upto the level specified by the SyncScope operand. Differential Revision: https://reviews.llvm.org/D94214	2021-01-09 00:52:33 +00:00
Scott Linder	60b1967c39	[AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in the entry function prologue. This allows us to removes the scratch wave offset register from the calling convention ABI. As part of this change, allow the use of an inline constant zero for the SOffset of MUBUF instructions accessing the stack in entry functions when a frame pointer is not requested/required. Entry functions with calls still need to set up the calling convention ABI stack pointer register, and reference it in order to address arguments of called functions. The ABI stack pointer register remains unswizzled, but is now wave-relative instead of queue-relative. Non-entry functions also use an inline constant zero SOffset for wave-relative scratch access, but continue to use the stack and frame pointers as before. When the stack or frame pointer is converted to a swizzled offset it is now scaled directly, as the scratch wave offset no longer needs to be subtracted first. Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling convention. Tags: #llvm Differential Revision: https://reviews.llvm.org/D75138	2020-03-19 15:35:16 -04:00
Yaxun Liu	2a22c5deff	[AMDGPU] Switch to the new addr space mapping by default This requires corresponding clang change. Differential Revision: https://reviews.llvm.org/D40955 llvm-svn: 324101	2018-02-02 16:07:16 +00:00
Matt Arsenault	3dbeefa978	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444	2017-03-21 21:39:51 +00:00
Matt Arsenault	707780b420	AMDGPU: Always allocate emergency stack slot at offset 0 This allows us to ensure that 0 is never a valid pointer to a user object, and ensures that the offset is always legal without needing a register to access it. This comes at the cost of usable offsets and wasted stack space. llvm-svn: 295877	2017-02-22 21:05:25 +00:00
Matt Arsenault	39787bdcbb	Reapply "AMDGPU: Don't use offen if it is 0" This reverts r283003 llvm-svn: 285203	2016-10-26 15:08:16 +00:00
Mehdi Amini	86eeda8e20	Revert "AMDGPU: Don't use offen if it is 0" This reverts commit r282999. Tests are not passing: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/20038 llvm-svn: 283003	2016-10-01 02:35:24 +00:00
Matt Arsenault	3070fdf798	AMDGPU: Don't use offen if it is 0 This removes many re-initializations of a base register to 0. llvm-svn: 282999	2016-10-01 01:37:15 +00:00
Matt Arsenault	03d8584590	AMDGPU: Move subtarget feature checks into passes llvm-svn: 273937	2016-06-27 20:32:13 +00:00
Tom Stellard	45bb48ea19	R600 -> AMDGPU rename llvm-svn: 239657	2015-06-13 03:28:10 +00:00

14 Commits