llvm-project

Author	SHA1	Message	Date
pvanhout	f104eb6e15	[AMDGPU] Reintroduce CC exception for non-inlined functions in Promote Alloca limits. This is basically a partial revert of https://reviews.llvm.org/D145586 (fd1d60873fdc). D145586 was originally introduced to help with SWDEV-363662, and it did, but it also caused a 25% drop in performance in some MIOpen benchmarks where, it seems, functions are inlined more conservatively. This patch restores the pre-D145586 behavior for PromoteAlloca: functions with a non-entry CC have a 32 VGPR threshold, but only if the function is not marked with "alwaysinline". A good amount of AMDGPU code makes use of the AMDGPUAlwaysInline pass anyway, so in our backend "alwaysinline" seems very common. This change does not affect SWDEV-363662 (the motivating issue for introducing D145586). Fixes SWDEV-399519. Reviewed By: rampitec, #amdgpu. Differential Revision: https://reviews.llvm.org/D150551	2023-05-23 09:01:39 +02:00
pvanhout	fd1d60873f	[AMDGPU] Remove CC exception for Promote Alloca Limits. Apparently it was used to work around some issue that has been fixed. Removing it helps with high scratch usage observed in some cases due to failed alloca promotion. Reviewed By: rampitec. Differential Revision: https://reviews.llvm.org/D145586	2023-04-13 08:48:34 +02:00
Roman Lebedev	7850ab2112	[NFC] Port an assortment of tests that invoke SROA to new pass manager	2022-12-01 21:17:18 +03:00
Matt Arsenault	50caf6936b	AMDGPU: Convert promote alloca tests to opaque pointers	2022-11-28 10:36:38 -05:00
Matt Arsenault	1310aa1688	AMDGPU: Use -passes for amdgpu-promote-alloca tests	2022-11-16 17:14:48 -08:00
Stanislav Mekhanoshin	cf74ef134c	[AMDGPU] Limit promote alloca max size in functions. Non-entry functions have 32 caller-saved VGPRs available. If we promote alloca to consume more registers we will have to spill CSRs. There is no reason to eliminate scratch access to get another scratch access instead. Differential Revision: https://reviews.llvm.org/D110372	2021-09-24 13:38:39 -07:00
Stanislav Mekhanoshin	54e2dc7537	[AMDGPU] Limit promote alloca to vector with VGPR budget. Allow only up to 1/4 of available VGPRs for the vectorization of any given alloca. Differential Revision: https://reviews.llvm.org/D82990	2020-07-01 15:57:24 -07:00

7 Commits
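
The limits described in the commit messages above (a 32 VGPR cap for non-entry functions that are not "alwaysinline", and at most 1/4 of the VGPR budget for any single vectorized alloca) can be sketched roughly as follows. This is a simplified standalone model, not the code in the AMDGPU backend: the `FunctionInfo` struct and the `vgprBudget` and `fitsVectorBudget` helpers are hypothetical names introduced for illustration, and the sketch assumes each VGPR holds 32 bits per lane.

```cpp
#include <algorithm>

// Hypothetical, simplified description of a function for this sketch.
// None of these names come from the LLVM sources.
struct FunctionInfo {
  bool IsEntryCC;      // true for kernels / entry calling conventions
  bool IsAlwaysInline; // true if marked "alwaysinline"
  unsigned MaxVGPRs;   // VGPRs assumed available to the function
};

// Non-entry functions have 32 caller-saved VGPRs (per D110372).
const unsigned CallerSavedVGPRs = 32;

// VGPR budget considered for alloca promotion. Per f104eb6e15, the
// 32 VGPR cap applies to non-entry functions unless they are "alwaysinline".
inline unsigned vgprBudget(const FunctionInfo &F) {
  if (!F.IsEntryCC && !F.IsAlwaysInline)
    return std::min(F.MaxVGPRs, CallerSavedVGPRs);
  return F.MaxVGPRs;
}

// Per D82990, a single alloca may use at most 1/4 of the budget.
// Assumption: one VGPR provides 32 bits per lane.
inline bool fitsVectorBudget(unsigned AllocaBytes, const FunctionInfo &F) {
  unsigned long long AllocaBits = 8ull * AllocaBytes;
  unsigned long long BudgetBits = 32ull * vgprBudget(F);
  return AllocaBits * 4 <= BudgetBits;
}
```

Under these assumptions, a callee that is not marked "alwaysinline" may promote an alloca of at most 32 bytes to a vector regardless of how many VGPRs the target offers, while a kernel or an "alwaysinline" function with 256 VGPRs available may promote up to 256 bytes.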