This is basically a partial revert of https://reviews.llvm.org/D145586 ( fd1d60873fdc )
D145586 was originally introduced to help with SWDEV-363662, and it did, but
it also caused a 25% drop in performance in
some MIOpen benchmarks where, it seems,
functions are inlined more conservatively.
This patch restores the pre-D145586 behavior
for PromoteAlloca: functions with a non-entry CC
have a 32 VGPRs threshold, but only if the function
is not marked with "alwaysinline".
A good number of AMDGPU code makes uses of
the AMDGPUAlwaysInline pass anyway, so in our
backend "alwaysinline" seems very common.
This change does not affect SWDEV-363662 (the motivating issue for introducing D145586).
Fixes SWDEV-399519
Reviewed By: rampitec, #amdgpu
Differential Revision: https://reviews.llvm.org/D150551
Apparently it was used to work around some issue that has been fixed.
Removing it helps with high scratch usage observed in some cases due to failed alloca promotion.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D145586
Non-entry functions have 32 caller saved VGPRs available. If we
promote alloca to consume more registers we will have to spill
CSRs. There is no reason to eliminate scratch access to get
another scratch access instead.
Differential Revision: https://reviews.llvm.org/D110372