* Do a single lookup when querying the map
* Use shorter versions of the LLVM API
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D155588
Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.
Prior to this patch:
The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol
metadata so that the assembler can build lookup tables out of it. There is a fragile association between
kernel functions and named structs which is used to recompute the frame layout in the backend, with
fatal_errors catching inconsistencies in the second calculation.
After this patch:
The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same
absolute_symbol metadata that the assembler uses to place objects within that frame size.
Deleted the now dead allocation code from the backend. Left for a later cleanup:
- enabling lowering for anonymous functions
- removing the elide-module-lds attribute (test churn, it's not used by llc any more)
- adjusting the dynamic alignment check to not use symbol names
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D155190
These aren't implemented. They could be at moderate implementation
complexity. Raising an error is better than silently miscompiling.
Patching now because the patch at D155125 is a step towards using this metadata
more extensively as part of the lowering path and that will interact badly with
input variables with this annotation.
Lowering user defined variables at specific addresses would drop this error,
put them at the requested position in the frame during this pass, and then
use the same codegen that will be used for the kernel specific struct shortly.
Reviewed By: jmmartinez
Differential Revision: https://reviews.llvm.org/D155132
It seems that the sanitizer-x86_64-linux-android wasn't able to deduce
the template argument:
AMDGPULowerModuleLDSPass.cpp:1192:53: error: no viable constructor or
deduction guide for deduction of template arguments of 'vector'
auto TableLookupVariablesOrdered = sortByName(std::vector(
This patch makes the template argument explicit.
More robust association between the kernels and lds struct.
Use poison instead of value() for lookup table elements introduced by dynamic lds lowering.
Extracted from D154946, new test from there verbatim. Segv fixed.
Fixes issues/63338
Fixes SWDEV-404491
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D154972
The premise here is to allow non-kernel functions to locate external LDS variables without using LDS or extra magic SGPRs to do so.
1/ First it crawls the callgraph to work out which external LDS variables are reachable from a given kernel
2/ Then it creates a new `extern char[0]` variable for each kernel, which will alias all the other extern LDS variables because that's the documented behaviour of these variables
3/ The address of that variable is written to a lookup table. The global variable is tagged with metadata to track what address it was allocated at by codegen
4/ The assembler builds the lookup table using the metadata
5/ Any non-kernel functions use the same magic intrinsic used by table lookups of non-dynamic LDS variables to find the address to use
Heavy overlap with the code paths taken for other lowering, in particular the same intrinsic is used to pass the dynamic scope information through the same sgpr as for table lookups of static LDS.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D144233
Post ISel, LDS variables are absolute values. Representing them as
such is simpler than the frame recalculation currently used to build assembler
tables from their addresses.
This is a precursor to lowering dynamic/external LDS accesses from non-kernel
functions.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D144221
AMDGPU implements some handy code for expanding all constexpr
users of LDS globals. Extract the core logic into ReplaceConstant,
so that it can be reused elsewhere.
Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.
Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139433
Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.
Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139433
isKernelCC != isKernel(F->getCallingConv())
There's a test case (lower-kernel-lds.ll) that explicitly skips amdgpu_ps
so this change picks the isKernel predicate that continues to skip that
calling convention.
isKernel returns true for AMDGPU_KERNEL and SPIR_KERNEL. isKernelCC also
returns true for other calling conventions.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D136599
The association between kernel and struct is done by symbol name.
This doesn't work robustly for anonymous kernels as shown by the modified
test case.
An alternative association between function and struct can be constructed
if necessary, probably though metadata, but on the basis that we currently
miscompile anonymous kernels and that they are difficult to construct from
application code and difficult to call from the runtime, this patch makes
it a fatal error for now.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D134741
Bug noted in D112717 can be sidestepped with this change.
Expanding all ConstantExpr involved with LDS up front makes the variable specialisation simpler. Excludes ConstantExpr that don't access LDS to avoid disturbing codegen elsewhere.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D133422
Currently LDS variables are removed by the lower module pass
if they have a use which is caught by the replace with struct control flow.
This makes tests brittle to changes to that control flow which induces
noise when trying to improve lowering. Some tests already check that
variables are removed, while others checked that they are not removed.
LDS variables are not (currently) externally accessible, and if that
changes the machinery which makes them externally accessible will look
like a use. This change therefore breaks no applications.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D133028
It errors out in the Bazel CI:
AMDGPULowerModuleLDSPass.cpp:384:12: error: chosen constructor is
explicit in copy-initialization
return {SGV, std::move(Map)};
Reviewed By: rupprecht
Differential Revision: https://reviews.llvm.org/D130623
Introduces a string attribute, amdgpu-requires-module-lds, to allow
eliding the module.lds block from kernels. Will allocate the block as before
if the attribute is missing or has its default value of true.
Patch uses the new attribute to detect the simplest possible instance of this,
where a kernel makes no calls and thus cannot call any functions that use LDS.
Tests updated to match, coverage was already good. Interesting cases is in
lower-module-lds-offsets where annotating the kernel allows the backend to pick
a different (in this case better) variable ordering than previously. A later
patch will avoid moving kernel variables into module.lds when the kernel can
have this attribute, allowing optimal ordering and locally unused variable
elimination.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122091
Currently the superalign option only increases the alignment of
variables that are moved into the module.lds block. Change that to all LDS
variables. Also only increase the alignment once, instead of once per function.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D115488