75 Commits

Author SHA1 Message Date
Matt Arsenault
1f52060000 AMDGPU: Use poison instead of undef in module lds pass 2023-09-02 11:33:26 -04:00
Juan Manuel MARTINEZ CAAMAÑO
4e43ba2599 [NFC][AMDGPULowerModuleLDSPass] Use shorter APIs in markUsedByKernel
* Use shorter versions of the LLVM API

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D155589
2023-07-19 09:54:53 +02:00
Juan Manuel MARTINEZ CAAMAÑO
fcbafc066c [NFC][AMDGPULowerModuleLDSPass] Cleanup of getTableLookupKernelIndex
* Do a single lookup when querying the map
* Use shorter versions of the LLVM API

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D155588
2023-07-19 09:54:53 +02:00
Jon Chesterfield
6043d4dfec [amdgpu] Accept an optional max to amdgpu-lds-size attribute for use in PromoteAlloca 2023-07-15 21:37:21 +01:00
Jon Chesterfield
d3316bc111 [amdgpu] Delete elide-module-lds attribute
Requires D155190

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155238
2023-07-14 00:36:33 +01:00
Jon Chesterfield
74e928a081 [amdgpu][lds] Remove recalculation of LDS frame from backend
Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.

Prior to this patch:
The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol
metadata so that the assembler can build lookup tables out of it. There is a fragile association between
kernel functions and named structs which is used to recompute the frame layout in the backend, with
fatal_errors catching inconsistencies in the second calculation.

After this patch:
The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same
absolute_symbol metadata that the assembler uses to place objects within that frame size.

Deleted the now dead allocation code from the backend. Left for a later cleanup:
- enabling lowering for anonymous functions
- removing the elide-module-lds attribute (test churn, it's not used by llc any more)
- adjusting the dynamic alignment check to not use symbol names

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155190
2023-07-13 23:54:38 +01:00
Jon Chesterfield
9418c40af7 [amdgpu][lds] Raise an explicit unimplemented error on absolute address LDS variables
These aren't implemented. They could be at moderate implementation
complexity. Raising an error is better than silently miscompiling.

Patching now because the patch at D155125 is a step towards using this metadata
more extensively as part of the lowering path and that will interact badly with
input variables with this annotation.

Lowering user defined variables at specific addresses would drop this error,
put them at the requested position in the frame during this pass, and then
use the same codegen that will be used for the kernel specific struct shortly.

Reviewed By: jmmartinez

Differential Revision: https://reviews.llvm.org/D155132
2023-07-13 11:32:03 +01:00
Juan Manuel MARTINEZ CAAMAÑO
367b1f28db [NFC][AMDGPULowerModuleLDSPass] Fix buildbot santizier failed to compile
It seems that the sanitizer-x86_64-linux-android wasn't able to deduce
the template argument:

  AMDGPULowerModuleLDSPass.cpp:1192:53: error: no viable constructor or
  deduction guide for deduction of template arguments of 'vector'
        auto TableLookupVariablesOrdered = sortByName(std::vector(

This patch makes the template argument explicit.
2023-07-12 11:08:16 +02:00
Juan Manuel MARTINEZ CAAMAÑO
3a75551e85 Reland "[NFC][AMDGPULowerModuleLDSPass] Factorize repetead sort code"
Fixed compilation error and reudndant copy warning

Differential Revision: https://reviews.llvm.org/D154977
2023-07-12 09:27:20 +02:00
Jon Chesterfield
e75ce77cd7 [amdgpu][lds] Fix missing markUsedByKernel calls and undef lookup table elements
More robust association between the kernels and lds struct.

Use poison instead of value() for lookup table elements introduced by dynamic lds lowering.

Extracted from D154946, new test from there verbatim. Segv fixed.

Fixes issues/63338

Fixes SWDEV-404491

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154972
2023-07-12 00:37:21 +01:00
Juan Manuel MARTINEZ CAAMAÑO
ebdd610ad4 Revert "[NFC][AMDGPULowerModuleLDSPass] Factorize repetead sort code"
This reverts commit 125b90749a98d6dc6b492883c9617f9e91ab60e0.
2023-07-11 17:08:59 +02:00
Juan Manuel MARTINEZ CAAMAÑO
125b90749a [NFC][AMDGPULowerModuleLDSPass] Factorize repetead sort code
Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D154970
2023-07-11 17:03:58 +02:00
Juan Manuel MARTINEZ CAAMAÑO
70bb5d2b9d [NFC][AMDGPULowerModuleLDSPass] Add const to some variables/parameters
Moving out some changes not related to the bugfix in https://reviews.llvm.org/D154946

Reviewed By: JonChesterfield, arsenm

Differential Revision: https://reviews.llvm.org/D154959
2023-07-11 15:51:57 +02:00
Juan Manuel MARTINEZ CAAMAÑO
abf081975e [NFC][AMDGPULowerModuleLDSPass] Remove dead variable 2023-07-11 12:35:21 +02:00
Jon Chesterfield
e17c1bb494 [amdgpu][nfc] Update comments on LDS lowering 2023-04-11 10:48:19 +01:00
Jon Chesterfield
0507448d82 [amdgpu] Implement dynamic LDS accesses from non-kernel functions
The premise here is to allow non-kernel functions to locate external LDS variables without using LDS or extra magic SGPRs to do so.

1/ First it crawls the callgraph to work out which external LDS variables are reachable from a given kernel
2/ Then it creates a new `extern char[0]` variable for each kernel, which will alias all the other extern LDS variables because that's the documented behaviour of these variables
3/ The address of that variable is written to a lookup table. The global variable is tagged with metadata to track what address it was allocated at by codegen
4/ The assembler builds the lookup table using the metadata
5/ Any non-kernel functions use the same magic intrinsic used by table lookups of non-dynamic LDS variables to find the address to use

Heavy overlap with the code paths taken for other lowering, in particular the same intrinsic is used to pass the dynamic scope information through the same sgpr as for table lookups of static LDS.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D144233
2023-04-04 20:06:34 +01:00
Jon Chesterfield
62951784f0 [amdgpu][nfc] Refactor prior to D144233 to remove noise from diff 2023-04-03 16:47:01 +01:00
Jon Chesterfield
78e6818049 [amdgpu][nfc] clang-format AMDGPULowerModuleLDS for easier merging 2023-03-22 01:49:53 +00:00
Jon Chesterfield
d70e7ea0d1 [amdgpu][nfc] Extract more functions in LowerModuleLDS, mark more methods static 2023-03-22 01:25:28 +00:00
Jon Chesterfield
e8ad2a051c [amdgpu][nfc] Comment and extract two functions in LowerModuleLDS 2023-03-21 23:39:20 +00:00
Jon Chesterfield
d3dda422bf [amdgpu][nfc] Replace ad hoc LDS frame recalculation with absolute_symbol MD
Post ISel, LDS variables are absolute values. Representing them as
such is simpler than the frame recalculation currently used to build assembler
tables from their addresses.

This is a precursor to lowering dynamic/external LDS accesses from non-kernel
functions.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D144221
2023-03-12 13:47:48 +00:00
Nikita Popov
576060fb41 [ReplaceConstant] Extract code for expanding users of constant (NFC)
AMDGPU implements some handy code for expanding all constexpr
users of LDS globals. Extract the core logic into ReplaceConstant,
so that it can be reused elsewhere.
2023-03-03 16:09:06 +01:00
Jon Chesterfield
bf579a7049 [amdgpu] Change LDS lowering default to hybrid
Postponed from D139433 until the bug fixed by D139874 could be resolved.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D141852
2023-02-24 15:20:12 +00:00
Guillaume Chatelet
87b6b347fc Revert D141134 "[NFC] Only expose getXXXSize functions in TypeSize"
The patch should be discussed further.

This reverts commit dd56e1c92b0e6e6be249f2d2dd40894e0417223f.
2023-01-06 15:27:50 +00:00
Guillaume Chatelet
dd56e1c92b [NFC] Only expose getXXXSize functions in TypeSize
Currently 'TypeSize' exposes two functions that serve the same purpose:
 - getFixedSize / getFixedValue
 - getKnownMinSize / getKnownMinValue

source : bf82070ea4/llvm/include/llvm/Support/TypeSize.h (L337-L338)

This patch offers to remove one of the two and stick to a single function in the code base.

Differential Revision: https://reviews.llvm.org/D141134
2023-01-06 15:24:52 +00:00
Matt Arsenault
6fed2c90d3 AMDGPU: Diagnose which LDS global failed to lower
Also lowercase the message to start since that seems to be the
prevailing convention for error messages.
2023-01-03 09:31:07 -05:00
Kazu Hirata
7e937d08e1 Don't include StringSwitch (NFC)
These files do not use llvm::StringSwitch.
2022-12-14 21:50:34 -08:00
Matt Arsenault
aa9bdd50c2 llvm-reduce: Fix invalid reductions with llvm.used
Fixes issue 59413.
2022-12-14 15:06:22 -05:00
Jon Chesterfield
d77ae7f251 [amdgpu] Reimplement LDS lowering
Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.

Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D139433
2022-12-07 22:02:54 +00:00
Nico Weber
a862d09a92 Revert "[amdgpu] Reimplement LDS lowering"
This reverts commit 982017240d7f25a8a6969b8b73dc51f9ac5b93ed.
Breaks check-llvm, see https://reviews.llvm.org/D139433#3974862
2022-12-06 12:01:36 -05:00
Jon Chesterfield
982017240d [amdgpu] Reimplement LDS lowering
Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.

Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D139433
2022-12-06 16:28:15 +00:00
Jon Chesterfield
56202c51d4 Revert "[amdgpu][lds] Use the same isKernel predicate consistently"
Looks like this composed poorly with a nominally independent patch, will fix
This reverts commit 0ba0398517778514eb44cb7ba9bf9d4d20a856e0.
2022-11-09 16:54:20 +00:00
Jon Chesterfield
0ba0398517 [amdgpu][lds] Use the same isKernel predicate consistently
isKernelCC != isKernel(F->getCallingConv())
There's a test case (lower-kernel-lds.ll) that explicitly skips amdgpu_ps
so this change picks the isKernel predicate that continues to skip that
calling convention.

isKernel returns true for AMDGPU_KERNEL and SPIR_KERNEL. isKernelCC also
returns true for other calling conventions.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D136599
2022-11-09 16:45:05 +00:00
Jon Chesterfield
e9a2aa6355 [amdgpu][lds] Use a consistent order of fields in generated structs
Avoids spurious and confusing test failures on changing implementation.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D136598
2022-11-09 15:57:41 +00:00
Jon Chesterfield
35f2584ef9 [amdgpu] Error, instead of miscompile, anonymous kernels using lds
The association between kernel and struct is done by symbol name.
This doesn't work robustly for anonymous kernels as shown by the modified
test case.

An alternative association between function and struct can be constructed
if necessary, probably though metadata, but on the basis that we currently
miscompile anonymous kernels and that they are difficult to construct from
application code and difficult to call from the runtime, this patch makes
it a fatal error for now.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134741
2022-09-28 16:30:04 +01:00
Jon Chesterfield
cdb9738963 [amdgpu] Expand all ConstantExpr users of LDS variables in instructions
Bug noted in D112717 can be sidestepped with this change.

Expanding all ConstantExpr involved with LDS up front makes the variable specialisation simpler. Excludes ConstantExpr that don't access LDS to avoid disturbing codegen elsewhere.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D133422
2022-09-14 07:55:46 +01:00
Jon Chesterfield
23f6c8d635 [amdgpu] Always, instead of mostly, remove unused LDS symbols
Currently LDS variables are removed by the lower module pass
if they have a use which is caught by the replace with struct control flow.
This makes tests brittle to changes to that control flow which induces
noise when trying to improve lowering. Some tests already check that
variables are removed, while others checked that they are not removed.

LDS variables are not (currently) externally accessible, and if that
changes the machinery which makes them externally accessible will look
like a use. This change therefore breaks no applications.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D133028
2022-09-07 18:28:16 +01:00
Kazu Hirata
fedc59734a [llvm] Use range-based for loops (NFC) 2022-09-03 11:17:40 -07:00
Jon Chesterfield
a28bbd00c6 [amdgpu][nfc] Factor predicate out of findLDSVariablesToLower 2022-08-31 15:44:51 +01:00
Dmitri Gribenko
b435da027d [amdgpu][nfc] Fix build with a certan Clang version
It errors out in the Bazel CI:

AMDGPULowerModuleLDSPass.cpp:384:12: error: chosen constructor is
explicit in copy-initialization
    return {SGV, std::move(Map)};

Reviewed By: rupprecht

Differential Revision: https://reviews.llvm.org/D130623
2022-07-27 17:29:36 +02:00
Jon Chesterfield
3ccd88f209 [amdgpu][nfc] Separate processUsedLDS into independent pieces, rename it 2022-07-27 01:55:43 +01:00
Jon Chesterfield
9981afdd42 [amdgpu][nfc] Extract kernel annotation from processUsedLDS 2022-07-27 01:38:41 +01:00
Jon Chesterfield
923b90bddb [amdgpu][nfc] Separate LDS struct creation from RAUW 2022-07-26 20:59:17 +01:00
Jon Chesterfield
26dcc7e64a [amdgpu][nfc] Skip operations on padding fields in LDS struct 2022-07-26 18:31:02 +01:00
Jon Chesterfield
2224bbcd74 [nfc][amdgpu] LDS. Move selection logic up the stack. 2022-07-19 17:20:19 +01:00
Jon Chesterfield
eda2bcad02 [nfc][amdgpu] Remove dead variable and function 2022-07-15 23:56:43 +01:00
Jon Chesterfield
bc78c09952 [amdgpu] Elide module lds allocation in kernels with no callees
Introduces a string attribute, amdgpu-requires-module-lds, to allow
eliding the module.lds block from kernels. Will allocate the block as before
if the attribute is missing or has its default value of true.

Patch uses the new attribute to detect the simplest possible instance of this,
where a kernel makes no calls and thus cannot call any functions that use LDS.

Tests updated to match, coverage was already good. Interesting cases is in
lower-module-lds-offsets where annotating the kernel allows the backend to pick
a different (in this case better) variable ordering than previously. A later
patch will avoid moving kernel variables into module.lds when the kernel can
have this attribute, allowing optimal ordering and locally unused variable
elimination.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122091
2022-05-04 22:42:07 +01:00
Sebastian Neubauer
6527b2a4d5 [AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235
2022-02-18 15:05:21 +01:00
Stanislav Mekhanoshin
c7eb846345 [AMDGPU] Merge AMDGPULDSUtils into AMDGPUMemoryUtils
Differential Revision: https://reviews.llvm.org/D119502
2022-02-11 10:32:24 -08:00
Jon Chesterfield
24b28db8cc [amdgpu] Increase alignment of all LDS variables
Currently the superalign option only increases the alignment of
variables that are moved into the module.lds block. Change that to all LDS
variables. Also only increase the alignment once, instead of once per function.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D115488
2021-12-12 19:30:32 +00:00