llvm-project

Author	SHA1	Message	Date
Joseph Huber	a3bd87b100	[AMDGPU] Call the `FINI_ARRAY` destructors in the correct order (#71815) Summary: The AMDGPU backend uses the linker-provided INIT_ARRAY and FINI_ARRAY sections to call all the global constructors in a single kernel. Previously this mistakenly used the same iteration logic for both arrays. The destructors stored in FINI_ARRAY are stored in the same order as the ones in the INIT_ARRAY section so we need to traverse it in reverse order. Relanding after the revert in fe7b5e2cfcf6848287010291081f85fa1f6bb2ef using the IR builder interface instead of ConstantExpr.	2023-11-10 11:01:02 -06:00
Nikita Popov	fe7b5e2cfc	Revert "[AMDGPU] Call the `FINI_ARRAY` destructors in the correct order (#71815)" This reverts commit c1d5865a313d0a8a254b37c852bdd444453c0f73. Introduces a new use of ConstantExpr::getAShr().	2023-11-10 17:01:06 +01:00
Joseph Huber	c1d5865a31	[AMDGPU] Call the `FINI_ARRAY` destructors in the correct order (#71815) Summary: The AMDGPU backend uses the linker-provided INIT_ARRAY and FINI_ARRAY sections to call all the global constructors in a single kernel. Previously this mistakenly used the same iteration logic for both arrays. The destructors stored in FINI_ARRAY are stored in the same order as the ones in the INIT_ARRAY section so we need to traverse it in reverse order.	2023-11-10 09:34:04 -06:00
Alex Richardson	e39f6c1844	[opt] Infer DataLayout from triple if not specified There are many tests that specify a target triple/CPU flags but no DataLayout, which can lead to IR being generated that has unusual behaviour. This commit attempts to use the default DataLayout based on the relevant flags if there is no explicit override on the command line or in the IR file. One thing that is currently not possible is to differentiate a missing datalayout from an explicit `target datalayout = ""` in the IR file, since the current APIs don't allow detecting this case. If it is considered useful to support this case (instead of passing "-data-layout=" on the command line), I can change IR parsers to track whether they have seen such a directive and change the callback type. Differential Revision: https://reviews.llvm.org/D141060	2023-10-26 12:07:37 -07:00
Alex Richardson	83c4227ab7	Auto-generate test checks for tests affected by D141060 These files had manual CHECK lines which make the diff from D141060 very difficult to review.	2023-10-04 10:51:35 -07:00
Joseph Huber	3590945a11	[AMDGPU] Add attribute to AMDGPU ctor / dtor to indicate single threadedness We only expect these ctor / dtor functions to be called with a single thread. Add the appropriate attributes to indicate this to the backend. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D151153	2023-05-24 07:24:17 -05:00
Joseph Huber	4a1236e0f6	[AMDGPU] Add an option to disable manual ctor / dtor lowering Currently AMDGPU offers extra ctor / dtor lowering by emitting a kernel that can be called. It's possible to handle ctors and dtors using the standard method as shown in D149340's commit message. In that case we don't need these extra kernels, as they won't be called. This patch simply adds a way to conditionally turn off this handling if we do not want to get extra kernels in the output. Unrelated, but we could convert this handling to an ODR function that simply calls the code in D149340 constructed via LLVM-IR. That would handle priority correctly and would then be correct if not run in LTO mode. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D150565	2023-05-23 09:03:10 -05:00
Joseph Huber	8b132747cd	[AMDGPU] Rewrite device ctor / dtor handling to use .init / .fini sections Currently, AMDGPU has special handling for constructors and destructors. We manually emit a kernel that calls the functions listed in the global constructor / destructor list. This currently has two main problems. The first is that we do not respect the priority and simply call them in any order. The second is that we redefine the symbol unconditionally, which could have a different definition, meaning we cannot merge any code with a constructor post-codegen. This patch changes the handling to instead use the standard support for traversing the `.init_array` and `.fini_array` sections the compiler creates. This allows us to emit a single kernel with `odr` semantics, so even if we emit this multiple times they will be merged into a single kernel. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D150675	2023-05-19 16:22:01 -05:00
Joseph Huber	a1da746157	[AMDGPU] Place global constructors in .init_array and .fini_array For the GPU, we emit external kernels that call the initializers and constructors; however, if we had a persistent kernel like the `_start` kernel for the `libc` project, we could use the standard way of calling constructors. This patch adds new global variables containing pointers to the constructors to be called. If these are placed in the `.init_array` and `.fini_array` sections, then the backend will handle them specially. The linker will then provide the `__init_array_` and `__fini_array_` sections to traverse them. An implementation would look like this. ``` extern uintptr_t __init_array_start[]; extern uintptr_t __init_array_end[]; extern uintptr_t __fini_array_start[]; extern uintptr_t __fini_array_end[]; using InitCallback = void(int, char **, char **); using FiniCallback = void(void); extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void _start(int argc, char **argv, char **envp) { uint64_t init_array_size = __init_array_end - __init_array_start; for (uint64_t i = 0; i < init_array_size; ++i) reinterpret_cast<InitCallback *>(__init_array_start[i])(argc, argv, envp); uint64_t fini_array_size = __fini_array_end - __fini_array_start; for (uint64_t i = 0; i < fini_array_size; ++i) reinterpret_cast<FiniCallback *>(__fini_array_start[i])(); } ``` Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D149340	2023-04-29 08:40:19 -05:00
Nikita Popov	bdf2fbba9c	[AMDGPU] Convert some tests to opaque pointers (NFC)	2022-12-19 12:41:13 +01:00
Matt Arsenault	9cc0779c4e	AMDGPU: Erase llvm.global_ctors/global_dtors after lowering We should be able to run the pass multiple times without breaking anything. If we still need to track these for some reason, we could replace with new entries for the kernels.	2022-12-09 14:25:32 -05:00
Matt Arsenault	f23f26032d	AMDGPU: Port AMDGPUCtorDtorLowering to new PM	2022-12-09 13:43:38 -05:00
Praveen Velliengiri	e90b512c4d	[AMDGPU] Change ASAN init/fini kernels linkage to external. The HSA runtime fails to find the symbols for the Init and Fini kernels because they are marked with internal linkage; change the linkage to external to fix those errors. Differential Revision: https://reviews.llvm.org/D110054	2021-09-27 11:50:37 -06:00
Reshabh Sharma	5173854f19	[AMDGPU] Handle functions in llvm's global ctors and dtors list This patch introduces a new code object metadata field, ".kind" which is used to add support for init and fini kernels. HSAStreamer will use function attributes, "device-init" and "device-fini" to distinguish between init and fini kernels from the regular kernels and will emit metadata with ".kind" set to "init" and "fini" respectively. To reduce the number of init and fini kernels, the ctors and dtors present in the llvm's global.ctors and global.dtors lists are called from a single init and fini kernel respectively. Reviewed by: yaxunl Differential Revision: https://reviews.llvm.org/D105682	2021-08-06 15:53:33 +05:30
Reshabh Sharma	dce35ef104	Revert "[AMDGPU] Handle functions in llvm's global ctors and dtors list" This reverts commit d42e70b3d315645e37f3b1455d39e68678e69525.	2021-08-04 23:33:31 +05:30
Reshabh Sharma	d42e70b3d3	[AMDGPU] Handle functions in llvm's global ctors and dtors list This patch introduces a new code object metadata field, ".kind" which is used to add support for init and fini kernels. HSAStreamer will use function attributes, "device-init" and "device-fini" to distinguish between init and fini kernels from the regular kernels and will emit metadata with ".kind" set to "init" and "fini" respectively. To reduce the number of init and fini kernels, the ctors and dtors present in the llvm's global.ctors and global.dtors lists are called from a single init and fini kernel respectively. Reviewed by: yaxunl Differential Revision: https://reviews.llvm.org/D105682	2021-08-04 19:53:33 +05:30

16 Commits
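
The traversal order described in the `FINI_ARRAY` commits above can be illustrated on the host. The following is only a sketch under stated assumptions: it is not the code the AMDGPU backend emits (that is built with the IR builder), and the names `Callback`, `init_array`, `fini_array`, `run_init`, `run_fini`, `run_all`, and the `ctor_*` / `dtor_*` functions are hypothetical stand-ins for the linker-provided sections and their entries. It shows constructors being called front to back and destructors back to front.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callback type; the real sections hold raw function addresses.
using Callback = void (*)();

// Records the call order so the traversal direction can be observed.
static std::vector<std::string> call_log;

static void ctor_a() { call_log.push_back("ctor_a"); }
static void ctor_b() { call_log.push_back("ctor_b"); }
static void dtor_a() { call_log.push_back("dtor_a"); }
static void dtor_b() { call_log.push_back("dtor_b"); }

// Stand-ins (assumption) for the linker-provided init / fini arrays.
// Both are stored in the same order, as the commit message describes.
static Callback init_array[] = {ctor_a, ctor_b};
static Callback fini_array[] = {dtor_a, dtor_b};
static const std::size_t kNumEntries = 2;

// Constructors: walk the range from the first entry to the last.
void run_init(Callback *begin, Callback *end) {
  assert(begin <= end);
  for (Callback *it = begin; it != end; ++it)
    (*it)();
}

// Destructors: walk the same kind of range from the last entry to the first.
void run_fini(Callback *begin, Callback *end) {
  assert(begin <= end);
  for (Callback *it = end; it != begin;)
    (*--it)();
}

// Runs both phases and returns the observed call order.
std::vector<std::string> run_all() {
  call_log.clear();
  run_init(init_array, init_array + kNumEntries);
  run_fini(fini_array, fini_array + kNumEntries);
  return call_log;
}
```

Calling `run_all()` in this sketch yields the constructors in array order followed by the destructors in the opposite order, which is the behaviour the relanded commit aims for; using the same forward loop for both arrays would reproduce the earlier bug.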