llvm-project

Author	SHA1	Message	Date
Matt Arsenault	a6fc489bb7	AMDGPU: Add gfx950 subtarget definitions (#116307 ) Mostly a stub, but adds some baseline tests and tests for removed instructions.	2024-11-18 10:41:14 -08:00
Kazu Hirata	e8a6624325	[CodeGen] Remove unused includes (NFC) (#116459 ) Identified with misc-include-cleaner.	2024-11-16 07:37:13 -08:00
Shilei Tian	de0fd64bed	[AMDGPU] Introduce a new generic target `gfx9-4-generic` (#115190 ) This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.	2024-11-12 23:11:05 -05:00
Sergio Afonso	d87964de78	[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533 ) This patch implements an approach to communicate errors between the OMPIRBuilder and its users. It introduces `llvm::Error` and `llvm::Expected` objects to replace the values returned by callbacks passed to `OMPIRBuilder` codegen functions. These functions then check the result for errors when callbacks are called and forward them back to the caller, which has the flexibility to recover, exit cleanly or dump a stack trace. This prevents a failed callback to leave the IR in an invalid state and still continue the codegen process, triggering unrelated assertions or segmentation faults. In the case of MLIR to LLVM IR translation of the 'omp' dialect, this change results in the compiler emitting errors and exiting early instead of triggering a crash for not-yet-implemented errors. The behavior in Clang and openmp-opt stays unchanged, since callbacks will continue always returning 'success'.	2024-10-25 11:30:16 +01:00
Jay Foad	4dd55c567a	[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399 ) Follow up to #109133.	2024-10-24 10:23:40 +01:00
Carl Ritson	076aac59ac	[AMDGPU] Add a new target for gfx1153 (#113138 )	2024-10-23 12:56:58 +09:00
Artem Belevich	30a06e8022	[CUDA] Add support for CUDA-12.6 and sm_100 (#112028 ) This is a copy of #97402(with minor updates), which is now ready to land. --------- Co-authored-by: Sergey Kozub <skozub@nvidia.com>	2024-10-14 11:51:05 -07:00
Youngsuk Kim	29d0a84704	[clang][CGOpenMPRuntimeGPU] Avoid llvm::Type::getPointerTo() (NFC) (#110357 ) `llvm::Type::getPointerTo()` is to be removed soon.	2024-09-28 09:57:20 -04:00
Joseph Huber	e0326b668e	[OpenMP] Map `omp_default_mem_alloc` to global memory (#104790 ) Summary: Currently, we assign this to private memory. This causes failures on some SOLLVE tests. The standard isn't clear on the semantics of this allocation type, but there seems to be a consensus that it's supposed to be shared memory.	2024-08-20 12:00:41 -05:00
Jan Leyonberg	5b15d9c441	[clang][OpenMP] Propoagate debug location to OMPIRBuilder reduction codegen (#100358 ) This patch propagates the debug location from Clang to the OpenMPIRBuilder. Fixes https://github.com/llvm/llvm-project/issues/97458	2024-07-24 08:57:39 -05:00
Jakub Chlanda	ab20086422	[CUDA][NFC] CudaArch to OffloadArch rename (#97028 ) Rename `CudaArch` to `OffloadArch` to better reflect its content and the use. Apply a similar rename to helpers handling the enum.	2024-06-30 07:56:07 +02:00
Kazu Hirata	2f57df5826	[CodeGen] Fix a warning This patch fixes: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1662:8: error: unused variable 'ParallelReduction' [-Werror,-Wunused-variable]	2024-06-26 14:37:06 -07:00
Akash Banerjee	6b1c51bc05	[OpenMP] Migrate GPU Reductions CodeGen from Clang to OMPIRBuilder (#80343 ) This patch migrates the CGOpenMPRuntimeGPU::emitReduction and related functions to the OpenMPIRBUilder. In future patches MLIR OpenMP translation would be making use of these functions. Co-authored-by: Jan Leyonberg <jan.leyonberg@amd.com>	2024-06-26 20:18:38 +01:00
Alex Voicu	9acb533c38	[clang][Driver] Add HIPAMD Driver support for AMDGCN flavoured SPIR-V (#95061 ) This patch augments the HIPAMD driver to allow it to target AMDGCN flavoured SPIR-V compilation. It's mostly straightforward, as we re-use some of the existing SPIRV infra, however there are a few notable additions: - we introduce an `amdgcnspirv` offload arch, rather than relying on using `generic` (this is already fairly overloaded) or simply using `spirv` or `spirv64` (we'll want to use these to denote unflavoured SPIRV, once we bring up that capability) - initially it is won't be possible to mix-in SPIR-V and concrete AMDGPU targets, as it would require some relatively intrusive surgery in the HIPAMD Toolchain and the Driver to deal with two triples (`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively) - in order to retain user provided compiler flags and have them available at JIT time, we rely on embedding the command line via `-fembed-bitcode=marker`, which the bitcode writer had previously not implemented for SPIRV; we only allow it conditionally for AMDGCN flavoured SPIRV, and it is handled correctly by the Translator (it ends up as a string literal) Once the SPIRV BE is no longer experimental we'll switch to using that rather than the translator. There's some additional work that'll come via a separate PR around correctly piping through AMDGCN's implementation of `printf`, for now we merely handle its flags correctly.	2024-06-25 12:19:28 +01:00
Shilei Tian	1ca0055f45	[AMDGPU] Add a new target gfx1152 (#94534 )	2024-06-06 12:16:11 -04:00
Konstantin Zhuravlyov	2bfa26d30f	AMDGPU: Add missing gfx* generic targets handling in clang (NVPTX, OpenMP runtime) (#94483 )	2024-06-05 11:57:17 -04:00
Ahmed Bougacha	3575d23ca8	[clang][CodeGen] Remove unused LValue::getAddress CGF arg. (#92465 ) This is in effect a revert of f139ae3d93797, as we have since gained a more sophisticated way of doing extra IRGen with the addition of RawAddress in #86923.	2024-05-20 10:23:04 -07:00
Erich Keane	39adc8f423	[NFC] Generalize ArraySections to work for OpenACC in the future (#89639 ) OpenACC is going to need an array sections implementation that is a simpler version/more restrictive version of the OpenMP version. This patch moves `OMPArraySectionExpr` to `Expr.h` and renames it `ArraySectionExpr`, then adds an enum to choose between the two. This also fixes a couple of 'drive-by' issues that I discovered on the way, but leaves the OpenACC Sema parts reasonably unimplemented (no semantic analysis implementation), as that will be a followup patch.	2024-04-25 10:22:03 -07:00
Joseph Huber	9e7aab951f	[CUDA] Rename SM_32 to SM_32_ to work around AIX headers (#88779 ) Summary: AIX headers define this, so we need to work around it. In the future this will be removed but for now we should just rename it to avoid these issues.	2024-04-16 07:43:13 -05:00
David Pagan	a12836647e	[OpenMP][CodeGen] Improved codegen for combined loop directives (#87278 ) IR for 'target teams loop' is now dependent on suitability of associated loop-nest. If a loop-nest: - does not contain a function call, or - the -fopenmp-assume-no-nested-parallelism has been specified, - or the call is to an OpenMP API AND - does not contain nested loop bind(parallel) directives then it can be emitted as 'target teams distribute parallel for', which is the current default. Otherwise, it is emitted as 'target teams distribute'. Added debug output indicating how 'target teams loop' was emitted. Flag is -mllvm -debug-only=target-teams-loop-codegen Added LIT tests explicitly verifying 'target teams loop' emitted as a parallel loop and a distribute loop. Updated other 'loop' related tests as needed to reflect change in IR. - These updates account for most of the changed files and additions/deletions.	2024-04-10 13:09:17 -07:00
Akira Hatanaka	84780af4b0	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was reverted because it broke ubsan bots. There seems to be a bug in coroutine code-gen, which is causing EmitTypeCheck to use the wrong alignment. For now, pass alignment zero to EmitTypeCheck so that it can compute the correct alignment based on the passed type (see function EmitCXXMemberOrOperatorMemberCallExpr).	2024-03-28 06:54:36 -07:00
Akira Hatanaka	f75eebab88	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 )" (#86898 ) This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c. The commit broke ubsan bots.	2024-03-27 18:14:04 -07:00
Akira Hatanaka	d9a685a9dd	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit broke msan bots because LValue::IsKnownNonNull was uninitialized.	2024-03-27 12:24:49 -07:00
Akira Hatanaka	b311756450	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 )" (#86674 ) This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6. It appears that the commit broke msan bots.	2024-03-26 07:37:57 -07:00
Akira Hatanaka	8bd1f9116a	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects.	2024-03-25 18:05:42 -07:00
Artem Belevich	631c6e834c	[CUDA] Add support for CUDA-12.3 and sm_90a (#74895 )	2023-12-11 12:18:28 -08:00
Youngsuk Kim	d43c081aef	[clang][CGOpenMPRuntimeGPU] Merge consecutive AddrSpaceCasts (NFC) (#74279 ) Merge consecutive AddrSpaceCasts into a single AddrSpaceCast.	2023-12-04 07:03:09 -05:00
Johannes Doerfert	fae233c63f	[OpenMP] Avoid initializing the KernelLaunchEnvironment if possible (#73864 ) If we don't have a team reduction we don't need a kernel launch environment (for now). In that case we can avoid the cost.	2023-11-29 14:49:13 -08:00
Youngsuk Kim	bc6b632723	[CGOpenMPRuntimeGPU] Remove no-op ptr-to-ptr bitcasts (NFC) Opaque ptr cleanup effort	2023-11-25 11:28:18 -06:00
Jay Foad	cf1e0c0b07	[AMDGPU] Define new targets gfx1200 and gfx1201 (#73133 ) Define target names and ELF numbers for new GFX12 targets gfx1200 and gfx1201. For now they behave identically to GFX11.	2023-11-23 16:44:05 +00:00
Youngsuk Kim	b4db24e330	[CGOpenMPRuntimeGPU] Replace unneeded use of CreatePointerBitCastOrAddrSpaceCast (NFC) Opaque ptr cleanup effort (NFC)	2023-11-18 04:17:46 -06:00
Johannes Doerfert	7318fe6334	[OpenMP][FIX] Ensure device reduction geps work for multi-var reductions If we have more than one reduction variable we need to be consistent wrt. indexing. In 3de645efe30b83ba1b6d7e500486c4f441a17a61 we broke this as the buffer type was reduced to a singleton but the index computation was not adjusted to account for that offset. This fixes it by interleaving the reduction variables properly in a array-of-struct style. We can revert it back to struct-of-array in a follow up if turns out to be a problem. I doubt it since half the accesses should benefit from the locallity this layout offers and only the other half were consecutive before.	2023-11-10 14:34:46 -08:00
Johannes Doerfert	3de645efe3	[OpenMP][NFC] Split the reduction buffer size into two components Before we tracked the size of the teams reduction buffer in order to allocate it at runtime per kernel launch. This patch splits the number into two parts, the size of the reduction data (=all reduction variables) and the (maximal) length of the buffer. This will allow us to allocate less if we need less, e.g., if we have less teams than the maximal length. It also allows us to move code from clangs codegen into the runtime as we now know how large the reduction data is.	2023-11-06 11:50:41 -08:00
Johannes Doerfert	921bd29913	[OpenMP] Remove alignment for global <-> local reduction functions The alignment did likely not help much but increases the memory requirement. Note that half of the affected accesses are all performed by a single thread in each block. The reads are by consecutive threads in a single block.	2023-11-06 11:50:41 -08:00
Johannes Doerfert	abe71b77f9	[OpenMP][NFC] Delete dead code	2023-11-06 11:50:41 -08:00
Vlad Serebrennikov	dda8e3de35	[clang][NFC] Refactor `ImplicitParamDecl::ImplicitParamKind` This patch converts `ImplicitParamDecl::ImplicitParamKind` into a scoped enum at namespace scope, making it eligible for forward declaring. This is useful for `preferred_type` annotations on bit-fields.	2023-11-06 12:01:09 +03:00
Vlad Serebrennikov	edd690b02e	[clang][NFC] Refactor `TagTypeKind` (#71160 ) This patch converts TagTypeKind into scoped enum. Among other benefits, this allows us to forward-declare it where necessary.	2023-11-03 21:45:39 +04:00
Johannes Doerfert	d3e7a48cbd	[OpenMP][NFC] Remove a no-op function	2023-11-03 10:28:36 -07:00
Johannes Doerfert	f9a89e6b9c	[OpenMP][FIX] Allocate per launch memory for GPU team reductions (#70752 ) We used to perform team reduction on global memory allocated in the runtime and by clang. This was racy as multiple instances of a kernel, or different kernels with team reductions, would use the same locations. Since we now have the kernel launch environment, we can allocate dynamic memory per-launch, allowing us to move all the state into a non-racy place. Fixes: https://github.com/llvm/llvm-project/issues/70249	2023-11-01 11:11:48 -07:00
Vlad Serebrennikov	49fd28d960	[clang][NFC] Refactor `ArrayType::ArraySizeModifier` This patch moves `ArraySizeModifier` before `Type` declaration so that it's complete at `ArrayTypeBitfields` declaration. It's also converted to scoped enum along the way.	2023-10-31 18:06:34 +03:00
Johannes Doerfert	31b91213bd	[OpenMP] Unify the min/max thread/teams pathways We used to pass the min/max threads/teams values through different paths from the frontend to the middle end. This simplifies the situation by passing the values once, only when we will create the KernelEnvironment, which contains the values. At that point we also manifest the metadata, as appropriate. Some footguns have also been removed, e.g., our target check is now triple-based, not calling convention-based, as the latter is dependent on the ordering of operations. The types of the values have been unified to int32_t.	2023-10-29 10:53:20 -07:00
Johannes Doerfert	ab34d71087	[OpenMP][NFC] Remove untested code emitting no-op call	2023-10-26 14:38:24 -07:00
Johannes Doerfert	289a0f255d	[OpenMP] Remove SPMD specific handling during globalization Globalization and SPMD are different things that used to be conflated. Some leftover crossover interactions remain, trying to remove them now.	2023-10-26 14:38:23 -07:00
Shilei Tian	d6254e1b2e	Introduce the initial support for OpenMP kernel language (#66844 ) This patch starts the support for OpenMP kernel language, basically to write OpenMP target region in SIMT style, similar to kernel languages such as CUDA. What included in this first patch is the `ompx_bare` clause for `target teams` directive. When `ompx_bare` exists, globalization is disabled such that local variables will not be globalized. The runtime init/deinit function calls will not be emitted. That being said, almost all OpenMP executable directives are not supported in the region, such as parallel, task. This patch doesn't include the Sema checks for that, so the use of them is UB. Simple directives, such as atomic, can be used. We provide a set of APIs (for C, they are prefix with `ompx_`; for C++, they are in `ompx` namespace) to get thread id, block id, etc. Please refer to https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf for more details.	2023-10-05 17:38:06 -04:00
JP Lehr	1bff5f6d0b	Revert "[OpenMP] Introduce the initial support for OpenMP kernel language (#66844 )" This reverts commit e997dca3333823ffe2ea3aea288299f551532dcd.	2023-09-29 15:35:10 -05:00
Shilei Tian	e997dca333	[OpenMP] Introduce the initial support for OpenMP kernel language (#66844 ) This patch starts the support for OpenMP kernel language, basically to write OpenMP target region in SIMT style, similar to kernel languages such as CUDA. What included in this first patch is the `ompx_bare` clause for `target teams` directive. When `ompx_bare` exists, globalization is disabled such that local variables will not be globalized. The runtime init/deinit function calls will not be emitted. That being said, almost all OpenMP executable directives are not supported in the region, such as parallel, task. This patch doesn't include the Sema checks for that, so the use of them is UB. Simple directives, such as atomic, can be used. We provide a set of APIs (for C, they are prefix with `ompx_`; for C++, they are in `ompx` namespace) to get thread id, block id, etc. For more details, you can refer to https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf.	2023-09-29 13:11:09 -04:00
Sergio Afonso	094a63a20b	[OpenMP][OMPIRBuilder] OpenMPIRBuilder support for requires directive This patch updates the `OpenMPIRBuilderConfig` structure to hold all available 'requires' clauses, and it replicates part of the code generation for the 'requires' registration function from clang in the `OMPIRBuilder`, to be used with flang. Porting the rest of features of the clang implementation to the IRBuilder and sharing it between clang and flang remains for a future patch, due to the complexity of the logic selecting the attributes of the generated registration function. Differential Revision: https://reviews.llvm.org/D147217	2023-09-14 10:33:54 +01:00
Aaron Jarmusch	131ba0ae01	Revert "[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483 )" This reverts commit e831a32c93c1ab404785773cc7c08c01730d61e5.	2023-09-12 22:46:09 +00:00
Aaron Jarmusch	e3298bb275	fixup! [Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483 )	2023-09-12 20:52:33 +00:00
Aaron Jarmusch	e831a32c93	[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483 ) Fix for an issue where clang was not adding the address space according to the data layout, instead was using the default which resulted in a crash at times. The fix includes changes to the cases of LargeCapMemAlloc and CGroupMemAlloc where we are setting the AddrSpace according to the DataLayout.	2023-09-12 15:44:39 -04:00

1 2 3 4

172 Commits