llvm-project

Author	SHA1	Message	Date
CHANDRA GHALE	afbcf9529a	[OpenMP 6.0] Codegen for Reduction over private variables with reduction clause (#134709)
Codegen support for reduction over private variable with reduction clause. Section 7.6.10 in OpenMP 6.0 spec.
- An internal shared copy is initialized with an initializer value.
- The shared copy is updated by combining its value with the values from the private copies created by the clause.
- Once an encountering thread verifies that all updates are complete, its original list item is updated by merging its value with that of the shared copy and then broadcast to all threads.
Sample Test Case from OpenMP 6.0 Example
```
#include <assert.h>
#include <omp.h>

#define N 10

void do_red(int n, int *v, int &sum_v) {
  sum_v = 0; // sum_v is private
#pragma omp for reduction(original(private),+: sum_v)
  for (int i = 0; i < n; i++) {
    sum_v += v[i];
  }
}

int main(void) {
  int v[N];
  for (int i = 0; i < N; i++)
    v[i] = i;
#pragma omp parallel num_threads(4)
  {
    int s_v; // s_v is private
    do_red(N, v, s_v);
    assert(s_v == 45);
  }
  return 0;
}
```
Expected Codegen:
```
// A shared global/static variable is introduced for the reduction result.
// This variable is initialized (e.g., using memset or a UDR initializer)
// e.g., .omp.reduction.internal_private_var

// Barrier before any thread performs combination
call void @__kmpc_barrier(...)

// Initialization block (executed by thread 0)
// e.g., call void @llvm.memset.p0.i64(...) or call @udr_initializer(...)

call void @__kmpc_critical(...)
// Inside critical section:
// Load the current value from the shared variable
// Load the thread-local private variable's value
// Perform the reduction operation
// Store the result back to the shared variable
call void @__kmpc_end_critical(...)

// Barrier after all threads complete their combinations
call void @__kmpc_barrier(...)

// Broadcast phase:
// Load the final result from the shared variable
// Store the final result to the original private variable in each thread

// Final barrier after broadcast
call void @__kmpc_barrier(...)
```
---------
Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-06-11 14:01:31 +05:30
Mats Jun Larsen	94fad11307	[CodeGen] Replace PointerType::getUnqual(Type) with opaque pointer version (NFC) (#128711) Follow-up to #123569	2025-03-03 23:00:32 +01:00
Sergio Afonso	27bc6bdaba	[OMPIRBuilder] Introduce struct to hold default kernel teams/threads (#116050 ) This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure used to simplify passing default and constant values for number of teams and threads, and possibly other target kernel-related information in the future. This is used to forward values passed to `createTarget` to `createTargetInit`, which previously used a default unrelated set of values.	2025-01-14 11:08:55 +00:00
Akash Banerjee	6f0e9c4a56	[OpenMP][Clang] Migrate OpenMP UserDefinedMapper from Clang to OMPIRBuilder (#110001 ) This patch migrates the OpenMP UserDefinedMapper codegen from Clang to the OpenMPIRBuilder. I will be adding further patches in the near future so that OpenMP dialect in MLIR can make use of these.	2024-12-18 15:02:14 +00:00
CHANDRA GHALE	76e6c8d3fc	Codegen changes for strict modifier with grainsize/num_tasks of taskloop construct (#117196) Initial parsing/sema for the 'strict' modifier with the 'num_tasks' and 'grainsize' clauses is present in these commits: [grainsize_parsing](`ab9eac762c`) and [num_tasks_parsing](`56c1660170 (diff-4184486638e85284c3a2c961a81e7752231022daf97e411007c13a6732b50db9R6545)`). However, this implementation appears incomplete as it lacks code generation support. A runtime patch was introduced in this runtime commit [runtime_patch](`540007b427 (diff-5e95f9319910d6965d09c301359dbe6b23f3eef5ce4d262ef2c2d2137875b5c4R374)`), which adds a new API, __kmpc_taskloop_5, to accommodate the strict modifier. In this patch I have added codegen support. When the strict modifier is present alongside the grainsize or num_tasks clauses of the taskloop construct, the code now emits a call to __kmpc_taskloop_5, which includes an additional parameter of type i32 with the value 1 to indicate the strict modifier. If the strict modifier is not present, it falls back to the existing __kmpc_taskloop API call. --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2024-11-28 14:18:59 +05:30
Jay Foad	4dd55c567a	[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399 ) Follow up to #109133.	2024-10-24 10:23:40 +01:00
Gheorghe-Teodor Bercea	1a478a69bc	[OpenMP][offload] Fix dynamic schedule tracking (#97065 ) This patch fixes the dynamic schedule tracking.	2024-07-01 10:23:11 -04:00
Akira Hatanaka	84780af4b0	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was reverted because it broke ubsan bots. There seems to be a bug in coroutine code-gen, which is causing EmitTypeCheck to use the wrong alignment. For now, pass alignment zero to EmitTypeCheck so that it can compute the correct alignment based on the passed type (see function EmitCXXMemberOrOperatorMemberCallExpr).	2024-03-28 06:54:36 -07:00
Akira Hatanaka	f75eebab88	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 )" (#86898 ) This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c. The commit broke ubsan bots.	2024-03-27 18:14:04 -07:00
Akira Hatanaka	d9a685a9dd	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit broke msan bots because LValue::IsKnownNonNull was uninitialized.	2024-03-27 12:24:49 -07:00
Akira Hatanaka	b311756450	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 )" (#86674 ) This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6. It appears that the commit broke msan bots.	2024-03-26 07:37:57 -07:00
Akira Hatanaka	8bd1f9116a	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects.	2024-03-25 18:05:42 -07:00
Joseph Huber	cc374d8056	[OpenMP] Remove `register_requires` global constructor (#80460) Summary: Currently, OpenMP handles the `omp requires` clause by emitting a global constructor into the runtime for every translation unit that requires it. However, this is not a great solution because it prevents us from having a defined order in which the runtime is accessed and used. This patch changes the approach to no longer use global constructors, but to instead group the flag with the other offloading entries that we already handle. This has the effect of still registering each flag per requires TU, but now we have a single constructor that handles everything. This patch removes support for the old `__tgt_register_requires` and replaces it with a warning message. We just had a recent release, and the OpenMP policy for the past four releases since we switched to LLVM is that we do not provide strict backwards compatibility between major LLVM releases now that the library is versioned. This means that a user will need to recompile if they have an old binary that relied on `register_requires` having the old behavior. It is important that we actively deprecate this, as otherwise it would not solve the problem of having no defined init and shutdown order for `libomptarget`. The problem of `libomptarget` not having a defined init and shutdown order cascades into a lot of other issues so I have a strong incentive to be rid of it. It is worth noting that the current `__tgt_offload_entry` only has space for a 32-bit integer here. I am planning to overhaul these at some point as well.	2024-02-21 11:33:32 -06:00
Joseph Huber	237adfca4e	[OpenMP] Rework handling of global ctor/dtors in OpenMP (#71739) Summary: This patch reworks how we handle global constructors in OpenMP. Previously, we emitted individual kernels that were all registered and called individually. In order to provide more generic support, this patch moves all handling of this to the target backend and the runtime plugin. This has the benefit of supporting the GNU extensions for constructors and destructors, removing a class of failures related to shared library destruction order, and allowing targets other than OpenMP to use the same support without needing to change the frontend. This is primarily done by calling kernels that the backend emits to iterate a list of ctor / dtor functions. For x64, this is automatic and we get it for free with the standard `dlopen` handling. For AMDGPU, we emit `amdgcn.device.init` and `amdgcn.device.fini` functions which handle everything automatically and simply need to be called. For NVPTX, a patch https://github.com/llvm/llvm-project/pull/71549 provides the kernels to call, but the runtime needs to set up the array manually by pulling out all the known constructor / destructor functions. One concession that this patch requires is the change that for GPU targets in OpenMP offloading we will use `llvm.global_dtors` instead of using `atexit`. This is because `atexit` is a separate runtime function that does not mesh well with the handling we're trying to do here. This should be equivalent in all cases except for cases where we would need to destruct manually such as: ``` struct S { ~S() { foo(); } }; void foo() { static S s; } ``` However this is broken in many other ways on the GPU, so it is not regressing any support, simply increasing the scope of what we can handle. This changes the handling of ctors / dtors. This patch now outputs an informational message regarding the deprecation if the old format is used. This will be completely removed in a later release. Depends on: https://github.com/llvm/llvm-project/pull/71549	2023-11-10 14:53:53 -06:00
Johannes Doerfert	31b91213bd	[OpenMP] Unify the min/max thread/teams pathways We used to pass the min/max threads/teams values through different paths from the frontend to the middle end. This simplifies the situation by passing the values once, only when we will create the KernelEnvironment, which contains the values. At that point we also manifest the metadata, as appropriate. Some footguns have also been removed, e.g., our target check is now triple-based, not calling convention-based, as the latter is dependent on the ordering of operations. The types of the values have been unified to int32_t.	2023-10-29 10:53:20 -07:00
Johannes Doerfert	0ba57c8bba	[OpenMP] Pass min/max thread and team count to the OMPIRBuilder (#70247) We now provide the information about the min/max thread and team count to the OMPIRBuilder, no matter what the source was. That means we unify `thread_limit`, `num_teams`, `num_threads` handling with the target specific attributes (`__launch_bounds__` and `amdgpu_flat_work_group_size`). This is in preparation to pass the values to the runtime, and to allow the middle-end (OpenMP-opt) to tighten the values if it seems appropriate. There is no "real" change after this commit.	2023-10-26 14:45:07 -07:00
Sandeep Kosuri	08bbff4aad	[OpenMP] Codegen support for thread_limit on target directive for host offloading - This patch adds support for thread_limit clause on target directive according to OpenMP 5.1 [2.14.5] - The idea is to create an outer task for target region, when there is a thread_limit clause, and manipulate the thread_limit of task instead. This way, thread_limit will be applied to all the relevant constructs enclosed by the target region. Differential Revision: https://reviews.llvm.org/D152054	2023-08-26 22:18:49 -05:00
Joseph Huber	9da61aed75	[OpenMP] Emit offloading entries for indirect target variables OpenMP 5.1 allows emission of the `indirect` clause on declare target functions, see https://www.openmp.org/spec-html/5.1/openmpsu70.html#x98-1080002.14.7. The intended use of this is to permit calling device functions via their associated host pointer. In order to do this the first step will be building a map associating these variables. Doing this will require the same offloading entry handling we use for other kernels and globals. We intentionally emit a new global on the device side. Although it's possible to look up the device function's address directly, this would require changing the visibility and would prevent us from making static functions indirect. Also, the CUDA toolchain will optimize out unused functions and using a global prevents that. The downside is that the runtime will need to read the global and copy its value, but there shouldn't be any other costs. Note that this patch just performs the codegen, currently this new offloading entry type is unused and will be ignored by the runtime. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D157738	2023-08-24 18:21:13 -05:00
Johannes Doerfert	c5488c8dcc	[OpenMP] Properly set static thread limit (w/o analysis) We used to have two separate implementations to derive the number of threads used in a target region. This led us to sometimes miss out on user provided thread bounds (num_threads, or thread_limit) when we looked for "constant default values". If we might miss out on the presence of those bounds, we cannot set the thread_limit statically since the runtime will try to honor user input rather than cap it at the "preferred default". This patch replaces the secondary implementation with the primary in a mode that will not emit code but just look for the presence, and potentially upper bounds, of thread limiting clauses. The runtime test would not pass without this rewrite as we missed some clauses, set the static limit on the device to the preferred value, but then violated that value at runtime. Fixes: https://github.com/llvm/llvm-project/issues/64845 Differential Revision: https://reviews.llvm.org/D158381	2023-08-23 11:12:03 -07:00
Akash Banerjee	5d9ccd7a96	[OpenMP] Migrate dispatch related utility functions from Clang codegen to OMPIRBuilder Migrate createForStaticInitFunction, createDispatchInitFunction, createDispatchNextFunction and createDispatchFiniFunction from Clang CodeGen to OMPIRBuilder. Differential Revision: https://reviews.llvm.org/D157994	2023-08-16 16:35:28 +01:00
Akash Banerjee	227012cbd7	[OpenMP] Migrate device code privatisation from Clang CodeGen to OMPIRBuilder This patch migrates the UseDevicePtr and UseDeviceAddr clause related code for handling privatisation from Clang codegen to the OMPIRBuilder Depends on D150860 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D152554	2023-07-12 12:03:28 +01:00
Sergio Afonso	63ca93c7d1	[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes `IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to `-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed to `omp.is_target_device`. Getters and setters of all these renamed properties are also updated accordingly. Many unit tests have been updated to use the new names, but an alias for the `-fopenmp-is-device` option is created so that external programs do not stop working after the name change. `IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the `-fopenmp-is-target-device` compiler frontend option, which is only added to the OpenMP device invocation for offloading-enabled programs. Differential Revision: https://reviews.llvm.org/D154591	2023-07-10 14:14:16 +01:00
Jennifer Yu	f70967fdc4	[OPENMP52] Support omp_cur_iteration modifier for doacross clause. This is just syntax to make it easier for the user; it doesn't add any new functionality. `doacross(sink: omp_cur_iteration - 1)` is equivalent to `doacross(sink: CounterVar - 1, ...)`, and `doacross(source: omp_cur_iteration)` is equivalent to `doacross(source)`. The restriction (OMP5.2 p.327) is: If vector is specified with the omp_cur_iteration keyword and with sink as the dependence-type then it must be omp_cur_iteration - 1. If vector is specified with source as the dependence-type then it must be omp_cur_iteration. Differential Revision: https://reviews.llvm.org/D154556	2023-07-06 11:40:02 -07:00
Doru Bercea	13888870e5	Enable dynamic-sized VLAs for data sharing in OpenMP offloaded target regions. Review: https://reviews.llvm.org/D153883	2023-07-06 10:57:10 -04:00
Jennifer Yu	35041a435d	[OPENMP52] Codegen support for doacross clause. Differential Revision: https://reviews.llvm.org/D154180	2023-07-03 15:24:05 -07:00
Jan Sjodin	ac65cc1215	[OpenMP][OpenMPIRBuilder] Migrate kernel launch code and host fallback code generation from Clang to the OpenMPIRBuilder This patch refactors the code generation that emits the offloading kernel launch and moves the core portion to the OpenMPIRBuilder so that it can be used from flang in the future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D151035	2023-06-30 10:40:21 -04:00
Manna, Soumi	5e12f5ab2d	[CLANG] Fix uninitialized scalar field issues Reviewed By: erichkeane, steakhal, tahonermann, shafik Differential Revision: https://reviews.llvm.org/D150744	2023-06-22 12:09:14 -07:00
Andrew Gozillon	48c3ae5cc3	[Clang][Flang][OpenMP] Add loadOffloadInfoMetadata and createOffloadEntriesAndInfoMetadata into OMPIRBuilder's finalize and initialize This allows the generation of OpenMP offload metadata for the OpenMP dialect when lowering to LLVM-IR and moves some of the shared logic between the OpenMP Dialect and Clang into the IRBuilder. Reviewers: jsjodin, jdoerfert, kiranchandramohan Differential Revision: https://reviews.llvm.org/D148370	2023-05-16 11:51:36 -05:00
Itay Bookstein	782c59a4ee	[OpenMP] Prefix outlined and reduction func names with original func's name This patch prefixes omp outlined helpers and reduction funcs with the original function's name. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140722	2023-04-19 23:00:26 +03:00
Itay Bookstein	6fdd13e0ec	Revert "[OpenMP] Prefix outlined and reduction func names with original func's name" This reverts commit 029bfc311d4d7d3cd90be81bb08c046848796d02.	2023-04-19 19:08:49 +03:00
Itay Bookstein	029bfc311d	[OpenMP] Prefix outlined and reduction func names with original func's name This patch attempts to prefix omp outlined helpers and reduction funcs with the original function's name. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140722	2023-04-19 19:05:21 +03:00
Jan Sjodin	85faee6992	[OpenMP][OMPIRBuilder] Make OffloadEntriesInfoManager a member of OpenMPIRBuilder This patch adds the OffloadEntriesInfoManager to the OpenMPIRBuilder, and allows the OffloadEntriesInfoManager to access the Configuration in the OpenMPIRBuilder. With the shared Config there is no risk for inconsistencies, and there is no longer the need for clang to have a separate OffloadEntriesInfoManager. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D146549	2023-03-23 11:46:28 -04:00
Sunil Kuravinakop	e9babe7571	[OpenMP] Clang Support for taskwait nowait clause Support for taskwait nowait clause with placeholder for runtime changes. Reviewed By: cchen, ABataev Differential Revision: https://reviews.llvm.org/D131830	2022-12-20 12:13:56 -06:00
Chi Chun Chen	e0fd86db09	Revert "[OpenMP] Clang Support for taskwait nowait clause" This reverts commit 100dfe7a8ad3789a98df623482b88d9a3a02e176.	2022-12-09 11:06:45 -06:00
Jennifer Yu	af781f7042	[OPENMP51]Codegen for error directive. Added codegen for `omp error` directive. This is to generate IR to call: void __kmpc_error(ident_t *loc, int severity, const char *message); Differential Revision: https://reviews.llvm.org/D139166	2022-12-08 13:07:08 -08:00
Sunil K	100dfe7a8a	[OpenMP] Clang Support for taskwait nowait clause Support for taskwait nowait clause with placeholder for runtime changes. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D131830	2022-12-08 12:40:44 -08:00
Kazu Hirata	bb666c6930	[CodeGen] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-03 11:13:43 -08:00
Jan Sjodin	2aa338f68e	[OpenMP][OMPIRBuilder] Migrate getName from clang to OMPIRBuilder This change moves the getName function from clang and moves the separator class members from CGOpenMPRuntime into OMPIRBuilder. Also ensure all the getters in the config class are const. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D137725	2022-11-24 10:11:13 -05:00
Jan Sjodin	969d787a47	[OpenMP][OMPIRBuilder] Add a configuration class that captures flags that affect codegen This patch introduces the OpenMPIRBuilderConfig class which contains various flags that are needed to lower OMP constructs to LLVM-IR. The purpose is to keep the flags in one place so they do not have to be passed in every time. The flags can be set optionally since some use cases don't rely on functions that depend on these flags. Reviewed By: jdoerfert, tschuett Differential Revision: https://reviews.llvm.org/D138220	2022-11-22 09:25:04 -05:00
Akash Banerjee	87f652d31f	Migrate getOrCreateInternalVariable from Clang to OMPIRBuilder. This patch removes getOrCreateInternalVariable from Clang OMP CodeGen and replaces its uses with OMPBuilder::getOrCreateInternalVariable. Also refactors OMPBuilder::getOrCreateInternalVariable to change type of name from Twine to StringRef Differential Revision: https://reviews.llvm.org/D137720	2022-11-14 17:18:10 +00:00
Rageking8	94738a5ac3	Fix duplicate word typos; NFC This revision fixes typos where there are 2 consecutive words which are duplicated. There should be no code changes in this revision (only changes to comments and docs). Do let me know if there are any undesirable changes in this revision. Thanks.	2022-11-08 07:21:23 -05:00
Jan Sjodin	9ea2b150b5	[OpenMP][OMPIRBuilder] Migrate createOffloadEntriesAndInfoMetadata from clang to OpenMPIRBuilder This patch moves createOffloadEntriesAndInfoMetadata, along with the createOffloadEntry helper function, to OpenMPIRBuilder. The clang specific error handling is invoked using a callback. This code will also be used by flang in the future.	2022-11-03 10:27:44 -04:00
Jan Sjodin	dd3d8ddb5f	[OpenMP][OpenMPIRBuilder] Migrate OffloadEntriesInfoManager from clang to OMPIRbuilder This patch moves the implementation of the OffloadEntriesInfoManager to the OMPIRbuilder. This class will later be used by flang as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D135786	2022-10-16 08:32:40 -04:00
Jan Sjodin	4627cef113	[OpenMP][OMPIRBuilder] Migrate emitOffloadingArraysArgument from clang This patch moves the emitOffloadingArraysArgument function and supporting data structures to OpenMPIRBuilder. This will later be used in flang as well. The TargetDataInfo class was split up into generic information and clang-specific data, which remain in clang. Further migration will be done in the future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D134662	2022-10-07 07:03:03 -05:00
Dhruva Chakrabarti	839ac62c50	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit 7539e9cf811e590d9f12ae39673ca789e26386b4.	2022-09-15 03:08:46 +00:00
Giorgis Georgakoudis	7539e9cf81	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6, ABataev Differential Revision: https://reviews.llvm.org/D102107	2022-09-15 00:54:05 +00:00
Kazu Hirata	8b1b0d1d81	Revert "Use std::is_same_v instead of std::is_same (NFC)" This reverts commit c5da37e42d388947a40654b7011f2a820ec51601. This patch seems to break builds with some versions of MSVC.	2022-08-20 23:00:39 -07:00
Kazu Hirata	c5da37e42d	Use std::is_same_v instead of std::is_same (NFC)	2022-08-20 22:36:26 -07:00
Joseph Huber	5300263c70	[OpenMP] Add loop tripcount argument to kernel launch and remove push function Previously we added the `push_target_tripcount` function to send the loop tripcount to the device runtime so we knew how to configure the teams / threads to execute the loop for a teams distribute construct. This was implemented as a separate function mostly to avoid changing the interface for backwards compatibility. Now that we've changed it anyway and the new interface can take an arbitrary number of arguments via the struct without changing the ABI, we can move this to the new interface. This will simplify the runtime by removing unnecessary state between calls. Depends on D128550 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D128816	2022-07-08 14:44:16 -04:00
Joseph Huber	643c9b22ef	[OpenMP] Make generating offloading entries more generic This patch moves the logic for generating the offloading entries to the OpenMPIRBuilder. This makes it easier to re-use in other places, such as for OpenMP support in Flang or using the same method for generating offloading entries for other languages like CUDA. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D123460	2022-04-29 09:14:31 -04:00
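
The 7539e9cf81 entry above describes passing a parallel region's captured variables as one aggregate instead of forwarding each capture as a separate argument through a variadic fork call (the log shows that change was later reverted in 839ac62c50). The following is a minimal C++ sketch of that calling shape only. All names in it (`Captures`, `outlined_region`, `fake_fork_call`, `sum_with_aggregate`) are hypothetical stand-ins chosen for illustration; none of them is part of the OpenMP runtime or of Clang's generated code, and the "fork" here simply calls the function once on the current thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical capture record: one field per variable the region uses.
// In the scheme described in the commit message, a struct like this
// replaces a variable-length list of pointer arguments.
struct Captures {
  const int *v;   // captured input array
  std::size_t n;  // captured element count
  long *sum;      // captured output location
};

// Hypothetical stand-in for an outlined region body. It receives a single
// opaque pointer and unpacks the captures itself.
static void outlined_region(void *raw) {
  assert(raw != nullptr);
  auto *c = static_cast<Captures *>(raw);
  long acc = 0;
  for (std::size_t i = 0; i < c->n; ++i)
    acc += c->v[i];
  *c->sum = acc;
}

// Hypothetical stand-in for a non-variadic fork entry point: the signature
// stays fixed no matter how many variables the region captures.
static void fake_fork_call(void (*fn)(void *), void *captures) {
  fn(captures);  // a real runtime would run this on a team of threads
}

// Caller side: pack the captures into the aggregate and pass one pointer.
long sum_with_aggregate(const int *v, std::size_t n) {
  long sum = 0;
  Captures c{v, n, &sum};
  fake_fork_call(outlined_region, &c);
  return sum;
}
```

With this shape, adding another captured variable means adding a field to `Captures`; the signature of `fake_fork_call` does not change, which is the property the commit message attributes to the aggregate approach.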


306 Commits