llvm-project

Author	SHA1	Message	Date
Daniel Woodworth	ac29405b93	[OpenMPOpt] Fix incorrect end-of-kernel barrier removal (#65670) Barrier removal in OpenMPOpt normally removes barriers by proving that they are redundant with barriers preceding them. However, it can't do this with the "pseudo-barrier" at the end of kernels because that can't be removed. Instead, it removes the barriers preceding the end of the kernel which that end-of-kernel barrier is redundant with. However, these barriers aren't always redundant with the end-of-kernel barrier when loops are involved, and removing them can lead to incorrect results in compiled code. This change fixes this by requiring that these pre-end-of-kernel barriers also have the kernel end as a unique successor before removing them. It also changes the initialization of `ExitED` for kernels since the kernel end is not an aligned barrier.	2023-09-27 09:35:42 -07:00
Shilei Tian	186a4b3b65	[LLVM][OpenMP] Allow OpenMPOpt to handle non-OpenMP target regions (#67075) Current OpenMPOpt assumes all kernels are OpenMP kernels (i.e., those with the "kernel" attribute). This doesn't hold if we mix OpenMP code and CUDA code by linking them together because CUDA kernels are not annotated with the attribute. This patch removes the assumption and adds a new counter for those non-OpenMP kernels. Fix #66687.	2023-09-23 22:34:07 -04:00
Shilei Tian	22e1df7f5b	[LLVM][OpenMPOpt] Fix a crash when associated function is nullptr (#66274) The associated function can be a nullptr if it is an indirect call. This causes a crash in `CheckCallee` which always assumes the callee is a valid pointer. Fix #66904.	2023-09-13 20:22:59 -04:00
Johannes Doerfert	d47cf2bff3	[OpenMPOpt] Allow indirect calls in AAKernelInfoCallSite (#65836) The Attributor has gained support for indirect calls but it is opt-in. This patch makes AAKernelInfoCallSite able to handle multiple potential callees.	2023-09-10 19:02:09 -07:00
Shilei Tian	499f691be1	Revert "Reapply "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)""" This reverts commit c5525a6e8fb7f7c2ce7126ac5b17aaff01ac407f. AMD BB is not happy again.	2023-09-08 15:46:23 -04:00
Shilei Tian	c5525a6e8f	Reapply "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)"" This reverts commit e592c2dcf5b7d2da6c2564f5d9990aa34079bad4 that reverts e91e3cf.	2023-09-08 15:39:16 -04:00
Shilei Tian	e592c2dcf5	Revert "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)" This reverts commit e91e3cf0748a80e1d7219c13fa6a7622321f4936 because AMD BB is not happy with it.	2023-09-07 12:31:11 -04:00
Shilei Tian	e91e3cf074	[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)	2023-09-07 12:23:52 -04:00
Johannes Doerfert	a01398156a	[OpenMPOpt][FIX] Ensure to propagate information about parallel regions Before, we checked the parallel region only once, and ignored updates in the KernelInfo for the parallel region that happened later. This caused us to think nested parallel sections are not present even if they are, among other things.	2023-08-25 10:46:56 -07:00
Johannes Doerfert	8b08287cb3	[OpenMPOpt] Eliminate assumptions only "late" When we remove barriers, we might need to remove llvm.assume assumptions as well. However, doing this early, thus in the module pass, will cause us to miss out on information we might need. There are few situations in which we can eliminate barriers across functions; for now we simply disable elimination of barriers that require assumptions to be removed during the early module pass.	2023-08-23 16:11:43 -07:00
Johannes Doerfert	9c08e76f3e	[Attributor] Introduce AAIndirectCallInfo AAIndirectCallInfo will collect information and specialize indirect call sites. It is similar to our IndirectCallPromotion but runs as part of the Attributor (so with assumed callee information). It also expands more calls and lets the rest of the pipeline figure out what is UB, for now. We use existing call promotion logic to improve the result, otherwise we rely on the (implicit) function pointer cast. This effectively "fixes" #60327 as it will undo the type punning early enough for the inliner to work with the (now specialized, thus direct) call. Fixes: https://github.com/llvm/llvm-project/issues/60327	2023-08-18 16:44:05 -07:00
Johannes Doerfert	97c24a16fd	[OpenMPOpt][NFC] Allow missing wrapper functions for parallel_51 Clang does not create a wrapper function for SPMD kernels. If it does not, we still want to collect the parallel region, even if we have no use for it right now.	2023-08-17 18:33:24 -07:00
Johannes Doerfert	4fcd5f93d6	[OpenMPOpt] Mark more runtime functions as SPMD compatible Fixes: https://github.com/llvm/llvm-project/issues/64421	2023-08-17 18:33:24 -07:00
Johannes Doerfert	2ece6d939b	[OpenMPOpt] SPMD-amenable implies no unknown parallel regions	2023-08-17 18:33:23 -07:00
Johannes Doerfert	dfc821ae89	[OpenMPOpt][FIX] Ensure a dependence for KernelEnvC queries When other AAs query the current value of KernelEnvC via the callback KernelConfigurationSimplifyCB we need to ensure they are now dependent on the AAKernelInfo that is in charge of the KernelEnvC.	2023-08-10 23:16:25 -07:00
Bjorn Pettersson	fd05c34b18	Stop using legacy helpers indicating typed pointer types. NFC Since we no longer support typed LLVM IR pointer types, the code can be simplified into for example using PointerType::get directly instead of using Type::getInt8PtrTy and Type::getInt32PtrTy etc. Differential Revision: https://reviews.llvm.org/D156733	2023-08-02 12:08:37 +02:00
Shilei Tian	10068cd654	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-26 13:35:14 -04:00
Joseph Huber	05b181d851	[OpenMP] Make the nested parallelism global hidden Summary: These will probably be removed with the kernel environment, but they should have hidden visibility so they can be optimized out.	2023-07-24 08:28:54 -05:00
Shilei Tian	6bd74fd65f	Revert commits for kernel environment This reverts commits for kernel environments as they cause issues in AMD BB.	2023-07-23 23:32:31 -04:00
Shilei Tian	c5c8040390	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-23 18:36:01 -04:00
Nikita Popov	7be7f23269	[llvm] Remove uses of getWithSamePointeeType() (NFC)	2023-07-18 12:07:09 +02:00
Johannes Doerfert	02a4fcec6b	[Attributor] Port AANonNull to the isImpliedByIR interface AANonNull is now the first AA that is always queried via the new APIs and not created manually. Others will follow shortly to avoid trivial AAs whenever possible. This commit introduced some helper logic that will make it simpler to port the next one. It also untangles AADereferenceable and AANonNull such that the former does not keep a handle on the latter. Finally, we stop deducing `nonnull` for `undef`, which was incorrect.	2023-07-09 16:04:19 -07:00
Johannes Doerfert	fe12d313ba	[OpenMPOpt][FIX] Propagate IsReachingAlignedBarrier flag through calls	2023-07-07 16:38:34 -07:00
Johannes Doerfert	24656e995a	[OpenMPOpt] The kernel end is not necessarily an aligned barrier A kernel can be exited in a non-aligned fashion, so we cannot pretend it always ends in an aligned barrier. Instead, we require an explicit aligned barrier as we lack a divergence analysis at this point.	2023-07-07 16:38:34 -07:00
Johannes Doerfert	4009f84d2d	[OpenMPOpt] Check for execution with an aligned barrier If the next or last synchronizing instruction was an aligned barrier, the instruction is executed in an aligned region.	2023-07-07 16:38:33 -07:00
Johannes Doerfert	5faa616fe4	[Attributor][NFCI] Remove the (already "unused") ModuleSlice At some point we allowed the CGSCC traversal to look at the entire module slice (see definition below). However, we don't allow that anymore, mostly for compile time and complexity reasons. Consequently, there is no need to build the ModuleSlice as we can replace it with the SCC wherever it was still used.	2023-06-29 23:08:11 -07:00
Johannes Doerfert	e962fa7712	[OpenMPOpt][FIX] Internalization is an IR change too The bots reported that we changed the IR w/o reporting it. The reason was that internalization was not reported as changed. Forwarding the result solves the problem. Test coverage via llvm/test/Transforms/Attributor/reduced/openmp_opt_constant_type_crash.ll	2023-06-29 18:03:47 -07:00
Haojian Wu	4b47c6e018	Fix -Wunused-variable in release build.	2023-06-30 00:02:05 +02:00
Johannes Doerfert	d6fa3b374f	[Attributor] Remove now obsolete initialization code With the helpers in place to judge AAs [1] we can remove the custom rolled initialization checking code. This exposed a minor oversight in the AAMemoryLocation where we did not check the IR before we gave up for a declaration. [1] `d33bca840a`	2023-06-29 13:32:06 -07:00
Johannes Doerfert	d33bca840a	[Attributor] Introduce helpers to judge AAs prior to creation This is a partial cleanup to centralize the initialization and update decisions for AAs, lifting the burden and boilerplate from users and making it harder to accidentally perform unsound deductions. The two static helpers show how we can lift the decisions to generate an AA into the Attributor, avoiding trivial AAs that just cost us compile time and maintenance code (to check for pre-conditions).	2023-06-29 12:32:45 -07:00
Johannes Doerfert	21c0d6bff9	[OpenMPOpt] Properly check AA pointers The interface was changed to return pointers, so we need to check them for null now (as they might actually be null in the future).	2023-06-29 09:18:36 -07:00
Elliot Goodrich	b0abd4893f	[llvm] Add missing StringExtras.h includes In preparation for removing the `#include "llvm/ADT/StringExtras.h"` from the header to source file of `llvm/Support/Error.h`, first add in all the missing includes that were previously included transitively through this header.	2023-06-25 15:42:22 +01:00
Johannes Doerfert	e9fc399db3	[Attributor][NFCI] Use pointers to pass around AAs This will make it easier to create less trivial AAs in the future as we can simply return `nullptr` rather than an AA with an invalid state.	2023-06-23 17:21:20 -07:00
Johannes Doerfert	cb17c48fdd	[Attributor] Identify and remove no-op fences The logic and implementation follows the removal of no-op barriers. If the fence is not making updates visible, either to the world or the current thread, it is not needed. Said differently, the fences we remove do not establish synchronization (happens-before) edges. This allows us to eliminate some of the regression caused by: https://reviews.llvm.org/D145290	2023-06-05 17:14:00 -07:00
Johannes Doerfert	8f4fadd1b4	[OpenMP] Use "kernel" attribute consistently	2023-06-05 16:33:53 -07:00
Shilei Tian	d4ecd1241c	Revert "[OpenMP] Introduce kernel environment" This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9. It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll` - `mapping/declare_mapper_target_data.cpp` on AMDGPU	2023-04-22 20:56:35 -04:00
Shilei Tian	35cfadfbe2	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-04-22 20:46:38 -04:00
Bjorn Pettersson	a20f7efbc5	Remove several no longer needed includes. NFCI Mostly removing includes of InitializePasses.h and Pass.h in passes that no longer have support for the legacy PM.	2023-04-17 13:54:19 +02:00
Joseph Huber	46ee1021d9	[OpenMP] Replace HeapToShared's initial value with `poison` There's a desire to move away from `undef` in LLVM. Currently we want to have the `addressspace(3)` variables use `poison` instead. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D147719	2023-04-14 09:39:32 -05:00
Johannes Doerfert	94d14536a9	[OpenMP][FIX] More AAExecutionDomain fixes We missed certain updates, mostly to call site information, and dependent AAs did not get recomputed. We also did not properly distinguish and propagate incoming and outgoing information of call sites. The runtime tests pass now, I'll add a proper test for AAExecutionDomain soon that covers all the cases and ensures we haven't forgotten more updates. To help unblock some apps, I'll put the fix first.	2023-03-27 21:36:21 -07:00
Johannes Doerfert	3a7cb3d45a	[OpenMP] Adjust generic state machine simplification CB This callback caused us to potentially miss out on call edges if we were expecting a custom state machine since the custom state machine was not created but the workers also did not enter the generic one. I have not observed an issue and don't know how to create a test for sure, but it is safer to err on the conservative side for now.	2023-03-27 21:30:23 -07:00
Johannes Doerfert	7f7e1749c5	[OpenMP] Be smarter about the insertion point for deduplication We can use dominance and avoid the special handling of kernels and prevent inserting code before allocas accidentally (as happened in the runtime test).	2023-03-27 21:30:23 -07:00
Kazu Hirata	b9c4b95b11	[llvm] Use ConstantInt::{isZero,isOne} (NFC)	2023-03-21 17:40:35 -07:00
Ishaan Gandhi	aead502b11	[Attributor] Add convergent abstract attribute This patch adds the AANonConvergent abstract attribute. It removes the convergent attribute from functions that only call non-convergent functions. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D143228	2023-03-20 22:33:50 -07:00
Johannes Doerfert	8f47fd05d5	[OpenMPOpt][FIX] Avoid removing barriers in callees We could be smarter about this, e.g., if the callee has a single call site, but for now we first avoid the miscompile.	2023-03-20 17:44:24 -07:00
Johannes Doerfert	b89558a2ae	[OpenMP][FIX] Properly track and lookup Execution Domains This is a two part fix. First, we need two Execution Domains (ED) to track the values of a function. One for incoming values and one for outgoing values. This was conflated before. Second, at the function entry we need to look at the incoming information from call sites not iterate over non-existing predecessors.	2023-03-20 17:44:24 -07:00
Johannes Doerfert	578d507359	[OpenMP][FIX] Ensure to determine aligned regions properly There were missing checks in the aligned region code, copy-paste errors (= usage of the IsReachedFromAlignedBarrierOnly value instead of IsReachingAlignedBarrierOnly value on the forward pass), and a missing update of the call state for sync declarations and definitions. Partially fixes https://github.com/llvm/llvm-project/issues/60425	2023-02-02 02:28:10 -08:00
Johannes Doerfert	18a2975b57	[Attributor][FIX] Ensure we use the right AAExecutionDomain Before we might have ended up querying the AAExecutionDomain of a different function, which resulted in wrong optimistic results. Partially fixes https://github.com/llvm/llvm-project/issues/60425	2023-02-02 02:27:54 -08:00
Guillaume Chatelet	ffc1205bde	[reland][NFC] Transition GlobalObject alignment from MaybeAlign to Align This is a follow up on https://reviews.llvm.org/D142459#4081179. This first patch adds an overload to `GlobalObject::setAlignment` that accepts an `Align` type. This already handles most of the calls. This patch also converts a few call sites to the new type when this is safe. Here is the list of the remaining call sites (as of e195e6bad6): clang/lib/CodeGen/CodeGenModule.cpp:1688, llvm/lib/AsmParser/LLParser.cpp:1309, llvm/lib/AsmParser/LLParser.cpp:6050, llvm/lib/Bitcode/Reader/BitcodeReader.cpp:3871, llvm/lib/Bitcode/Reader/BitcodeReader.cpp:4030, llvm/lib/IR/Core.cpp:2018, llvm/lib/IR/Globals.cpp:141, llvm/lib/Linker/IRMover.cpp:660, llvm/lib/Linker/LinkModules.cpp:361, llvm/lib/Linker/LinkModules.cpp:362, llvm/lib/Transforms/IPO/MergeFunctions.cpp:782, llvm/lib/Transforms/IPO/MergeFunctions.cpp:840, llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:1813, llvm/tools/llvm-reduce/deltas/ReduceGlobalObjects.cpp:27. Differential Revision: https://reviews.llvm.org/D142708	2023-01-31 15:09:10 +00:00
Guillaume Chatelet	e098ee726e	Revert D142708 "[NFC] Transition GlobalObject alignment from MaybeAlign to Align" This is breaking the build bots. e.g., https://lab.llvm.org/buildbot/#/builders/121/builds/27549 This reverts commit 6717efe74da825214cb4d307ad35e5fbda353301.	2023-01-31 14:12:51 +00:00

324 Commits