llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	cefa5cefdc	[OpenMP] Replace ExternalizationRAII with virtual uses The externalization was always a stopgap solution. One of the drawbacks is that it is very conservative no matter if we actually require the functions at the end of the pass. The new concept is more generic and properly integrates into the dependence graph. Whenever we might need a function, it has a "virtual use" that cannot be analyzed. If we do not because of some AA state, there will be a dependence to ensure state changes trigger revisits of uses, including a potentially new virtual use.	2023-01-12 00:14:06 -08:00
Johannes Doerfert	cddcbfae14	[OpenMP][FIX] Avoid performance regression accidentally introduced	2023-01-11 00:58:34 -08:00
Johannes Doerfert	b2a8d2c69b	[OpenMP] Avoid running openmp-opt on dead functions The Attributor has logic to run only on assumed live functions and this is exposed to users now. OpenMP-opt will (mostly) ignore dead internal functions now but run the same deduction as before if an internal function is marked live. This should lower compile time as we run on less code and delete more code early on. For the full OpenMC module compiled with noinline and JITed at runtime, we save ~25%, or ~10s on my machine during JITing.	2023-01-10 15:03:51 -08:00
Johannes Doerfert	c3de9c1c7b	[OpenMP] Ensure AAHeapToShared is only looking at one function When we collect and process allocations we did not verify the call against the anchor scope / associated function. This should be done to avoid processing calls multiple times and generally looking at calls not in the AAs scope.	2023-01-10 15:03:51 -08:00
Johannes Doerfert	d1033e3cad	[OpenMP] Disable ICV deduction by default. This is not tested well and needs to be revisited in the future.	2023-01-10 15:03:51 -08:00
Johannes Doerfert	22c898dbfd	[OpenMP] Use Attributor to find underlying objects of stores When we see a store in generic mode we need to decide if we should guard it for SPMDzation. This patch changes the getUnderlyingObjects call to the more optimistic getAssumedUnderlyingObjects call to identify more thread local pointers.	2023-01-09 23:34:52 -08:00
Rafael A Herrera Guaitero	13b909ef27	OpenMPOpt: Check nested parallelism in target region Analysis that determines if a parallel region can reach another parallel region in any target region of the TU. A new global var is emitted with the name of the kernel + "_nested_parallelism", which is either 0 or 1 depending on the result. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D141010	2023-01-09 15:55:30 -06:00
Matt Arsenault	2e7640e6dc	OpenMPOpt: Fix null dereference on missing declaration cache Found by llvm-reduce fuzzing.	2023-01-03 16:26:37 -05:00
Matt Arsenault	c3054aeb5a	OpenMPOpt: Fix using wrong address space for alloca Using the function's address space makes no sense. Copied from the existing test, with more addrspace variation. Could just replace the existing one with this version if it's redundant.	2023-01-03 16:26:37 -05:00
Matt Arsenault	a7425e299e	OpenMPOpt: Use getFnAttributeAsParsedInteger	2023-01-03 11:40:42 -05:00
Matt Arsenault	a87de3a6dc	OpenMPOpt: Fix introducing empty nvvm.annotations into module	2023-01-03 10:32:10 -05:00
Joseph Huber	7ae3db66e8	[OpenMP] Fix leftover use of removed function Summary: Didn't notice this one floating around as it was still cached somewhere. Delete it.	2022-12-20 13:43:00 -06:00
Joseph Huber	bb4c6e7a06	[OpenMP] Remove folding logic for removed runtime function This function was removed from the device runtime at some point but we still have specialized code for it and an entry in the runtime kinds. Remove it as it is no longer necessary. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D140402	2022-12-20 13:37:38 -06:00
Johannes Doerfert	d4f3d8212a	[OpenMP][FIX] Ensure to inline `ompx::` functions after the rename in D140334	2022-12-19 16:41:49 -08:00
Fangrui Song	21c4dc7997	std::optional::value => operator*/operator-> value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). This fixes clang.	2022-12-17 00:42:05 +00:00
Johannes Doerfert	dde21c1983	[OpenMP][FIX] Remove accidental and somewhat random change	2022-12-13 19:38:15 -08:00
Johannes Doerfert	90609fb68f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime Differential Revision: https://reviews.llvm.org/D136903	2022-12-13 18:44:19 -08:00
Johannes Doerfert	f9c29878b0	Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime" This reverts commit c1c8cbbf5f29257d084a23a2f6c4236c40b7afb9. One of the tests seems to be flaky/non-deterministic.	2022-12-12 22:08:28 -08:00
Johannes Doerfert	c1c8cbbf5f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime	2022-12-12 20:55:36 -08:00
Johannes Doerfert	2dd158d655	[OpenMP] Make barrier elimination work in the presence of llvm.assume Assumptions are droppable and eliminating them to eliminate barriers seems reasonable.	2022-12-07 22:37:57 -08:00
Kazu Hirata	1f421b6d7e	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-06 22:45:17 -08:00
Fangrui Song	75801e3b45	Transforms/IPO: llvm::Optional => std::optional	2022-12-05 07:07:19 +00:00
Kazu Hirata	3c09ed006a	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 17:12:44 -08:00
Kazu Hirata	343de6856e	[Transforms] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 21:11:37 -08:00
Krzysztof Parzyszek	467432899b	MemoryLocation: convert Optional to std::optional	2022-12-01 15:36:20 -08:00
LiaoChunyu	2c2c9688f0	[OpenMP][LegacyPM] Remove OpenMPOptCGSCCLegacyPass Using the legacy pass manager for the optimization pipeline is deprecated. I see the new PM is available. Reviewed By: aeubanks, jdoerfert Differential Revision: https://reviews.llvm.org/D139004	2022-12-01 09:21:10 +08:00
Johannes Rudolf Doerfert	41a278f56a	[OpenMP][FIX] Do not add custom state machine eagerly in LTO runs If we run LTO optimization we might end up introducing a custom state machine and later transforming the region into SPMD. This is a problem. While a follow up will introduce a check for the SPMD conversion, this already prevents the eager custom state machine generation. Only if the kernel init function is defined, rather than declared, we will emit a custom state machine. SPMD-zation can happen eagerly though. Tests are adjusted via a weak definition. The LTO test was added to verify this works as expected. Differential Revision: https://reviews.llvm.org/D136740	2022-10-26 10:40:11 -07:00
Dhruva Chakrabarti	839ac62c50	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit 7539e9cf811e590d9f12ae39673ca789e26386b4.	2022-09-15 03:08:46 +00:00
Giorgis Georgakoudis	7539e9cf81	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6, ABataev Differential Revision: https://reviews.llvm.org/D102107	2022-09-15 00:54:05 +00:00
Johannes Doerfert	21711039e3	[OpenMP] Allow the Attributor to look at functions we also internalized This is important as we have accesses to globals in those which we need to categorize.	2022-09-11 20:16:11 -07:00
Doru Bercea	0b1160fdeb	Fix OpenMP Opt for target without a parallel region. Remove ctx redeclaration. Format code. Remove parallel check. Modify tests. Clean-up code. Fix another test. Move code to helper functions. Format file. Minor fixes.	2022-09-06 16:04:53 +00:00
Joseph Huber	b08369f7f2	Revert "[OpenMP] Remove noinline attributes in the device runtime" The behaviour of this patch is not great, but it has some side-effects that are required for OpenMPOpt to work. The problem is that when we use `-mlink-builtin-bitcode` we only import used symbols from the runtime. Then OpenMPOpt will insert calls to symbols that were not previously included. This patch removed this implicit behaviour as these functions were kept alive by the `noinline` simply because it kept calls to them in the module. This caused regression in some tests that relied on some OpenMPOpt passes without using LTO. Reverting for the LLVM15 release but will try to fix it more correctly on main. This reverts commit d61d72dae604c3258e25c00622b1a85861450303. Fixes #56752	2022-07-27 11:09:18 -04:00
Joseph Huber	d61d72dae6	[OpenMP] Remove noinline attributes in the device runtime We previously used the `noinline` attributes to specify some definitions which should be kept alive in the runtime. These were then stripped immediately in the OpenMPOpt module pass. However, since the changes in D130298, we now explicitly state which functions will have external visibility in the bitcode library. Additionally the OpenMPOpt module pass should run before the inliner pass, so this shouldn't make a difference in whether or not the functions will be alive for the initial pass of OpenMPOpt. This should simplify the interface, and additionally save time spent on scanning function names for noinline. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130368	2022-07-25 15:44:50 -04:00
Johannes Doerfert	bf789b1957	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good even if some tests look like they regress. Fixes: https://github.com/llvm/llvm-project/issues/54981 Note: A previous version was flawed and consequently reverted in 6555558a80589d1c5a1154b92cc3af9495f8f86c.	2022-07-19 16:24:42 -05:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
Johannes Doerfert	f6e0c05e3d	Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit f17639ea0cd30f52ac853ba2eb25518426cc3bb8 as three AMDGPU tests haven't been updated. Will need to verify the changes are not regressions we should avoid.	2022-07-08 00:53:38 -05:00
Johannes Doerfert	f17639ea0c	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good even if some tests look like they regress. Fixes: https://github.com/llvm/llvm-project/issues/54981 Note: A previous version was flawed and consequently reverted in 6555558a80589d1c5a1154b92cc3af9495f8f86c.	2022-07-08 00:38:27 -05:00
Johannes Doerfert	c771eaf07e	[OpenMP] Ensure to not use SPMD mode in the absence of parallel regions	2022-07-07 16:49:22 -05:00
Joseph Huber	c7243f21d3	[OpenMP] Only strip runtime attributes if needed Summary: Currently in OpenMPOpt we strip `noinline` attributes from runtime functions. This is here because the device bitcode library that we link has problems with needed definitions getting prematurely optimized out. This is only necessary for OpenMP offloading to GPUs so we should narrow the scope for where we spend time doing this. In the future this shouldn't be necessary as we move to using a linked library rather than pulling in a bitcode library in Clang.	2022-06-27 13:35:41 -04:00
Kazu Hirata	a7938c74f1	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 21:42:52 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Kazu Hirata	ad7ce1e769	Don't use Optional::hasValue (NFC)	2022-06-20 11:49:10 -07:00
Kazu Hirata	5413bf1bac	Don't use Optional::hasValue (NFC)	2022-06-20 11:33:56 -07:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
Johannes Doerfert	6555558a80	Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit da50dab1ae111e9e6cb0248a47a038b17f798705. Patch broke AMD GPU OpenMP offload buildbots. https://lab.llvm.org/buildbot/#/builders/193/builds/13246	2022-06-09 17:04:01 +02:00
Johannes Doerfert	da50dab1ae	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good. Fixes: https://github.com/llvm/llvm-project/issues/54981	2022-06-09 16:48:53 +02:00
Johannes Doerfert	7a07b88f37	[Attributor][FIX] Replace call site argument uses, not values We need to be careful replacing values as a call site argument (IRPosition::IRP_CALL_SITE_ARGUMENT) represents a use and not a value. This patch changes the interface to take an IR position instead, making it harder to misuse accidentally. It does not change our tests right now but a follow up exposed the potential footgun.	2022-06-09 12:00:26 +02:00
Johannes Doerfert	481b8f31df	[Attributor][NFC] Introduce helper struct We often use a context associated with a value. For now only one use case has been changed.	2022-06-09 12:00:26 +02:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00

262 Commits