llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	232ce90541	[OpenMP][FIX] Adjust "known" attributes for runtime functions This showed up when we started to deduce readnone for the argument of __kmpc_global_thread_num. The known attributes for "getters" did not allow to read arguments, but that is sometimes the case.	2023-07-14 17:01:48 -07:00
Johannes Doerfert	4dc5662c27	[Attributor][NFC] Update all tests with the script Three tests needed manual adjustment after https://reviews.llvm.org/D148216 got reverted. See https://github.com/llvm/llvm-project/issues/63746.	2023-07-14 13:53:38 -07:00
Matt Arsenault	357d19a8fd	OpenMP: Convert some tests to opaque pointers	2023-07-11 18:03:20 -04:00
Johannes Doerfert	02a4fcec6b	[Attributor] Port AANonNull to the isImpliedByIR interface AANonNull is now the first AA that is always queried via the new APIs and not created manually. Others will follow shortly to avoid trivial AAs whenever possible. This commit introduced some helper logic that will make it simpler to port the next one. It also untangles AADereferenceable and AANonNull such that the former does not keep a handle on the latter. Finally, we stop deducing `nonnull` for `undef`, which was incorrect.	2023-07-09 16:04:19 -07:00
Johannes Doerfert	fe12d313ba	[OpenMPOpt][FIX] Propagate IsReachingAlignedBarrier flag through calls	2023-07-07 16:38:34 -07:00
Johannes Doerfert	7e77e812ab	[Attributor][FIX] Require the store to be aligned for value propagation	2023-07-07 16:38:34 -07:00
Johannes Doerfert	24656e995a	[OpenMPOpt] The kernel end is not necessarily an aligned barrier A kernel can be exited in a non-aligned fashion, so we cannot pretend it always ends in an aligned barrier. Instead, we require an explicit aligned barrier as we lack a divergence analysis at this point.	2023-07-07 16:38:34 -07:00
Johannes Doerfert	4009f84d2d	[OpenMPOpt] Check for execution with an aligned barrier If the next or last synchronizing instruction was an aligned barrier, the instruction is executed in an aligned region.	2023-07-07 16:38:33 -07:00
Johannes Doerfert	3a3ea43078	[OpenMPOpt][NFC] Precommit test for AAExecutionDomain bug	2023-07-07 16:38:33 -07:00
Johannes Doerfert	77dbd1d712	[Attributor][NFCI] Manifest assumption attributes explicitly We had some custom manifest for assumption attributes but we use the generic manifest logic. If we later decide to curb duplication (of attributes on the call site and callee), we can do that at a single location and for all attributes. The test changes basically add known `llvm.assume` callee information to the call sites.	2023-07-03 11:57:29 -07:00
Johannes Doerfert	b672c602c7	[Attributor][NFCI] Merge MemoryEffects explicitly We had some custom handling for existing MemoryEffects but we now move it to the place we check other existing attributes before we manifest new ones. If we later decide to curb duplication (of attributes on the call site and callee), we can do that at a single location and for all attributes. The test changes basically add known `memory` callee information to the call sites.	2023-07-03 11:57:29 -07:00
Johannes Doerfert	d33bca840a	[Attributor] Introduce helpers to judge AAs prior to creation This is a partial cleanup to centralize the initialization and update decisions for AAs. It lifts the burden and boilerplate from users and makes it harder to accidentally perform unsound deductions. The two static helpers show how we can lift the decisions to generate an AA into the Attributor, avoiding trivial AAs that just cost us compile time and maintenance code (to check for pre-conditions).	2023-06-29 12:32:45 -07:00
Johannes Doerfert	339a1f3ce3	[Attributor] Avoid more AAs through IR implication	2023-06-24 00:35:31 -07:00
Johannes Doerfert	732bdb6073	[Attributor] Avoid the type check in getCalledFunction We now consistently use `CallBase::getCalledOperand` rather than `getCalledFunction`, as we do not want the type check performed by the latter. This exposed various missing checks to handle mismatches properly, but it is good to have them explicit now. In a follow up we might want to flag certain calls as UB, but for now, we allow everything to cut down on unexpected differences.	2023-06-23 20:10:12 -07:00
Johannes Doerfert	badafc53c6	[Attributor] Check IR attributes before creating new AAs Instead of creating an AA for an IR attribute we can first check if it is implied/known. If so, we can save the time to create the AA, figure out it is implied, fix it, and later manifest it in the IR (redundantly). Other IR attributes can be added to the list in `AA::hasAssumedIRAttr` later on, for now we support 8 different ones.	2023-06-23 17:21:21 -07:00
Johannes Doerfert	cb17c48fdd	[Attributor] Identify and remove no-op fences The logic and implementation follows the removal of no-op barriers. If the fence is not making updates visible, either to the world or the current thread, it is not needed. Said differently, the fences we remove do not establish synchronization (happens-before) edges. This allows us to eliminate some of the regression caused by: https://reviews.llvm.org/D145290	2023-06-05 17:14:00 -07:00
Johannes Doerfert	8f4fadd1b4	[OpenMP] Use "kernel" attribute consistently	2023-06-05 16:33:53 -07:00
Johannes Doerfert	dbbe9b3776	[Attributor] Create `AAMustProgress` for the `mustprogress` attribute Derive the mustprogress attribute based on the willreturn attribute or the fact that all callers are mustprogress. Differential Revision: https://reviews.llvm.org/D94740	2023-06-05 16:33:52 -07:00
Johannes Doerfert	787d6bb59f	[Attributor][OpenMP-Opt][NFC] Run the update test checks script	2023-05-18 13:27:44 -07:00
Joseph Huber	e494ebf9d0	[OpenMP] Fix incorrect interop type for number of dependencies The interop types use the number of dependencies in the function interface. Every other function uses an `i32` to count the number of dependencies except for the initialization function. This leads to codegen issues when the rest of the compiler passes in an `i32` that then creates an invalid call. Fix this to be consistent with the other uses. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D150156	2023-05-08 21:02:43 -05:00
Krzysztof Drewniak	f0415f2a45	Re-land "[AMDGPU] Define data layout entries for buffers"" Re-land D145441 with data layout upgrade code fixed to not break OpenMP. This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27. Differential Revision: https://reviews.llvm.org/D149776	2023-05-03 19:43:56 +00:00
Krzysztof Drewniak	3f2fbe92d0	Revert "[AMDGPU] Define data layout entries for buffers" This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850. Differential Revision: https://reviews.llvm.org/D149758	2023-05-03 16:11:00 +00:00
Krzysztof Drewniak	f9c1ede254	[AMDGPU] Define data layout entries for buffers Per discussion at https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798, we define two new address spaces for AMDGCN targets. The first is address space 7, a non-integral address space (which was already in the data layout) that has 160-bit pointers (which are 256-bit aligned) and uses a 32-bit offset. These pointers combine a 128-bit buffer descriptor and a 32-bit offset, and will be usable with normal LLVM operations (load, store, GEP). However, they will be rewritten out of existence before code generation. The second of these is address space 8, the address space for "buffer resources". These will be used to represent the resource arguments to buffer instructions, and new buffer intrinsics will be defined that take them instead of <4 x i32> as resource arguments. ptr addrspace(8). These pointers are 128-bits long (with the same alignment). They must not be used as the arguments to getelementptr or otherwise used in address computations, since they can have arbitrarily complex inherent addressing semantics that can't be represented in LLVM. Even though, like their address space 7 cousins, these pointers have deterministic ptrtoint/inttoptr semantics, they are defined to be non-integral in order to prevent optimizations that rely on pointers being a [0, [addr_max]] value from applying to them. Future work includes: - Defining new buffer intrinsics that take ptr addrspace(8) resources. - A late rewrite to turn address space 7 operations into buffer intrinsics and offset computations. This commit also updates the "fallback address space" for buffer intrinsics to the buffer resource, and updates the alias analysis table. Depends on D143437 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D145441	2023-05-03 15:25:58 +00:00
Shilei Tian	d4ecd1241c	Revert "[OpenMP] Introduce kernel environment" This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9. It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll` - `mapping/declare_mapper_target_data.cpp` on AMDGPU	2023-04-22 20:56:35 -04:00
Shilei Tian	35cfadfbe2	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-04-22 20:46:38 -04:00
Joseph Huber	46ee1021d9	[OpenMP] Replace HeapToShared's initial value with `poison` There's a desire to move away from `undef` in LLVM. Currently we want to have the `addressspace(3)` variables use `poison` instead. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D147719	2023-04-14 09:39:32 -05:00
Johannes Doerfert	94d14536a9	[OpenMP][FIX] More AAExecutionDomain fixes We missed certain updates, mostly to call site information, and dependent AAs did not get recomputed. We also did not properly distinguish and propagate incoming and outgoing information of call sites. The runtime tests pass now, I'll add a proper test for AAExecutionDomain soon that covers all the cases and ensures we haven't forgotten more updates. To help unblock some apps, I'll put the fix first.	2023-03-27 21:36:21 -07:00
Johannes Doerfert	7f7e1749c5	[OpenMP] Be smarter about the insertion point for deduplication We can use dominance and avoid the special handling of kernels and prevent inserting code before allocas accidentally (as happened in the runtime test).	2023-03-27 21:30:23 -07:00
Ishaan Gandhi	aead502b11	[Attributor] Add convergent abstract attribute This patch adds the AANonConvergent abstract attribute. It removes the convergent attribute from functions that only call non-convergent functions. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D143228	2023-03-20 22:33:50 -07:00
Johannes Doerfert	8f47fd05d5	[OpenMPOpt][FIX] Avoid removing barriers in callees We could be smarter about this, e.g., if the callee has a single call site, but for now we first avoid the miscompile.	2023-03-20 17:44:24 -07:00
Johannes Doerfert	b89558a2ae	[OpenMP][FIX] Properly track and lookup Execution Domains This is a two part fix. First, we need two Execution Domains (ED) to track the values of a function. One for incoming values and one for outgoing values. This was conflated before. Second, at the function entry we need to look at the incoming information from call sites not iterate over non-existing predecessors.	2023-03-20 17:44:24 -07:00
Johannes Doerfert	0fc63d4e64	[Attributor][FIX] Ensure loop PHI replacements are dynamically unique Similar to loads, PHIs can be used to introduce non-dynamically unique values into the simplification "algorithm". We need to check that PHIs do not carry such a value from one iteration into the next, as this can cause downstream reasoning to fail, e.g., downstream could think a comparison is equal because the simplified values are equal while they are defined in different loop iterations. Similarly, instructions in cycles are now conservatively treated as non-dynamically unique. We could do better but I'll leave that for the future. The change in AAUnderlyingObjects allows us to ignore dynamic uniqueness when we simply look for underlying objects. The user of that AA should be aware that the result might not be a dynamically unique value.	2023-03-20 17:44:24 -07:00
Matt Arsenault	25a461046e	OpenMP: Regenerate test checks	2023-02-16 22:40:15 -04:00
Johannes Doerfert	578d507359	[OpenMP][FIX] Ensure to determine aligned regions properly There were missing checks in the aligned region code, copy-paste errors (= usage of the IsReachedFromAlignedBarrierOnly value instead of IsReachingAlignedBarrierOnly value on the forward pass), and a missing update of the call state for sync declarations and definitions. Partially fixes https://github.com/llvm/llvm-project/issues/60425	2023-02-02 02:28:10 -08:00
Joseph Huber	0bdde9dfb9	[OpenMP] Make OpenMPOpt aware of the OpenMP runtime's status The `OpenMPOpt` pass contains optimizations that generate new calls into the OpenMP runtime. This causes problems if we are in a state where the runtime has already been linked statically. Generating these new calls will result in them never being resolved. We should indicate if we are in a "post-link" LTO phase and prevent OpenMPOpt from generating new runtime calls. Generally, it's not desirable for passes to maintain state about the context in which they're called. But this is the only reasonable solution to static linking when we have a pass that generates new runtime calls. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142646	2023-01-26 13:23:44 -06:00
Johannes Doerfert	5238df7ed5	[Attributor] Allow (inter-procedural) "CFG" reasoning for aligned regions If an instruction is executed in an aligned region we can ignore threading effects and use CFG reasoning (dominance and reachability). This is true because all threads are together in an aligned region and there cannot be one waiting for a signal at a place not connected via the control flow. More dedicated tests will follow. More details can be found here: "Co-Designing an OpenMP GPU Runtime and Optimizations for Near-Zero Overhead Execution", IPDPS 2022, https://www.osti.gov/servlets/purl/1890094	2023-01-23 22:45:48 -08:00
Johannes Doerfert	fedbc689e1	[Attributor] Check assumptions to improve `isAlignedBarrier` queries	2023-01-23 20:34:26 -08:00
Johannes Doerfert	129faec711	[OpenMP] Identify non-aligned barriers executed in an aligned context Even if a barrier does not enforce aligned execution, it will effectively be like an aligned barrier if it is executed by all threads in an aligned way. We lack control flow divergence analysis here so we can only do (basic block) local reasoning for now.	2023-01-22 21:42:07 -08:00
Johannes Doerfert	43c1c59f73	[OpenMP] Merge barrier elimination into AAExecutionDomain With this patch we track aligned barriers in AAExecutionDomain and also delete unnecessary barriers there. This allows us to eliminate barriers across blocks, across functions, and in the presence of complex accesses that do not force a barrier. Further, we can use the collected information to enable store-load forwarding in a threaded environment (follow up patch). Differential Revision: https://reviews.llvm.org/D140463	2023-01-22 16:34:59 -08:00
Johannes Doerfert	2275e325e4	[OpenMP] Guarding restrictions are required only for guarding If we do not guard code during SPMDzation, we do not need to check conditions for successful guarding. That is, even if some code is executed in different modes, it does not prevent SPMDzation if there is no guarded code in there.	2023-01-22 15:53:42 -08:00
Johannes Doerfert	ea3c24932a	[OpenMP][FIX] Properly update ParallelLevels tracker	2023-01-22 15:52:45 -08:00
Johannes Doerfert	7bc88cbe5c	[OpenMP] Simplify `llvm.assume` operands in device code	2023-01-22 01:27:41 -08:00
Shilei Tian	bdf30603f2	[LLVM][OpenMP] Correct the function signature of `__kmpc_parallel_level` `__kmpc_parallel_level` used to be a function w/o any argument, but in the new device runtime, it accepts two. This patch simply corrects it in `OMPKinds.def`. ``` uint16_t __kmpc_parallel_level(IdentTy *Loc, uint32_t); ``` Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D141655	2023-01-20 09:46:45 -05:00
Jonas Paulsson	dc3875e468	Add parameter extension attributes in various instrumentation passes. For the targets that have in their ABI the requirement that arguments and return values are extended to the full register bitwidth, it is important that calls when built also take care of this detail. The OMPIRBuilder, AddressSanitizer, GCOVProfiling, MemorySanitizer and ThreadSanitizer passes are with this patch hopefully now doing this properly. Reviewed By: Eli Friedman, Ulrich Weigand, Johannes Doerfert Differential Revision: https://reviews.llvm.org/D133949	2023-01-18 18:29:12 -06:00
Johannes Doerfert	27944bbbe7	[Attributor][FIX] Avoid deleting (internal) library functions In CGSCC mode we cannot delete internal library functions, esp. __kmpc_alloc_shared, or we trigger an assertion. While the assertion is probably too narrow, we avoid deleting those unused functions for now to unblock the AMDGPU buildbot.	2023-01-12 01:17:23 -08:00
Johannes Doerfert	cefa5cefdc	[OpenMP] Replace ExternalizationRAII with virtual uses The externalization was always a stopgap solution. One of the drawbacks is that it is very conservative no matter if we actually require the functions at the end of the pass. The new concept is more generic and properly integrates into the dependence graph. Whenever we might need a function, it has a "virtual use" that cannot be analyzed. If we do not because of some AA state, there will be a dependence to ensure state changes trigger revisits of uses, including a potentially new virtual use.	2023-01-12 00:14:06 -08:00
Johannes Doerfert	96c335e2cc	[Attributor] Always ensure the correct AAIsDead object is used Since the Attributor::isAssumedDead lookups can jump between functions we need to potentially replace a given FnLivenessAA for it to be useful.	2023-01-11 23:49:09 -08:00
Johannes Doerfert	91f06dd732	[OpenMP][NFC] Include global alias test	2023-01-11 22:24:22 -08:00
Johannes Doerfert	cddcbfae14	[OpenMP][FIX] Avoid performance regression accidentally introduced	2023-01-11 00:58:34 -08:00
Johannes Doerfert	b2a8d2c69b	[OpenMP] Avoid running openmp-opt on dead functions The Attributor has logic to run only on assumed live functions and this is exposed to users now. OpenMP-opt will (mostly) ignore dead internal functions now but run the same deduction as before if an internal function is marked live. This should lower compile time as we run on less code and delete more code early on. For the full OpenMC module compiled with noinline and JITed at runtime, we save ~25%, or ~10s on my machine during JITing.	2023-01-10 15:03:51 -08:00


309 Commits