332 Commits

Author SHA1 Message Date
Shilei Tian
22e1df7f5b
[LLVM][OpenMPOpt] Fix a crash when associated function is nullptr (#66274)
The associated function can be a nullptr if it is an indirect call.
This causes a crash in `CheckCallee` which always assumes the callee
is a valid pointer.

Fix #66904.
2023-09-13 20:22:59 -04:00
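The crash fixed in 22e1df7f5b comes down to a missing null check before the callee is inspected. A minimal Python sketch of that pattern follows. The `CallSite` record, the `KNOWN_CALLEES` set, and the `check_callee` helper are hypothetical stand-ins for illustration and are not the OpenMPOpt API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallSite:
    # Hypothetical stand-in: `callee` is None when the call is indirect.
    callee: Optional[str]


# Hypothetical set of callees the analysis knows how to handle.
KNOWN_CALLEES = {"__kmpc_barrier", "__kmpc_parallel_51"}


def check_callee(call: CallSite) -> str:
    """Classify a call site without assuming the callee is known."""
    if call.callee is None:
        # Indirect call: there is nothing to look up, so answer
        # conservatively instead of touching a missing callee.
        return "unknown"
    return "known" if call.callee in KNOWN_CALLEES else "unknown"
```

The point of the sketch is only the early return: an indirect call is treated as an unknown callee rather than being inspected.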
Johannes Doerfert
d47cf2bff3
[OpenMPOpt] Allow indirect calls in AAKernelInfoCallSite (#65836)
The Attributor has gained support for indirect calls but it is opt-in.
This patch makes AAKernelInfoCallSite able to handle multiple potential
callees.
2023-09-10 19:02:09 -07:00
Johannes Doerfert
67635b6e23 [OpenMPOpt][NFC] Precommit test 2023-09-08 22:40:33 -07:00
Shilei Tian
499f691be1 Revert "Reapply "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)"""
This reverts commit c5525a6e8fb7f7c2ce7126ac5b17aaff01ac407f.
AMD BB is not happy again.
2023-09-08 15:46:23 -04:00
Shilei Tian
c5525a6e8f Reapply "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)""
This reverts commit e592c2dcf5b7d2da6c2564f5d9990aa34079bad4 that
reverts e91e3cf.
2023-09-08 15:39:16 -04:00
Shilei Tian
e592c2dcf5 Revert "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)"
This reverts commit e91e3cf0748a80e1d7219c13fa6a7622321f4936 because
AMD BB is not happy with it.
2023-09-07 12:31:11 -04:00
Shilei Tian
e91e3cf074
[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544) 2023-09-07 12:23:52 -04:00
Johannes Doerfert
8b08287cb3 [OpenMPOpt] Eliminate assumptions only "late"
When we remove barriers, we might need to remove llvm.assume
assumptions as well. However, doing this early, thus in the module pass,
will cause us to miss out on information we might need. There are few
situations in which we can eliminate barriers across functions; for now, we simply
disable elimination of barriers that require assumptions to be removed
during the early module pass.
2023-08-23 16:11:43 -07:00
Johannes Doerfert
78b8f1f78f [Attributor][FIX] Remove the visited set from AAInterFnReachability
The visited set was used to avoid visiting the same function twice. However,
the (new) algorithm requires that we do, since we start the queries at
different call sites.
2023-08-23 11:48:18 -07:00
Johannes Doerfert
fb0e49f230 [OpenMP] Add noalias to runtime allocator functions 2023-08-17 19:25:32 -07:00
Johannes Doerfert
bfa1afb81c [OpenMPOpt] Improve __kmpc_alloc_shared handling
We know that __kmpc_alloc_shared is by construction matched with a
unique __kmpc_free_shared. Making the compiler aware of these facts
helps to avoid mallocs/allocas.

Fixes: https://github.com/llvm/llvm-project/issues/64551
2023-08-17 19:25:32 -07:00
Johannes Doerfert
4fcd5f93d6 [OpenMPOpt] Mark more runtime functions as SPMD compatible
Fixes: https://github.com/llvm/llvm-project/issues/64421
2023-08-17 18:33:24 -07:00
Johannes Doerfert
2ece6d939b [OpenMPOpt] SPMD-amenable implies no unknown parallel regions 2023-08-17 18:33:23 -07:00
Johannes Doerfert
dfc821ae89 [OpenMPOpt][FIX] Ensure a dependence for KernelEnvC queries
When other AAs query the current value of KernelEnvC via the callback
KernelConfigurationSimplifyCB we need to ensure they are now dependent
on the AAKernelInfo that is in charge of the KernelEnvC.
2023-08-10 23:16:25 -07:00
Johannes Doerfert
27f9a26668 [OpenMP][NFC] Precommit reduced test 2023-08-10 23:16:24 -07:00
Johannes Doerfert
fa367d159a [IR] Mark llvm.assume as memory(inaccessiblemem: write)
It was `inaccessiblemem: readwrite` before, no need for the read.
No real benefit is expected but it can help debugging and other efforts.

Differential Revision: https://reviews.llvm.org/D156478
2023-07-31 13:44:52 -07:00
Shilei Tian
10068cd654 [OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Depends on D155886.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-07-26 13:35:14 -04:00
Johannes Doerfert
88b5d23021 [Attributor] Allow multiple LHS/RHS values when simplifying comparisons
We used to deal with multiple values but not in the handleCmp function.
Now we also allow multiple simplified operands there.
2023-07-25 20:31:21 -07:00
Johannes Doerfert
0cd8a28941 [Attributor][FIX] No IntraFnReachability does not mean unreachable
Also, first check inter fn reachability as it seems to be cheaper in
practice.
2023-07-25 17:47:33 -07:00
Johannes Doerfert
4223c9b354 [Attributor] Always deduce nosync from readonly + non-convergent
This adds the deduction also if the function is not IPO amendable.
2023-07-25 17:47:33 -07:00
Joseph Huber
05b181d851 [OpenMP] Make the nested parallelism global hidden
Summary:
These will probably be removed with the kernel environment, but they
should have hidden visibility so they can be optimized out.
2023-07-24 08:28:54 -05:00
Shilei Tian
6bd74fd65f Revert commits for kernel environment
This reverts commits for kernel environments as they cause issues in AMD BB.
2023-07-23 23:32:31 -04:00
Shilei Tian
c5c8040390 [OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Depends on D155886.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-07-23 18:36:01 -04:00
Johannes Doerfert
232ce90541 [OpenMP][FIX] Adjust "known" attributes for runtime functions
This showed up when we started to deduce readnone for the argument of
__kmpc_global_thread_num. The known attributes for "getters" did not
allow reading arguments, but getters sometimes do read them.
2023-07-14 17:01:48 -07:00
Johannes Doerfert
4dc5662c27 [Attributor][NFC] Update all tests with the script
Three tests needed manual adjustment after
https://reviews.llvm.org/D148216 got reverted. See
https://github.com/llvm/llvm-project/issues/63746.
2023-07-14 13:53:38 -07:00
Matt Arsenault
357d19a8fd OpenMP: Convert some tests to opaque pointers 2023-07-11 18:03:20 -04:00
Johannes Doerfert
02a4fcec6b [Attributor] Port AANonNull to the isImpliedByIR interface
AANonNull is now the first AA that is always queried via the new APIs
and not created manually. Others will follow shortly to avoid trivial
AAs whenever possible.

This commit introduced some helper logic that will make it simpler to
port the next one. It also untangles AADereferenceable and AANonNull
such that the former does not keep a handle on the latter. Finally,
we stop deducing `nonnull` for `undef`, which was incorrect.
2023-07-09 16:04:19 -07:00
Johannes Doerfert
fe12d313ba [OpenMPOpt][FIX] Propagate IsReachingAlignedBarrier flag through calls 2023-07-07 16:38:34 -07:00
Johannes Doerfert
7e77e812ab [Attributor][FIX] Require the store to be aligned for value propagation 2023-07-07 16:38:34 -07:00
Johannes Doerfert
24656e995a [OpenMPOpt] The kernel end is not necessarily an aligned barrier
A kernel can be exited in a non-aligned fashion, so we cannot pretend it
always ends in an aligned barrier. Instead, we require an explicit
aligned barrier as we lack a divergence analysis at this point.
2023-07-07 16:38:34 -07:00
Johannes Doerfert
4009f84d2d [OpenMPOpt] Check for execution with an aligned barrier
If the next or last synchronizing instruction was an aligned barrier,
the instruction is executed in an aligned region.
2023-07-07 16:38:33 -07:00
Johannes Doerfert
3a3ea43078 [OpenMPOpt][NFC] Precommit test for AAExecutionDomain bug 2023-07-07 16:38:33 -07:00
Johannes Doerfert
77dbd1d712 [Attributor][NFCI] Manifest assumption attributes explicitly
We had some custom manifest logic for assumption attributes, but we now use the
generic manifest logic. If we later decide to curb duplication (of
attributes on the call site and callee), we can do that at a single
location and for all attributes.

The test changes basically add known `llvm.assume` callee information to
the call sites.
2023-07-03 11:57:29 -07:00
Johannes Doerfert
b672c602c7 [Attributor][NFCI] Merge MemoryEffects explicitly
We had some custom handling for existing MemoryEffects but we now move
it to the place we check other existing attributes before we manifest
new ones. If we later decide to curb duplication (of attributes on the
call site and callee), we can do that at a single location and for all
attributes.

The test changes basically add known `memory` callee information to the
call sites.
2023-07-03 11:57:29 -07:00
Johannes Doerfert
d33bca840a [Attributor] Introduce helpers to judge AAs prior to creation
This is a partial cleanup to centralize the initialization and update
decisions for AAs. It lifts the burden and boilerplate from users and
makes it harder to accidentally perform unsound deductions.

The two static helpers show how we can lift the decisions to generate an
AA into the Attributor, avoiding trivial AAs that just cost us compile
time and maintenance code (to check for pre-conditions).
2023-06-29 12:32:45 -07:00
Johannes Doerfert
339a1f3ce3 [Attributor] Avoid more AAs through IR implication 2023-06-24 00:35:31 -07:00
Johannes Doerfert
732bdb6073 [Attributor] Avoid the type check in getCalledFunction
We now consistently use `CallBase::getCalledOperand` rather than
`getCalledFunction`, as we do not want the type check performed by the
latter. This exposed various missing checks to handle mismatches
properly, but it is good to have them explicit now.

In a follow up we might want to flag certain calls as UB, but for now,
we allow everything to cut down on unexpected differences.
2023-06-23 20:10:12 -07:00
Johannes Doerfert
badafc53c6 [Attributor] Check IR attributes before creating new AAs
Instead of creating an AA for an IR attribute we can first check if it
is implied/known. If so, we can save the time to create the AA, figure
out it is implied, fix it, and later manifest it in the IR
(redundantly). Other IR attributes can be added to the list in
`AA::hasAssumedIRAttr` later on, for now we support 8 different ones.
2023-06-23 17:21:21 -07:00
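The idea in badafc53c6, consulting the IR before paying for a new AA, can be sketched as follows. This is a minimal Python illustration under stated assumptions: `IR_ATTRS`, `run_abstract_attribute`, and this `has_assumed_ir_attr` function are hypothetical stand-ins, not the signature of `AA::hasAssumedIRAttr`.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical IR model: function name -> attributes already present in the IR.
IR_ATTRS: Dict[str, Set[str]] = {
    "f": {"nosync", "nounwind"},
    "g": set(),
}

# Records which (function, attribute) pairs needed a full analysis.
created_aas: List[Tuple[str, str]] = []


def run_abstract_attribute(fn: str, attr: str) -> bool:
    """Stand-in for the expensive path: create an AA and run it."""
    created_aas.append((fn, attr))
    return False  # pessimistic result, for this sketch only


def has_assumed_ir_attr(fn: str, attr: str) -> bool:
    # Cheap path: the attribute is already known from the IR,
    # so no abstract attribute has to be created.
    if attr in IR_ATTRS.get(fn, set()):
        return True
    # Expensive path: fall back to creating an abstract attribute.
    return run_abstract_attribute(fn, attr)
```

Querying `has_assumed_ir_attr("f", "nosync")` returns `True` without adding anything to `created_aas`, while the same query for `"g"` takes the expensive path.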
Johannes Doerfert
cb17c48fdd [Attributor] Identify and remove no-op fences
The logic and implementation follow the removal of no-op barriers. If
the fence is not making updates visible, either to the world or the
current thread, it is not needed. Said differently, the fences we remove
do not establish synchronization (happens-before) edges.
This allows us to eliminate some of the regression caused by:
  https://reviews.llvm.org/D145290
2023-06-05 17:14:00 -07:00
Johannes Doerfert
8f4fadd1b4 [OpenMP] Use "kernel" attribute consistently 2023-06-05 16:33:53 -07:00
Johannes Doerfert
dbbe9b3776 [Attributor] Create AAMustProgress for the mustprogress attribute
Derive the mustprogress attribute based on the willreturn attribute
or the fact that all callers are mustprogress.

Differential Revision: https://reviews.llvm.org/D94740
2023-06-05 16:33:52 -07:00
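The two derivation rules named in dbbe9b3776 (willreturn implies mustprogress, and a function is mustprogress if all of its callers are) can be sketched as a small fixpoint loop. This Python sketch assumes a hypothetical `callers` map that lists every caller of each function, and it grows the result from known facts, which is not necessarily how the Attributor itself iterates.

```python
from typing import Dict, List, Set


def derive_mustprogress(
    callers: Dict[str, List[str]],
    willreturn: Set[str],
    known_mustprogress: Set[str],
) -> Set[str]:
    """Return the functions that can be marked mustprogress.

    `callers` maps each function to all of its callers; the sketch assumes
    that list is complete (for example, the function has internal linkage).
    """
    # Rule 1: willreturn implies mustprogress.
    result = set(known_mustprogress) | set(willreturn)
    changed = True
    while changed:
        changed = False
        for fn, fn_callers in callers.items():
            # Functions without any known caller are never derived by rule 2.
            if fn in result or not fn_callers:
                continue
            # Rule 2: every caller is already known to be mustprogress.
            if all(c in result for c in fn_callers):
                result.add(fn)
                changed = True
    return result
```

A function with at least one caller outside the result set is left alone, which keeps the sketch conservative.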
Johannes Doerfert
787d6bb59f [Attributor][OpenMP-Opt][NFC] Run the update test checks script 2023-05-18 13:27:44 -07:00
Joseph Huber
e494ebf9d0 [OpenMP] Fix incorrect interop type for number of dependencies
The interop types use the number of dependencies in the function
interface. Every other function uses an `i32` to count the number of
dependencies except for the initialization function. This leads to
codegen issues when the rest of the compiler passes in an `i32` that
then creates an invalid call. Fix this to be consistent with the other
uses.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D150156
2023-05-08 21:02:43 -05:00
Krzysztof Drewniak
f0415f2a45 Re-land "[AMDGPU] Define data layout entries for buffers""
Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27.

Differential Revision: https://reviews.llvm.org/D149776
2023-05-03 19:43:56 +00:00
Krzysztof Drewniak
3f2fbe92d0 Revert "[AMDGPU] Define data layout entries for buffers"
This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850.

Differential Revision: https://reviews.llvm.org/D149758
2023-05-03 16:11:00 +00:00
Krzysztof Drewniak
f9c1ede254 [AMDGPU] Define data layout entries for buffers
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take ptr addrspace(8) values instead of <4 x i32> as resource
arguments. These pointers are 128 bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, [addr_max]] value from applying to them.

Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D145441
2023-05-03 15:25:58 +00:00
Shilei Tian
d4ecd1241c Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.

It makes a couple of buildbots unhappy because of the following test failures:
- `Transforms/OpenMP/add_attributes.ll`
- `mapping/declare_mapper_target_data.cpp` on AMDGPU
2023-04-22 20:56:35 -04:00
Shilei Tian
35cfadfbe2 [OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-04-22 20:46:38 -04:00
Joseph Huber
46ee1021d9 [OpenMP] Replace HeapToShared's initial value with poison
There's a desire to move away from `undef` in LLVM. Currently we want to
have the `addressspace(3)` variables use `poison` instead.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D147719
2023-04-14 09:39:32 -05:00
Johannes Doerfert
94d14536a9 [OpenMP][FIX] More AAExecutionDomain fixes
We missed certain updates, mostly to call site information, and
dependent AAs did not get recomputed. We also did not properly
distinguish and propagate incoming and outgoing information of call
sites.

The runtime tests pass now. I'll add a proper test for
AAExecutionDomain soon that covers all the cases and ensures we haven't
forgotten more updates. To help unblock some apps, I'll put the fix
first.
2023-03-27 21:36:21 -07:00