316 Commits

Author SHA1 Message Date
Johannes Doerfert
a01398156a [OpenMPOpt][FIX] Ensure to propagate information about parallel regions
Before, we checked the parallel region only once, and ignored updates in
the KernelInfo for the parallel region that happened later. This caused
us to think nested parallel sections are not present even if they are,
among other things.
2023-08-25 10:46:56 -07:00
Johannes Doerfert
8b08287cb3 [OpenMPOpt] Eliminate assumptions only "late"
When we remove barriers, we might need to remove llvm.assume
assumptions as well. However, doing this early, thus in the module pass,
will cause us to miss out on information we might need. There are few
situations in which we can eliminate barriers across functions; for now we simply
disable elimination of barriers that require assumptions to be removed
during the early module pass.
2023-08-23 16:11:43 -07:00
Johannes Doerfert
9c08e76f3e [Attributor] Introduce AAIndirectCallInfo
AAIndirectCallInfo will collect information and specialize indirect call
sites. It is similar to our IndirectCallPromotion but runs as part of
the Attributor (so with assumed callee information). It also expands
more calls and lets the rest of the pipeline figure out what is UB, for
now. We use existing call promotion logic to improve the result,
otherwise we rely on the (implicit) function pointer cast.

This effectively "fixes" #60327 as it will undo the type punning early
enough for the inliner to work with the (now specialized, thus direct)
call.

Fixes: https://github.com/llvm/llvm-project/issues/60327
2023-08-18 16:44:05 -07:00
Johannes Doerfert
97c24a16fd [OpenMPOpt][NFC] Allow missing wrapper functions for parallel_51
Clang does not create a wrapper function for SPMD kernels. If it does
not, we still want to collect the parallel region, even if we have no
use for it right now.
2023-08-17 18:33:24 -07:00
Johannes Doerfert
4fcd5f93d6 [OpenMPOpt] Mark more runtime functions as SPMD compatible
Fixes: https://github.com/llvm/llvm-project/issues/64421
2023-08-17 18:33:24 -07:00
Johannes Doerfert
2ece6d939b [OpenMPOpt] SPMD-amenable implies no unknown parallel regions
2023-08-17 18:33:23 -07:00
Johannes Doerfert
dfc821ae89 [OpenMPOpt][FIX] Ensure a dependence for KernelEnvC queries
When other AAs query the current value of KernelEnvC via the callback
KernelConfigurationSimplifyCB we need to ensure they are now dependent
on the AAKernelInfo that is in charge of the KernelEnvC.
2023-08-10 23:16:25 -07:00
Bjorn Pettersson
fd05c34b18 Stop using legacy helpers indicating typed pointer types. NFC
Since we no longer support typed LLVM IR pointer types, the code can
be simplified into for example using PointerType::get directly instead
of using Type::getInt8PtrTy and Type::getInt32PtrTy etc.

Differential Revision: https://reviews.llvm.org/D156733
2023-08-02 12:08:37 +02:00
Shilei Tian
10068cd654 [OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Depends on D155886.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-07-26 13:35:14 -04:00
Joseph Huber
05b181d851 [OpenMP] Make the nested parallelism global hidden
Summary:
These will probably be removed with the kernel environment, but they
should have hidden visibility so they can be optimized out.
2023-07-24 08:28:54 -05:00
Shilei Tian
6bd74fd65f Revert commits for kernel environment
This reverts commits for kernel environments as they cause issues in AMD BB.
2023-07-23 23:32:31 -04:00
Shilei Tian
c5c8040390 [OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Depends on D155886.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-07-23 18:36:01 -04:00
Nikita Popov
7be7f23269 [llvm] Remove uses of getWithSamePointeeType() (NFC)
2023-07-18 12:07:09 +02:00
Johannes Doerfert
02a4fcec6b [Attributor] Port AANonNull to the isImpliedByIR interface
AANonNull is now the first AA that is always queried via the new APIs
and not created manually. Others will follow shortly to avoid trivial
AAs whenever possible.

This commit introduced some helper logic that will make it simpler to
port the next one. It also untangles AADereferenceable and AANonNull
such that the former does not keep a handle on the latter. Finally,
we stop deducing `nonnull` for `undef`, which was incorrect.
2023-07-09 16:04:19 -07:00
Johannes Doerfert
fe12d313ba [OpenMPOpt][FIX] Propagate IsReachingAlignedBarrier flag through calls
2023-07-07 16:38:34 -07:00
Johannes Doerfert
24656e995a [OpenMPOpt] The kernel end is not necessarily an aligned barrier
A kernel can be exited in a non-aligned fashion, so we cannot pretend it
always ends in an aligned barrier. Instead, we require an explicit
aligned barrier as we lack a divergence analysis at this point.
2023-07-07 16:38:34 -07:00
Johannes Doerfert
4009f84d2d [OpenMPOpt] Check for execution with an aligned barrier
If the next or last synchronizing instruction was an aligned barrier,
the instruction is executed in an aligned region.
2023-07-07 16:38:33 -07:00
Johannes Doerfert
5faa616fe4 [Attributor][NFCI] Remove the (already "unused") ModuleSlice
At some point we allowed the CGSCC traversal to look at the entire module
slice (see definition below). However, we don't allow that anymore,
mostly for compile time and complexity reasons. Consequently, there is
no need to build the ModuleSlice as we can replace it with the SCC
wherever it was still used.
2023-06-29 23:08:11 -07:00
Johannes Doerfert
e962fa7712 [OpenMPOpt][FIX] Internalization is an IR change too
The bots reported that we changed the IR w/o reporting it. The reason
was that internalization was not reported as changed. Forwarding the
result solves the problem.

Test coverage via
llvm/test/Transforms/Attributor/reduced/openmp_opt_constant_type_crash.ll
2023-06-29 18:03:47 -07:00
Haojian Wu
4b47c6e018 Fix -Wunused-variable in release build.
2023-06-30 00:02:05 +02:00
Johannes Doerfert
d6fa3b374f [Attributor] Remove now obsolete initialization code
With the helpers in place to judge AAs [1] we can remove the custom
rolled initialization checking code. This exposed a minor oversight in
the AAMemoryLocation where we did not check the IR before we gave up for
a declaration.

[1] d33bca840a
2023-06-29 13:32:06 -07:00
Johannes Doerfert
d33bca840a [Attributor] Introduce helpers to judge AAs prior to creation
This is a partial cleanup to centralize the initialization and update
decisions for AAs, lifting the burden and boilerplate from users and
making it harder to accidentally perform unsound deductions.

The two static helpers show how we can lift the decisions to generate an
AA into the Attributor, avoiding trivial AAs that just cost us compile
time and maintenance code (to check for pre-conditions).
2023-06-29 12:32:45 -07:00
Johannes Doerfert
21c0d6bff9 [OpenMPOpt] Properly check AA pointers
The interface was changed to return pointers, so we need to check them
for null now, as they might actually be null in the future.
2023-06-29 09:18:36 -07:00
Elliot Goodrich
b0abd4893f [llvm] Add missing StringExtras.h includes
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
2023-06-25 15:42:22 +01:00
Johannes Doerfert
e9fc399db3 [Attributor][NFCI] Use pointers to pass around AAs
This will make it easier to create less trivial AAs in the future as we
can simply return `nullptr` rather than an AA with an invalid state.
2023-06-23 17:21:20 -07:00
Johannes Doerfert
cb17c48fdd [Attributor] Identify and remove no-op fences
The logic and implementation follow the removal of no-op barriers. If
the fence is not making updates visible, either to the world or the
current thread, it is not needed. Said differently, the fences we remove
do not establish synchronization (happens-before) edges.
This allows us to eliminate some of the regression caused by:
  https://reviews.llvm.org/D145290
2023-06-05 17:14:00 -07:00
Johannes Doerfert
8f4fadd1b4 [OpenMP] Use "kernel" attribute consistently
2023-06-05 16:33:53 -07:00
Shilei Tian
d4ecd1241c Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.

It makes a couple of buildbots unhappy because of the following test failures:
- `Transforms/OpenMP/add_attributes.ll`
- `mapping/declare_mapper_target_data.cpp` on AMDGPU
2023-04-22 20:56:35 -04:00
Shilei Tian
35cfadfbe2 [OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-04-22 20:46:38 -04:00
Bjorn Pettersson
a20f7efbc5 Remove several no longer needed includes. NFCI
Mostly removing includes of InitializePasses.h and Pass.h in
passes that no longer have support for the legacy PM.
2023-04-17 13:54:19 +02:00
Joseph Huber
46ee1021d9 [OpenMP] Replace HeapToShared's initial value with poison
There's a desire to move away from `undef` in LLVM. Currently we want to
have the `addressspace(3)` variables use `poison` instead.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D147719
2023-04-14 09:39:32 -05:00
Johannes Doerfert
94d14536a9 [OpenMP][FIX] More AAExecutionDomain fixes
We missed certain updates, mostly to call site information, and
dependent AAs did not get recomputed. We also did not properly
distinguish and propagate incoming and outgoing information of call
sites.

The runtime tests pass now. I'll add a proper test for
AAExecutionDomain soon that covers all the cases and ensures we haven't
forgotten more updates. To help unblock some apps, I'll put the fix
first.
2023-03-27 21:36:21 -07:00
Johannes Doerfert
3a7cb3d45a [OpenMP] Adjust generic state machine simplification CB
This callback caused us to potentially miss out on call edges if we were
expecting a custom state machine since the custom state machine was not
created but the workers also did not enter the generic one. I have not
observed an issue and don't know how to create a test for sure, but it
is safer to err on the conservative side for now.
2023-03-27 21:30:23 -07:00
Johannes Doerfert
7f7e1749c5 [OpenMP] Be smarter about the insertion point for deduplication
We can use dominance and avoid the special handling of kernels and
prevent inserting code before allocas accidentally (as happened in the
runtime test).
2023-03-27 21:30:23 -07:00
Kazu Hirata
b9c4b95b11 [llvm] Use ConstantInt::{isZero,isOne} (NFC)
2023-03-21 17:40:35 -07:00
Ishaan Gandhi
aead502b11 [Attributor] Add convergent abstract attribute
This patch adds the AANonConvergent abstract attribute. It removes the
convergent attribute from functions that only call non-convergent
functions.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D143228
2023-03-20 22:33:50 -07:00
Johannes Doerfert
8f47fd05d5 [OpenMPOpt][FIX] Avoid removing barriers in callees
We could be smarter about this, e.g., if the callee has a single call
site, but for now we first avoid the miscompile.
2023-03-20 17:44:24 -07:00
Johannes Doerfert
b89558a2ae [OpenMP][FIX] Properly track and lookup Execution Domains
This is a two-part fix. First, we need two Execution Domains (ED) to
track the values of a function: one for incoming values and one for
outgoing values. This was conflated before. Second, at the function
entry we need to look at the incoming information from call sites, not
iterate over non-existing predecessors.
2023-03-20 17:44:24 -07:00
Johannes Doerfert
578d507359 [OpenMP][FIX] Ensure to determine aligned regions properly
There were missing checks in the aligned region code, copy-paste errors
(= usage of the IsReachedFromAlignedBarrierOnly value instead of
IsReachingAlignedBarrierOnly value on the forward pass), and a missing
update of the call state for sync declarations and definitions.

Partially fixes https://github.com/llvm/llvm-project/issues/60425
2023-02-02 02:28:10 -08:00
Johannes Doerfert
18a2975b57 [Attributor][FIX] Ensure we use the right AAExecutionDomain
Before, we might have ended up querying the AAExecutionDomain of a
different function, which resulted in wrong optimistic results.

Partially fixes https://github.com/llvm/llvm-project/issues/60425
2023-02-02 02:27:54 -08:00
Guillaume Chatelet
ffc1205bde [reland][NFC] Transition GlobalObject alignment from MaybeAlign to Align
This is a follow up on https://reviews.llvm.org/D142459#4081179.
This first patch adds an overload to `GlobalObject::setAlignment` that accepts an `Align` type.
This already handles most of the calls.

This patch also converts a few call sites to the new type when this is safe.

Here is the list of the remaining call sites (as of e195e6bad6):

 - clang/lib/CodeGen/CodeGenModule.cpp:1688
 - llvm/lib/AsmParser/LLParser.cpp:1309
 - llvm/lib/AsmParser/LLParser.cpp:6050
 - llvm/lib/Bitcode/Reader/BitcodeReader.cpp:3871
 - llvm/lib/Bitcode/Reader/BitcodeReader.cpp:4030
 - llvm/lib/IR/Core.cpp:2018
 - llvm/lib/IR/Globals.cpp:141
 - llvm/lib/Linker/IRMover.cpp:660
 - llvm/lib/Linker/LinkModules.cpp:361
 - llvm/lib/Linker/LinkModules.cpp:362
 - llvm/lib/Transforms/IPO/MergeFunctions.cpp:782
 - llvm/lib/Transforms/IPO/MergeFunctions.cpp:840
 - llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:1813
 - llvm/tools/llvm-reduce/deltas/ReduceGlobalObjects.cpp:27

Differential Revision: https://reviews.llvm.org/D142708
2023-01-31 15:09:10 +00:00
Guillaume Chatelet
e098ee726e Revert D142708 "[NFC] Transition GlobalObject alignment from MaybeAlign to Align"
This is breaking the build bots, e.g.,
https://lab.llvm.org/buildbot/#/builders/121/builds/27549

This reverts commit 6717efe74da825214cb4d307ad35e5fbda353301.
2023-01-31 14:12:51 +00:00
Guillaume Chatelet
6717efe74d [NFC] Transition GlobalObject alignment from MaybeAlign to Align
This is a follow up on https://reviews.llvm.org/D142459#4081179.
This first patch adds an overload to `GlobalObject::setAlignment` that accepts an `Align` type.
This already handles most of the calls.

This patch also converts a few call sites to the new type when this is safe.

Here is the list of the remaining call sites (as of e195e6bad6):

 - clang/lib/CodeGen/CodeGenModule.cpp:1688
 - llvm/lib/AsmParser/LLParser.cpp:1309
 - llvm/lib/AsmParser/LLParser.cpp:6050
 - llvm/lib/Bitcode/Reader/BitcodeReader.cpp:3871
 - llvm/lib/Bitcode/Reader/BitcodeReader.cpp:4030
 - llvm/lib/IR/Core.cpp:2018
 - llvm/lib/IR/Globals.cpp:141
 - llvm/lib/Linker/IRMover.cpp:660
 - llvm/lib/Linker/LinkModules.cpp:361
 - llvm/lib/Linker/LinkModules.cpp:362
 - llvm/lib/Transforms/IPO/MergeFunctions.cpp:782
 - llvm/lib/Transforms/IPO/MergeFunctions.cpp:840
 - llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:1813
 - llvm/tools/llvm-reduce/deltas/ReduceGlobalObjects.cpp:27

Differential Revision: https://reviews.llvm.org/D142708
2023-01-31 13:59:58 +00:00
Joseph Huber
0bdde9dfb9 [OpenMP] Make OpenMPOpt aware of the OpenMP runtime's status
The `OpenMPOpt` pass contains optimizations that generate new calls into
the OpenMP runtime. This causes problems if we are in a state where the
runtime has already been linked statically. Generating these new calls
will result in them never being resolved. We should indicate if we are
in a "post-link" LTO phase and prevent OpenMPOpt from generating new
runtime calls.

Generally, it's not desirable for passes to maintain state about the
context in which they're called. But this is the only reasonable
solution to static linking when we have a pass that generates new
runtime calls.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142646
2023-01-26 13:23:44 -06:00
Shilei Tian
e13179db7a [NFC] clang-format OpenMPOpt.cpp
2023-01-25 19:05:53 -05:00
Johannes Doerfert
5238df7ed5 [Attributor] Allow (inter-procedural) "CFG" reasoning for aligned regions
If an instruction is executed in an aligned region we can ignore
threading effects and use CFG reasoning (dominance and reachability).
This is true because all threads are together in an aligned region and
there cannot be one waiting for a signal at a place not connected via
the control flow.

More dedicated tests will follow.

More details can be found here:
"Co-Designing an OpenMP GPU Runtime and Optimizations for Near-Zero
Overhead Execution", IPDPS 2022,
https://www.osti.gov/servlets/purl/1890094
2023-01-23 22:45:48 -08:00
Johannes Doerfert
d41f93dae9 [OpenMP] Readnone calls do not have non-local side-effects
2023-01-23 22:45:47 -08:00
Kazu Hirata
7d3306fa42 [llvm] Fix warnings
This patch fixes:

  llvm/lib/IR/DataLayout.cpp:942:13: warning: unused variable ‘VecTy’
  [-Wunused-variable]

  llvm/lib/Transforms/IPO/OpenMPOpt.cpp:2899:27: warning: unused
  variable ‘MI’ [-Wunused-variable]
2023-01-23 10:57:56 -08:00
Johannes Doerfert
129faec711 [OpenMP] Identify non-aligned barriers executed in an aligned context
Even if a barrier does not enforce aligned execution, it will
effectively be like an aligned barrier if it is executed by all threads
in an aligned way. We lack control flow divergence analysis here so we
can only do (basic block) local reasoning for now.
2023-01-22 21:42:07 -08:00
Johannes Doerfert
8e124515cd [OpenMP][FIX] Ensure not to dereference a nullptr
2023-01-22 20:06:23 -08:00