Add lowering support for arrays with dynamic extents in the firstprivate
recipe. Generalize the lowering so statically shaped arrays and arrays with
dynamic extents use the same path.
Some cleanup code is taken from #68836, which has not landed yet.
Add support for assumed-shape arrays in the lowering of the copy region of
the firstprivate recipe. Information is passed in block arguments, as is
done for the reduction recipe.
Add lowering support for arrays with dynamic extents in the private recipe.
The extents are passed as block arguments and used in the alloca
operation. The shape for the hlfir.declare operation is also built from
this information.
This patch updates the lowering of the OpenACC routine directive to avoid
creating duplicate acc.routine operations when all the clauses are
identical. If the clauses differ, an error is raised.
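The deduplication rule above can be sketched roughly as follows. This is a minimal standalone illustration, not the actual lowering code: `RoutineRegistry`, `ClauseSet`, and `addRoutine` are hypothetical names, and clauses are modeled as plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the clauses carried by one routine directive.
using ClauseSet = std::set<std::string>;

// Hypothetical registry keyed by the name of the function the directive
// applies to.
class RoutineRegistry {
public:
  // Returns true if a new entry was created and false if an identical one
  // already existed. Throws when the clauses differ from the recorded ones.
  bool addRoutine(const std::string &funcName, const ClauseSet &clauses) {
    assert(!funcName.empty() && "routine must name a function");
    auto [it, inserted] = routines.try_emplace(funcName, clauses);
    if (inserted)
      return true;
    if (it->second != clauses)
      throw std::runtime_error("conflicting acc routine clauses for " +
                               funcName);
    return false; // Same clauses: reuse the existing entry.
  }

  std::size_t size() const { return routines.size(); }

private:
  std::map<std::string, ClauseSet> routines;
};
```

Registering the same function twice with `{"seq"}` leaves a single entry, while a second registration with `{"vector"}` throws.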
`getBoundsString` is used to generate the reduction recipe names when an
array section is provided. The lowerbound and upperbound were swapped.
This patch fixes it.
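To illustrate the kind of name suffix involved, here is a minimal sketch that emits the lower bound before the upper bound for each dimension. The `Bound` struct, the `boundsSuffix` function, and the exact string format are assumptions made for illustration; they are not the real `getBoundsString` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical constant bounds of one dimension of an array section.
struct Bound {
  long lower;
  long upper;
};

// Builds a recipe-name suffix with the lower bound first, then the upper
// bound, for every dimension. The format is illustrative only.
std::string boundsSuffix(const std::vector<Bound> &bounds) {
  std::ostringstream os;
  os << "_section";
  for (std::size_t i = 0; i < bounds.size(); ++i) {
    assert(bounds[i].lower <= bounds[i].upper && "bounds must be ordered");
    os << (i == 0 ? "_" : "x") << "lb" << bounds[i].lower << ".ub"
       << bounds[i].upper;
  }
  return os.str();
}
```

With this sketch, `boundsSuffix({{1, 10}, {0, 4}})` yields `_section_lb1.ub10xlb0.ub4`.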
Following #67719, propagate the constant bounds in the combiner region
when all bounds are constant. Otherwise, bounds information is
propagated as block arguments as defined in #67719.
This patch makes use of the bounds in the combiner region for known-shape
arrays. Until now, the combiner region was iterating over the
whole array.
Lowerbound, upperbound and step are passed as block arguments after the
two values.
A follow-up patch will make use of this information for assumed-shape
arrays as well.
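As a rough illustration of a combiner restricted to a section, the sketch below adds two buffers only between the given bounds. It is plain C++ rather than generated IR; `Bounds` and `combineAdd` are hypothetical names, indices are zero-based, and the upper bound is assumed to be inclusive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical triple mirroring the lowerbound, upperbound, and step block
// arguments. The upper bound is inclusive in this sketch.
struct Bounds {
  std::size_t lower;
  std::size_t upper;
  std::size_t step;
};

// Combines `partial` into `accum` (an addition reduction) only over the
// section described by `b`, instead of over the whole array.
void combineAdd(std::vector<int> &accum, const std::vector<int> &partial,
                const Bounds &b) {
  assert(b.step > 0 && "step must be positive");
  assert(accum.size() == partial.size() && b.upper < accum.size());
  for (std::size_t i = b.lower; i <= b.upper; i += b.step)
    accum[i] += partial[i];
}
```

Elements outside `[lower, upper]` are left untouched, which is the behavioral difference from iterating over the whole array.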
Assumed-shape arrays use a descriptor and must be handled differently
from known-shape arrays. This patch adds support for generating the `init`
and `combiner` regions of the reduction recipe operation for assumed-shape
arrays by using the descriptor and the HLFIR lowering path.
The `createTempFromMold` function is moved from
`flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp` to
`flang/include/flang/Optimizer/Builder/HLFIRTools.h` so it can be reused to
create the private copy.
Following #66099, the generation of private (and firstprivate) recipes
needs to add a declare op. This patch adds the declare op for the cases
currently supported.
This will fix issue #66105.
This patch builds on top of a prior patch in review which adds a new map
and bounds operation by modifying the OpenMP PFT lowering to support
these operations and generate them from the PFT.
A significant amount of the support for the Bounds operation is borrowed
from OpenACC's own current implementation and lowering, just ported
over to OpenMP.
The patch also adds preliminary support for lowering to
a new Capture attribute, which is stored on the new Map Operation and
helps the later lowering from OpenMP to LLVM IR by indicating
how a map argument should be handled. This capture type will
influence how a map argument is accessed on the device and passed by
the host (different load/store handling, etc.). It mirrors a
similar piece of information stored in the Clang AST, which plays the
same role.
It also makes some minor adjustments to how the map type (the map bitshift
that dictates to the runtime how it should handle an argument) is
generated, to support more use cases in future patches that
build on this work.
Finally, it adds the creation of the map entry operation and ties it to the
relevant target operations, adds some new tests, and alters
existing tests to cover the new changes.
Depends on D158732
reviewers: kiranchandramohan, TIFitis, clementval, razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158734
For unstructured constructs, the blocks are created in advance inside the
function body. This causes issues when the unstructured construct is
inside an OpenACC region operation. This patch applies the same fix as the
OpenMP lowering and re-creates the blocks inside the op region.
Initial OpenMP fix: 29f167abcf7d871d17dd3f38f361916de1a12470
Since the OpenACC atomics specification is a subset of OpenMP atomics,
the same lowering implementation can be used. This change extracts out
the necessary pieces from the OpenMP lowering and puts them in a shared
spot. The shared spot is a header file so that each implementation can
template specialize directly.
After putting the OpenMP implementation in a common spot, the following
changes were needed to make it work for OpenACC:
* Ensure parsing works correctly by avoiding hardcoded offsets.
* Templatize based on atomic type.
* Whether the construct is OpenMP or OpenACC is determined by checking for
  OmpAtomicClauseList (OpenACC does not implement this, so we just
  templatize with void). This was preferable to checking the atomic
  type because in some cases, like atomic capture, the read/write/update
  implementations are called, and we want compile-time evaluation of
  these conditional parts.
* The memory order and hint are used only for OpenMP.
* Generate acc dialect operations instead of omp dialect operations.
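The compile-time dispatch described in the list above could look roughly like this. It is a simplified sketch under stated assumptions: the `OmpAtomicClauseList` struct here is a mock with string fields, `lowerAtomicRead` is a hypothetical helper, and the returned strings are placeholder text rather than real MLIR syntax.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Mock stand-in for the OpenMP clause list; the fields are assumptions.
struct OmpAtomicClauseList {
  std::string memoryOrder;
  std::string hint;
};

// Shared lowering templated on the clause list type. OpenACC instantiates it
// with void, so the OpenMP-only parts are discarded at compile time.
template <typename ClauseList>
std::string lowerAtomicRead(const std::string &from, const std::string &to,
                            const ClauseList *clauses = nullptr) {
  assert(!from.empty() && !to.empty());
  if constexpr (std::is_void_v<ClauseList>) {
    (void)clauses; // No clause list exists for OpenACC.
    return "acc.atomic.read " + to + " = " + from;
  } else {
    std::string op = "omp.atomic.read " + to + " = " + from;
    // Memory order and hint are only meaningful for OpenMP.
    if (clauses && !clauses->memoryOrder.empty())
      op += " memory_order(" + clauses->memoryOrder + ")";
    if (clauses && !clauses->hint.empty())
      op += " hint(" + clauses->hint + ")";
    return op;
  }
}
```

An OpenACC caller would write `lowerAtomicRead<void>("%x", "%v")`, while an OpenMP caller passes a pointer to its clause list and lets the template argument be deduced. Because the branch is selected with `if constexpr`, the OpenMP-only code is never instantiated for the `void` case.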
The cache directive is attached directly to the acc.loop operation when
the directive appears in the loop. When it appears before a loop, the
OpenACCCacheConstruct is saved and attached when the acc.loop is
created.
Directives that cannot be attached to a loop are silently discarded.
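The attach-now-or-save-for-later behavior can be sketched as below. This is an illustrative model only: `LoopOp`, `CacheTracker`, and their methods are hypothetical names, and cached variables are represented as strings instead of operands.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an acc.loop operation.
struct LoopOp {
  std::vector<std::string> cacheOperands;
};

class CacheTracker {
public:
  // A cache directive was seen. Attach it directly when it appears inside a
  // loop; otherwise save it until a loop is created.
  void onCacheDirective(const std::string &var, LoopOp *currentLoop) {
    assert(!var.empty() && "cache directive must name a variable");
    if (currentLoop)
      currentLoop->cacheOperands.push_back(var);
    else
      pending.push_back(var);
  }

  // A loop is being created: it takes every saved directive.
  LoopOp createLoop() {
    LoopOp loop;
    loop.cacheOperands = std::move(pending);
    pending.clear();
    return loop;
  }

  // No loop followed the saved directives: drop them silently.
  void discardPending() { pending.clear(); }

  std::size_t pendingCount() const { return pending.size(); }

private:
  std::vector<std::string> pending;
};
```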
Depends on #65521
The `cache` directive may appear at the top of (inside of) a loop. It
specifies array elements or subarrays that should be fetched into the
highest level of the cache for the body of the loop.
The `cache` directive is modeled as data entry operands attached to
the acc.loop operation.
Some compilers accept `!$acc data` without any clauses. For portability
reasons, this patch relaxes the strict error to a simple portability warning.
Reviewed By: razvanlupusoru, vzakhari
Differential Revision: https://reviews.llvm.org/D159019
This reverts commit 02fa9fc018db5b757a4ce129d85d64efefc8645c.
Commit message and content do not match. Revert to commit with
a proper commit message.
`getSymbolFromAccObject` was hitting a fatal error when
trying to retrieve the symbol of an array section.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158881
This patch propagates the acc routine information
to the module file so it can be used by the caller.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158541
Lower the acc declare directive in a function/subroutine
to the newly introduced acc.declare operation. Only a single
acc.declare operation is produced in a function or subroutine
so that the operations don't end up nested.
Depends on D158314
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158315
The routine directive can appear in the specification part of
a subroutine, function or module and can therefore appear before the
function or subroutine is lowered. We keep track of the created
routine info attributes and attach them to the function at the end
of the lowering if the directive appeared before the function was
lowered.
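A minimal model of this bookkeeping is sketched below. The names `FuncOp`, `RoutineInfoTracker`, `onRoutine`, `onFunctionLowered`, and `finalize` are hypothetical, and routine info is represented by plain strings rather than attributes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a lowered function carrying routine info.
struct FuncOp {
  std::vector<std::string> routineInfo;
};

class RoutineInfoTracker {
public:
  // A routine directive was lowered for `func`. Attach the info right away
  // if the function already exists; otherwise remember it for later.
  void onRoutine(const std::string &func, const std::string &routineSym) {
    assert(!func.empty() && !routineSym.empty());
    auto it = funcs.find(func);
    if (it != funcs.end())
      it->second.routineInfo.push_back(routineSym);
    else
      pending[func].push_back(routineSym);
  }

  void onFunctionLowered(const std::string &func) { funcs.try_emplace(func); }

  // End of lowering: attach everything that was recorded before its
  // function had been lowered.
  void finalize() {
    for (auto &[func, syms] : pending) {
      auto it = funcs.find(func);
      if (it == funcs.end())
        continue;
      it->second.routineInfo.insert(it->second.routineInfo.end(),
                                    syms.begin(), syms.end());
    }
    pending.clear();
  }

  const FuncOp *lookup(const std::string &func) const {
    auto it = funcs.find(func);
    return it == funcs.end() ? nullptr : &it->second;
  }

private:
  std::map<std::string, FuncOp> funcs;
  std::map<std::string, std::vector<std::string>> pending;
};
```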
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158204
This patch makes use of the HLFIR box produced for hlfir.declare
in place of the FIR box (the memref of hlfir.declare) when possible.
This makes the representation a little clearer, because
all accesses are made via a single box.
This reduces the live range of the original box, because the new
temporary box produced by embox/rebox is used from then on.
Apparently, this works around some issues in the current HLFIR codegen.
For example, see the LIT test changes around fir.array_coor
produced by hlfir.designate codegen: using the FIR box for fir.array_coor
might result in using incorrect lbounds.
Apparently, this change enables more intrinsics simplifications
because the SimplifyIntrinsicsPass looks for explicit embox/rebox
in findBoxDef() to decide whether to apply the optimization.
This change also provides better association of the base addresses
referenced by OpenACC clauses with the corresponding boxes
that might be used explicitly in OpenACC regions (e.g. for reading
the lbounds).
Reviewed By: razvanlupusoru, clementval
Differential Revision: https://reviews.llvm.org/D158119
Lower the bind clause to the corresponding attribute.
Depends on D158120
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158121
Not all declare clauses have an exit operation attached to them, and
for those clauses no dealloc function is generated. Attach
the pre/post deallocation attribute only for the clauses that have
an exit operation.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158106
Lowering did not generate the pre/post alloc/dealloc
functions for the acc declare variables. This patch adds the generation.
These functions take the descriptor as their only argument.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158103
This patch lowers a simple acc routine directive
with no clauses and no name inside a function/subroutine.
A patch to handle the name and clauses will follow.
A patch to add the attribute to the original routine will follow as well.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157919
This patch adds the acc.declare_action attributes on the
post allocate operation and the pre/post deallocate operations.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157915