Attempt to address the following example from causing an assert or ICE:
```
subroutine test(a)
implicit none
integer :: i
real(kind=real64), dimension(:) :: a
real(kind=real64), dimension(size(a, 1)) :: b
!$omp target map(tofrom: b)
do i = 1, 10
b(i) = i
end do
!$omp end target
end subroutine
```
Where we utilise a Fortran intrinsic (size) to calculate the size of
allocatable arrays and then map it to device.
This re-applies #117867 with a small fix that hopefully prevents build
bot failures. The fix is avoiding `dyn_cast` for the result of
`getOperation()`. Instead we can assign the result to `mlir::ModuleOp`
directly since the type of the operation is known statically (`OpT` in
`OperationPass`).
This is a starting PR to implicitly map allocatable record fields.
This PR contains the following changes:
1. Re-purposes some of the utils used in `Lower/OpenMP.cpp` so that
these utils work on the `mlir::Value` level rather than the
`semantics::Symbol` level. This takes one step towards to enabling
MLIR passes to more easily do some lowering themselves (e.g. creating
`omp.map.bounds` ops for implicitely caputured data like this PR
does).
2. Adds support for implicitely capturing and mapping allocatable fields
in record types.
There is quite some distant to still cover to have full support for
this. I added a number of todos to guide further development.
Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>
Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>
This PR is one of 3 in a PR stack, this is the primary change set which
seeks to extend the current derived type explicit member mapping support
to handle descriptor member mapping at arbitrary levels of nesting. The
PR stack seems to do this reasonably (from testing so far) but as you
can create quite complex mappings with derived types (in particular when
adding allocatable derived types or arrays of allocatable derived types)
I imagine there will be hiccups, which I am more than happy to address.
There will also be further extensions to this work to handle the
implicit auto-magical mapping of descriptor members in derived types and
a few other changes planned for the future (with some ideas on
optimizing things).
The changes in this PR primarily occur in the OpenMP lowering and the
OMPMapInfoFinalization pass.
In the OpenMP lowering several utility functions were added or extended
to support the generation of appropriate intermediate member mappings
which are currently required when the parent (or multiple parents) of a
mapped member are descriptor types. We need to map the entirety of these
types or do a "deep copy" for lack of a better term, where we map both
the base address and the descriptor as without the copying of both of
these we lack the information in the case of the descriptor to access
the member or attach the pointers data to the pointer and in the latter
case we require the base address to map the chunk of data. Currently we
do not segment descriptor based derived types as we do with regular
non-descriptor derived types, we effectively map their entirety in all
cases at the moment, I hope to address this at some point in the future
as it adds a fair bit of a performance penalty to having nestings of
allocatable derived types as an example. The process of mapping all
intermediate descriptor members in a members path only occurs if a
member has an allocatable or object parent in its symbol path or the
member itself is a member or allocatable. This occurs in the
createParentSymAndGenIntermediateMaps function, which will also generate
the appropriate address for the allocatable member within the derived
type to use as a the varPtr field of the map (for intermediate
allocatable maps and final allocatable mappings). In this case it's
necessary as we can't utilise the usual Fortran::lower functionality
such as gatherDataOperandAddrAndBounds without causing issues later in
the lowering due to extra allocas being spawned which seem to affect the
pointer attachment (at least this is my current assumption, it results
in memory access errors on the device due to incorrect map information
generation). This is similar to why we do not use the MLIR value
generated for this and utilise the original symbol provided when mapping
descriptor types external to derived types. Hopefully this can be
rectified in the future so this function can be simplified and more
closely aligned to the other type mappings. We also make use of
fir::CoordinateOp as opposed to the HLFIR version as the HLFIR version
doesn't support the appropriate lowering to FIR necessary at the moment,
we also cannot use a single CoordinateOp (similarly to a single GEP) as
when we index through a descriptor operation (BoxType) we encounter
issues later in the lowering, however in either case we need access to
intermediate descriptors so individual CoordinateOp's aid this
(although, being able to compress them into a smaller amount of
CoordinateOp's may simplify the IR and perhaps result in a better end
product, something to consider for the future).
The other large change area was in the OMPMapInfoFinalization pass,
where the pass had to be extended to support the expansion of box types
(or multiple nestings of box types) within derived types, or box type
derived types. This requires expanding each BoxType mapping from one
into two maps and then modifying all of the existing member indices of
the overarching parent mapping to account for the addition of these new
members alongside adjusting the existing member indices to support the
addition of these new maps which extend the original member indices (as
a base address of a box type is currently considered a member of the box
type at a position of 0 as when lowered to LLVM-IR it's a pointer
contained at this position in the descriptor type, however, this means
extending mapped children of this expanded descriptor type to
additionally incorporate the new member index in the correct location in
its own index list). I believe there is a reasonable amount of comments
that should aid in understanding this better, alongside the test
alterations for the pass.
A subset of the changes were also aimed at making some of the utilities
for packing and unpacking the DenseIntElementsAttr containing the member
indices shareable across the lowering and OMPMapInfoFinalization, this
required moving some functions to the Lower/Support/Utils.h header, and
transforming the lowering structure containing the member index data
into something more similar to the version used in
OMPMapInfoFinalization. There we also some other attempts at tidying
things up in relation to the member index data generation in the
lowering, some of which required creating a logical operator for the
OpenMP ID class so it can be utilised as a map key (it simply utilises
the symbol address for the moment as ordering isn't particularly
important).
Otherwise I have added a set of new tests encompassing some of the
mappings currently supported by this PR (unfortunately as you can have
arbitrary nestings of all shapes and types it's not very feasible to
cover them all).
This patch primarily updates the MapInfoFinalization pass to utilise the
BlockArgument interface. It also shuffles newly added arguments the
MapInfoFinalization passes to the end of the BlockArg/Relevant MapInfo
lists, instead of one prior to the owning descriptor type.
During this it was noted that the use_device_ptr/addr handling of target
data was a little bit too order dependent so I've attempted to make it
less so, as we cannot depend on argument ordering to be the same as
Fortran for any future frontends.
This patch updates the use_device_ptr and use_device_addr clauses to use
the mapInfoOps for lowering. This allows all the types that are handle
by the map clauses such as derived types to also be supported by the
use_device_clauses.
This is patch 1/2 in a series of patches.
Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com
This PR aims to unify the map argument generation behavior across both
the implicit capture (captured in a target region) and the explicit
capture (process map), currently the varPtr field of the MapInfo for the
same variable will be different depending on how it's captured. This PR
tries to align that across the generations of MapInfoOp in the OpenMP
lowering.
Currently, I have opted to utilise the rawInput (input memref to a HLFIR
DeclareInfoOp) as opposed to the addr field which includes more
information. The side affect of this is that we have to deal with
BoxTypes less often, which will result in simpler maps in these cases.
The negative side affect of this is that we don't have access to the
bounds information through the resulting value, however, I believe the
bounds information we require in our case is still appropriately stored
in the map bounds, and this seems to be the case from testing so far.
The other fix is for cases where we end up with a BoxType argument into
a function (certain assumed shape and sizes cases do this) that has no
fir.ref wrapping it. As we need the Box to be a reference type to
actually utilise the operation to access the base address stored inside
and create the correct mappings we currently generate an intermediate
allocation in these cases, and then store into it, and utilise this as
the map argument, as opposed to the original.
However, as we were not sharing the same intermediate allocation across
all of the maps for a variable, this resulted in errors in certain cases
when detatching/attatching the data e.g. via enter and exit. This PR
adjusts this for cases
Currently we only maintain tracking of all intermediate allocations for
the current function scope, as opposed to module. Primarily as the only
case I am aware of that this is required is in cases where we pass
certain types of arguments to functions (so I opted to minimize the
overhead of the pass for now). It could likely be extended to module
scope if required if we find other cases where it's applicable and
causing issues.