166 Commits

Author SHA1 Message Date
Tom Eccles
75e5643abf
[flang][OpenMP] share global variable initialization code (#138672)
Fixes #108136

In #108136 (the new testcase), flang was missing the length parameter
required for the variable length string when boxing the global variable.
The code that is initializing global variables for OpenMP did not
support types with length parameters.

Instead of duplicating this initialization logic in OpenMP, I decided to
use the exact same initialization as is used in the base language
because this will already be well tested and will be updated for any new
types. The difference for OpenMP is that the global variables will be
zero initialized instead of left undefined.

Previously `Fortran::lower::createGlobalInitialization` was used to
share a smaller amount of the logic with the base language lowering. I
think this bug has demonstrated that helper was too low level to be
helpful, and it was only used in OpenMP so I have made it static inside
of ConvertVariable.cpp.
2025-05-07 10:18:13 +01:00
Slava Zakharin
9aff19e7a3
[flang] Defined SafeTempArrayCopyAttrInterface for array repacking. (#134346)
This patch defines `fir::SafeTempArrayCopyAttrInterface` and the
corresponding OpenACC/OpenMP related attributes in the FIR dialect. The
actual implementations are just placeholders right now, and array
repacking becomes a no-op if `-fopenacc/-fopenmp` is used for the
compilation.
2025-04-10 18:41:54 -07:00
Slava Zakharin
3f6ae3f0a8
[flang] Added driver options for arrays repacking. (#134002)
Added options:
  * -f[no-]repack-arrays
  * -f[no-]stack-repack-arrays
  * -frepack-arrays-contiguity=whole/innermost
2025-04-03 10:43:28 -07:00
Slava Zakharin
2c91f10362
[flang] Fixed repacking for TARGET and INTENT(OUT) (#131972)
TARGET dummy arrays can be accessed indirectly, so it is unsafe
to repack them.
INTENT(OUT) dummy arrays that require finalization on entry
to their subroutine must be copied-in by `fir.pack_array`.

In addition, based on my testing results, I think it will be useful
to document that `LOC` and `IS_CONTIGUOUS` will have different values
for the repacked arrays. I still need to decide where to document
this, so just added a note in the design doc for the time being.
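
As a rough illustration of why repacking changes what `LOC` and `IS_CONTIGUOUS` report, the C++ sketch below models copy-in/copy-out for a strided section. `StridedView`, `packArray`, and `unpackArray` are hypothetical names used only here; they are not Flang APIs and do not claim to reproduce the exact semantics of the FIR repacking operations.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical non-contiguous view: every `stride`-th element of a buffer.
struct StridedView {
  double *base;
  std::size_t count;
  std::size_t stride; // in elements; 1 would mean contiguous
};

// Copy-in: gather the viewed elements into a contiguous temporary.
std::vector<double> packArray(const StridedView &v) {
  std::vector<double> tmp(v.count);
  for (std::size_t i = 0; i < v.count; ++i)
    tmp[i] = v.base[i * v.stride];
  return tmp;
}

// Copy-out: scatter the temporary back into the original storage.
void unpackArray(const std::vector<double> &tmp, const StridedView &v) {
  for (std::size_t i = 0; i < v.count; ++i)
    v.base[i * v.stride] = tmp[i];
}
```

In this model the temporary is contiguous and lives at a different address than the original section. Anything that reaches the original storage indirectly between the two calls would not see updates made to the temporary, which is the hazard described above for TARGET arrays.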
2025-03-19 17:12:32 -07:00
Slava Zakharin
fd0e20a64b
[flang] Generate fir.pack/unpack_array in Lowering. (#131704)
Basic generation of array repacking operations in Lowering.
2025-03-18 21:26:33 -07:00
Valentin Clement (バレンタイン クレメン)
4fde8c341f
[flang][cuda] Lower CUDA shared variable with cuf.shared_memory op (#131399)
Use the `cuf.shared_memory` operation instead of `cuf.alloc` for CUDA
shared variables. These variables do not need free operations.
2025-03-16 17:44:56 -07:00
jeanPerier
3ff3b29dd6
[flang] lower remaining cases of pointer assignments inside forall (#130772)
Implement handling of `NULL()` RHS, polymorphic pointers, as well as
lower bounds or bounds remapping in pointer assignment inside FORALL.

These cases turn out not to require updating hlfir.region_assign:
lowering can simply prepare the new descriptor for the LHS inside the
RHS region.

Looking more closely at the polymorphic cases, there is no need to call
the runtime: fir.rebox and fir.embox already handle the dynamic type
setting correctly.

After this patch, the last remaining TODO is the allocatable assignment
inside FORALL, which, like some cases here, is more likely an accidental
feature given FORALL was deprecated in F2003 at the same time that
allocatable components were added.
2025-03-14 10:51:46 +01:00
jeanPerier
356bf3fa2d
Reland " [flang] Rely on global initialization for simpler derived types" (#130290)
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.

Note: this relands #114002 with the fix for the LLVM timeout regressions that have been seen. The fix is to use the added fir.copy to avoid aggregate load/store.

Co-authored-by: NimishMishra <42909663+NimishMishra@users.noreply.github.com>
2025-03-11 15:19:43 +01:00
Leandro Lupori
29f5d5bea9
[flang][OpenMP] Fix privatization of procedure pointers (#130336)
Fixes #121720
2025-03-11 09:38:40 -03:00
Tom Eccles
d31a7dde48
Revert " [flang] Rely on global initialization for simpler derived types" (#130278)
Reverts llvm/llvm-project#114002

This causes a regression building cam4_r from spec2017
2025-03-07 13:59:29 +00:00
Valentin Clement (バレンタイン クレメン)
2130285564
[flang][cuda] Make sure allocator id is set for pointer allocate (#129950)
2025-03-05 17:29:09 -08:00
NimishMishra
0ae1f0a310
[flang] Rely on global initialization for simpler derived types (#114002)
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.
2025-03-05 05:44:51 -08:00
Valentin Clement (バレンタイン クレメン)
f3000d7d27
[flang][cuda] Do not trigger automatic deallocation in main (#128789)
Similar to the host flow, do not trigger automatic deallocation at the
end of the main program, since anything could happen, such as a call to
cudaDeviceReset().
2025-02-25 17:25:04 -08:00
jeanPerier
22d9726593
[flang] do not finalize or initialize unused entry dummy (#125482)
Dummy arguments from other ENTRY statements that are not live in the current entry have no backing storage, and user code referring to them must never be reached. The compiler was generating initialization/destruction code for them when they were INTENT(OUT), causing undefined behavior.
2025-02-03 18:09:01 +01:00
Kiran Chandramohan
ce32625966
Reland "[Flang][Driver] Add a flag to control zero initialization" (#123606)
Reverts llvm/llvm-project#123330
2025-01-21 07:57:44 +00:00
Kiran Chandramohan
8a229f595a
Revert "Revert "Revert "[Flang][Driver] Add a flag to control zero initializa…" (#123330)
Reverts llvm/llvm-project#123097

Reverting due to buildbot failure
https://lab.llvm.org/buildbot/#/builders/89/builds/14577.
2025-01-17 12:27:58 +00:00
Kiran Chandramohan
8c63648117
Revert "Revert "[Flang][Driver] Add a flag to control zero initializa… (#123097)
…tion of global v…" (#123067)"

This reverts commit 44ba43aa2b740878d83a9d6f1d52a333c0d48c22.

Adds the flag to bbc as well.
2025-01-17 12:14:20 +00:00
Kiran Chandramohan
44ba43aa2b
Revert "[Flang][Driver] Add a flag to control zero initialization of global v…" (#123067)
Reverts llvm/llvm-project#122144

Reverting due to CI failure
https://lab.llvm.org/buildbot/#/builders/89/builds/14422
2025-01-15 15:23:34 +00:00
Kiran Chandramohan
c593e3d0f7
[Flang][Driver] Add a flag to control zero initialization of global v… (#122144)
…ariables

Patch adds a flag to control zero initialization of global variables
without default initialization. The default is to zero initialize.
2025-01-15 15:06:57 +00:00
Leandro Lupori
1fcb6a9754
[flang][OpenMP] Initialize allocatable members of derived types (#120295)
Allocatable members of privatized derived types must be allocated,
with the same bounds as the original object, whenever that member
is also allocated in it, but Flang was not performing such
initialization.

The `Initialize` runtime function can't perform this task unless
its signature is changed to receive an additional parameter, the
original object, that is needed to find out which allocatable
members, with their bounds, must also be allocated in the clone.
As `Initialize` is used not only for privatization, sometimes this
other object won't even exist, so this new parameter would need
to be optional.
Because of this, it seemed better to add a new runtime function:
`InitializeClone`.
To avoid unnecessary calls, lowering inserts a call to it only for
privatized items that are derived types with allocatable members.
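
The behavior described above can be sketched in C++ as follows. This is only a model under stated assumptions: `Member` and `initializeCloneSketch` are hypothetical names, not the runtime's `InitializeClone` signature, and whether element values are also copied (which depends on the privatization clause) is not modeled.

```cpp
#include <vector>

// Hypothetical stand-in for one allocatable member of a derived type.
struct Member {
  bool allocated = false;
  long lowerBound = 1;
  std::vector<int> data;
};

// Allocate the clone's member with the same bounds as the original's,
// but only when the original member is itself allocated.
void initializeCloneSketch(Member &clone, const Member &orig) {
  if (!orig.allocated)
    return; // the clone's member stays unallocated
  clone.allocated = true;
  clone.lowerBound = orig.lowerBound;
  clone.data.assign(orig.data.size(), 0); // same extent, contents not copied
}
```

The second parameter is what a plain initialization routine lacks: without the original object there is no way to know which members are allocated or what their bounds are.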

Fixes https://github.com/llvm/llvm-project/issues/114888
Fixes https://github.com/llvm/llvm-project/issues/114889
2024-12-19 17:26:50 -03:00
Michael Kruse
c91ba04328
[Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188)
Split some headers into headers for public and private declarations in
preparation for #110217. Moving the runtime-private headers in
runtime-private include directory will occur in #110298.

* Do not use `sizeof(Descriptor)` in the compiler. The size of the
descriptor is target-dependent while `sizeof(Descriptor)` is the size of
the Descriptor for the host platform which might be too small when
cross-compiling to a different platform. Another problem is that the
emitted assembly ((cross-)compiling to the same target) is not identical
between Flang instances running on different systems. Moving the declaration of
`class Descriptor` out of the included header will also reduce the
amount of #included sources.

* Do not use `sizeof(ArrayConstructorVector)` and
`alignof(ArrayConstructorVector)` in the compiler. Same reason as with
`Descriptor`.

* Compute the descriptor's extra flags without instantiating a
Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime
source, but not the compiler source.

* Move `InquiryKeywordHashDecode` into runtime-private header. The
function is defined in the runtime sources and trying to call it in the
compiler would lead to a link-error.

* Move allocator-kind magic numbers into common header. They are the
only declarations out of `allocator-registry.h` in the compiler as well.
 
This does not make Flang cross-compile ready yet, the main goal is to
avoid transitive header dependencies from Flang to clang-rt. There are
more assumptions that host platform is the same as the target platform.
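
To illustrate the first bullet, the C++ sketch below computes a descriptor size from properties of the target instead of using `sizeof` on a host type. The layout is an assumption made up for this example (one base address, one length, and three pointer-sized values per dimension), and `TargetInfo` and `descriptorBytes` are hypothetical names; this is not Flang's actual descriptor layout or API.

```cpp
#include <cstddef>

// Hypothetical description of the compilation target; not a Flang type.
struct TargetInfo {
  std::size_t pointerBytes; // e.g. 4 on a 32-bit target, 8 on a 64-bit one
};

// Size of the assumed layout: base address + length, then
// (lower bound, extent, stride) for each dimension, all pointer-sized.
std::size_t descriptorBytes(const TargetInfo &target, std::size_t rank) {
  return target.pointerBytes * (2 + 3 * rank);
}
```

With this approach the result depends only on the target description, so a compiler hosted on a 32-bit system and one hosted on a 64-bit system would compute the same size for the same target.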
2024-12-06 15:29:00 +01:00
jeanPerier
cf602b95d1
[flang] handle fir.call in AliasAnalysis::getModRef (#117164)
fir.call side effects are hard to describe in a useful way using
`MemoryEffectOpInterface` because it is impossible to list which memory
locations a user procedure reads or writes without doing a data flow
analysis of its body (even PURE procedures may read from any module
variable; Fortran SIMPLE procedures from F2023 will allow that, but they
are far from common at this point).

Fortran language specifications allow the compiler to deduce
that a procedure call cannot access a variable in many cases.
This patch leverages this to extend `fir::AliasAnalysis::getModRef` to
deal with fir.call.

This will allow implementing "array = array_function()" optimization in
a future patch.
2024-11-26 11:17:33 +01:00
jeanPerier
bb8bf858e8
[flang] add internal_assoc flag to mark variable captured in internal procedure (#117161)
This patch adds a flag to mark hlfir.declare of host variables that are
captured in some internal procedure.

It enables implementing a simple fir.call handling in
fir::AliasAnalysis::getModRef leveraging Fortran language specifications
and without a data flow analysis.

This will allow implementing an optimization for "array =
array_function()" where array storage is passed directly into the hidden
result argument to "array_function" when it can be proven that
array_function does not reference "array".

Captured host variables are very tricky because they may be accessed
indirectly in any calls if the internal procedure address was captured
via some global procedure pointer. Without flagging them, there is no
way around doing a complex interprocedural data flow analysis:
- checking that the call is not made to an internal procedure is not
enough because of the possibility of indirect calls made to internal
procedures inside the callee.
- checking that the current func.func has no internal procedure is not
enough because this would be invalid with inlining when a procedure
with internal procedures is inlined into a procedure without internal
procedures.
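
The role of the flag can be pictured with a small C++ sketch of a conservative mod/ref query for a call. Only the `internalAssoc` condition comes from the description above; the other fields of `VarInfo` and the function `callModRefSketch` are assumptions invented for illustration and do not mirror the actual `fir::AliasAnalysis::getModRef` logic.

```cpp
enum class ModRef { NoModRef, ModAndRef };

// Hypothetical facts about a variable at a call site; not FIR attributes.
struct VarInfo {
  bool isLocalToCaller;    // assumption: not a dummy, module, or COMMON variable
  bool passedToCall;       // assumption: appears among the call's arguments
  bool hasTargetOrPointer; // assumption: may be reached through a pointer
  bool internalAssoc;      // captured by some internal procedure
};

// Answer "may this call read or write the variable?" without looking
// at the callee's body. Any doubt yields the conservative answer.
ModRef callModRefSketch(const VarInfo &v) {
  if (!v.isLocalToCaller || v.passedToCall || v.hasTargetOrPointer ||
      v.internalAssoc)
    return ModRef::ModAndRef;
  return ModRef::NoModRef;
}
```

Because the flag is recorded on the declaration itself, a check of this kind stays valid after inlining, which is the concern raised in the second bullet.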
2024-11-26 09:21:13 +01:00
Scott Manley
e6a4346b5a
[flang] add getElementType() to fir::SquenceType and fir::VectorType (#112770)
getElementType() was missing from Sequence and Vector types. Did a
replace of the obvious places getEleTy() was used for these two types
and updated to use this name instead.

Co-authored-by: Scott Manley <scmanley@nvidia.com>
2024-10-18 09:29:25 +02:00
jeanPerier
ccca3c6371
[flang] enable assumed-rank lowering by default (#110893)
Aside from a minor TODO about polymorphic RANK(*) (2b8e81ce91/flang/lib/Lower/Bridge.cpp (L3459)),
the implementation for assumed-rank is ready for everyone to use.
2024-10-04 09:04:56 +02:00
jeanPerier
c4204c0b29
[flang] replace fir.complex usages with mlir complex (#110850)
Core patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292.
After that, the last step is to remove fir.complex from FIR types.
2024-10-03 17:10:57 +02:00
Valentin Clement (バレンタイン クレメン)
0dcd68c28a
[flang][cuda] Set allocator index for module allocatable variable (#106777)
The descriptor for a module variable with a CUDA attribute must be set
with the correct allocator index. This patch updates the embox operation
used in the global to carry the allocator index.
2024-08-30 14:55:09 -07:00
Valentin Clement (バレンタイン クレメン)
5bb379f6f0
[flang][cuda] Fix allocation of descriptor for cray pointer (#103474)
The Cray pointee descriptor with the device attribute was not allocated
with cuf.alloc, which led to an error on deallocation with cuf.free.
2024-08-13 16:52:00 -07:00
Valentin Clement (バレンタイン クレメン)
388b63243c
[flang][cuda] Defined allocator for unified data (#102189)
CUDA unified variables were set to use the same allocator as managed
variables. This patch adds a specific allocator for the unified
variables. Currently it calls the managed allocator underneath, but
we want to have the flexibility to change that in the future.
2024-08-06 14:30:31 -07:00
Valentin Clement (バレンタイン クレメン)
bbdb1e400f
[flang][cuda] Set the allocator on fir.embox operation (#101722)
This patch sets the `allocator_idx` attribute for allocatable descriptors
that have a specific CUDA attribute.
2024-08-02 14:00:26 -07:00
Tom Eccles
98e733eaf2
[flang][OpenMP] Initialize privatised derived type variables (#100417)
Fixes #91928
2024-07-25 16:53:27 +01:00
Valentin Clement (バレンタイン クレメン)
33cb29cc3e
[flang][cuda] Use cuf.alloc/cuf.free for local descriptor (#98518)
Local descriptors for CUDA allocatables need to be handled on both host
and device. One solution is to duplicate the descriptor (one on the host
and one on the device) and keep them in sync; another is to place the
descriptor in managed/unified memory so we don't need to take care of
any sync. The second solution is probably the one we will implement. In
order to have more flexibility in how descriptors representing CUDA
allocatables are allocated, this patch updates the lowering to use the
cuf operations alloc and free to manage them.
2024-07-17 13:52:36 -07:00
jeanPerier
8f90258a51
[flang] implement assumed-rank in ENTRY (#96111)
With the `createUnallocatedBox` utility change from #96106, the TODO for
assumed-rank in ENTRY can simply be lifted, and a test is added.

The key is that an unallocated assumed-rank descriptor is created with
rank zero in the entry where an assumed-rank dummy from some other entry
does not appear as a dummy (the symbol must still be mapped to some valid
value because the symbol could be used in code that would be unreachable
at runtime, but that the compiler must still generate).
2024-06-20 15:11:09 +02:00
Valentin Clement (バレンタイン クレメン)
3a47d948ba
[flang][cuda] Propagate data attribute to global with initialization (#95504)
Globals with an initial value were missing the CUDA data attribute.
2024-06-14 10:49:18 -07:00
Valentin Clement (バレンタイン クレメン)
c1654c38e8
[flang] Carry over alignment computed by frontend for COMMON (#94280)
The frontend computes the necessary alignment for COMMON blocks, but this
information is never carried over to code generation, which can lead to
a segfault for a COMMON block that requires a non-default alignment.

This patch adds an optional attribute on fir.global and carries over the
information.
2024-06-04 11:15:31 -07:00
jeanPerier
74faa402cc
[flang] lower allocatable assumed-rank specification parts (#93682)
Lower allocatable and pointers specification parts. Nothing special is
required to allocate the descriptor given they are required to be dummy
arguments, however, care must be taken with INTENT(OUT) to use the
runtime to deallocate them (inlined fir.embox + store is not possible).
2024-05-30 09:31:18 +02:00
jeanPerier
5aba0ded6c
[flang] lower assumed-rank variables specification expressions (#93477)
Enable lowering of assumed-ranks in specification parts under a debug
flag. I am using a debug flag because many cryptic TODOs/issues may be
hit until more support is added. The development should not take too
long, so I want to stay away from the noise of adding an actual
experimental flag to flang-new.
2024-05-29 10:18:22 +02:00
Valentin Clement (バレンタイン クレメン)
702198fc9a
[flang][cuda] Add data attribute to program globals (#92610)
2024-05-17 20:56:10 -07:00
Valentin Clement (バレンタイン クレメン)
45daa4fdc6
[flang][cuda] Move CUDA Fortran operations to a CUF dialect (#92317)
The number of operations dedicated to CUF grew, and they were all still
in FIR. For better organization, the CUF operations, attributes and code
are moved into their own dialect and files. The CUF dialect is tightly
coupled with HLFIR/FIR and their types.

The CUF attributes are bundled into their own library since some
HLFIR/FIR operations depend on them and the CUF dialect depends on the
FIR types. Without having the attributes in a separate library there
would be a dependency cycle.
2024-05-17 09:37:53 -07:00
Slava Zakharin
1710c8cf0f
[flang] Lowering changes for assigning dummy_scope to hlfir.declare. (#90989)
The lowering produces fir.dummy_scope operation if the current
function has dummy arguments. Each hlfir.declare generated
for a dummy argument is then using the result of fir.dummy_scope
as its dummy_scope operand. This is only done for HLFIR.

I was not able to find a reliable way to identify dummy symbols
in `genDeclareSymbol`, so I added a set of registered dummy symbols
that is alive during the variables instantiation for the current
function. The set is initialized during the mapping of the dummy
argument symbols to their MLIR values. It is reset right after
all variables are instantiated - this is done to avoid generating
hlfir.declare operations with dummy_scope for the clones of
the dummy symbols (e.g. this happens with OpenMP privatization).
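
The lifetime of that set can be sketched in C++ as below. `DummyRegistry` and its methods are hypothetical names for illustration, with symbols modeled as plain strings; the real lowering code works on semantic symbols and is not reproduced here.

```cpp
#include <set>
#include <string>

// Hypothetical registry of the current function's dummy symbols.
class DummyRegistry {
public:
  // Called while dummy argument symbols are mapped to their values.
  void registerDummy(const std::string &name) { dummies.insert(name); }

  // Queried when a declare is generated, to decide whether it should
  // receive the dummy_scope operand.
  bool isRegisteredDummy(const std::string &name) const {
    return dummies.count(name) != 0;
  }

  // Called right after all variables of the function are instantiated.
  void reset() { dummies.clear(); }

private:
  std::set<std::string> dummies;
};
```

After `reset()`, a later declare for a clone of the same symbol is no longer treated as a dummy, which matches the intent stated for privatized copies.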

If this can be done in a cleaner way, please advise.
2024-05-08 16:48:14 -07:00
Valentin Clement (バレンタイン クレメン)
26060de063
[flang][cuda] Lower device/managed/unified allocation to cuda ops (#90623)
Lower locals allocation of cuda device, managed and unified variables to
fir.cuda_alloc. Add fir.cuda_free in the function context finalization.

@vzakhari For some reason the PR #90526 has been closed when I merged PR
#90525. Just reopening one.
2024-05-02 14:32:53 -07:00
Christian Sigg
bd9fdce69b
[flang] Use isa/dyn_cast/cast/... free functions. (#90432)
The corresponding member functions are deprecated.
2024-04-29 09:16:22 +02:00
Christian Sigg
fac349a169
Reapply "[mlir] Mark isa/dyn_cast/cast/... member functions depreca… (#90406)
…ted. (#89998)" (#90250)

This partially reverts commit 7aedd7dc754c74a49fe84ed2640e269c25414087.

This change removes calls to the deprecated member functions. It does
not mark the functions deprecated yet and does not disable the
deprecation warning in TypeSwitch. This seems to cause problems with
MSVC.
2024-04-28 22:01:42 +02:00
dyung
7aedd7dc75
Revert "[mlir] Mark isa/dyn_cast/cast/... member functions deprecated. (#89998)" (#90250)
This reverts commit 950b7ce0b88318f9099e9a7c9817d224ebdc6337.

This change is causing build failures on a bot
https://lab.llvm.org/buildbot/#/builders/216/builds/38157
2024-04-26 12:09:13 -07:00
Christian Sigg
950b7ce0b8
[mlir] Mark isa/dyn_cast/cast/... member functions deprecated. (#89998)
See https://mlir.llvm.org/deprecation and
https://discourse.llvm.org/t/preferred-casting-style-going-forward.
2024-04-26 16:28:30 +02:00
Valentin Clement (バレンタイン クレメン)
7c0da7993e
[flang][cuda] Use fir.cuda_deallocate for automatic deallocation (#89662)
Automatic deallocation of allocatables that are CUDA device variables
must use the fir.cuda_deallocate operation. This patch updates the
automatic deallocation code generation to use this operation when the
variable is a CUDA variable.

This patch also has the side effect of correctly calling
`attachDeclarePostDeallocAction` for OpenACC declare variables on
automatic deallocation. Update the code in
`attachDeclarePostDeallocAction` so we do not attach on fir.result but
on the correct last op.
2024-04-24 08:43:54 -07:00
jeanPerier
008b7f1dfd
[flang] implement capture of procedure pointers in internal procedures (#89619)
2024-04-24 09:21:56 +02:00
Valentin Clement
f35e1931be
Revert "[flang][cuda] Use fir.cuda_deallocate for automatic deallocation (#89450)"
This reverts commit 2a632d3d9f5c70db38c617b0816deb37ef722a7b.

This has some implications for the OpenACC postDeallocate action.
2024-04-19 17:25:47 -07:00
Valentin Clement (バレンタイン クレメン)
2a632d3d9f
[flang][cuda] Use fir.cuda_deallocate for automatic deallocation (#89450)
Automatic deallocation of allocatables that are CUDA device variables
must use the fir.cuda_deallocate operation. This patch updates the
automatic deallocation code generation to use this operation when the
variable is a CUDA variable.
2024-04-19 14:49:56 -07:00
jeanPerier
ad4e1aba3f
[flang] Pass VALUE CHARACTER arg by register in BIND(C) calls (#87774)
Fortran mandates "CHARACTER(1), VALUE" be passed as a C "char" in calls
to BIND(C) procedures (F'2023 18.3.7 (4)). Lowering passed them by
memory instead. Update call interface lowering code to pass them by
register. Fix related test and update it to use HLFIR.
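
For reference, this is what the C side of such a procedure looks like. The function `char_code` and the Fortran interface shown in the comment are hypothetical examples written for this note, not code from the patch or its tests.

```cpp
// C-linkage callee for a hypothetical BIND(C) interface such as:
//
//   interface
//     integer(c_int) function char_code(c) bind(c, name="char_code")
//       use iso_c_binding, only: c_char, c_int
//       character(kind=c_char), value :: c
//     end function
//   end interface
//
// The argument is a plain C `char` passed by value, not a pointer to
// character storage.
extern "C" int char_code(char c) {
  return static_cast<int>(c);
}
```

A caller that passed the character by memory would hand this function an address where it expects the character itself, which is why the call interface lowering has to match the C convention.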
2024-04-12 10:29:01 +02:00