199 Commits

Author SHA1 Message Date
Zhen Wang
51937fc996
Revert "[flang][OpenMP] Use cuf.alloc for privatization of CUDA Fortran device arrays (#185984)" (#186891)

This reverts commit fb18d570b0466ca2a401aba11d6e58b206aebc1a.

This PR caused compilation failures with allocatable arrays; reverting
for now pending further investigation.
2026-03-16 21:51:07 +00:00
Zhen Wang
fb18d570b0
[flang][OpenMP] Use cuf.alloc for privatization of CUDA Fortran device arrays (#185984)
When CUDA Fortran device arrays are listed in an OpenMP private clause,
the compiler previously allocated private copies on the host heap using
fir.allocmem. This caused device-side operations to receive host
pointers instead of device pointers, leading to cudaErrorIllegalAddress
(700).

Fix by detecting symbols with a CUDA data attribute (device, managed,
unified, etc.) during privatization and using cuf.alloc / cuf.free
instead of fir.allocmem / fir.freemem, so the private copies reside in
device memory.
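
The selection described above can be sketched in standalone C++. This is only an illustration under stated assumptions: `CudaDataAttr`, `PrivateSymbol`, `AllocOps`, and `selectPrivateAllocOps` are hypothetical names invented here, not flang's actual types or functions, and the operation names are carried as plain strings.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a symbol's CUDA data attribute (not flang's type).
enum class CudaDataAttr { Device, Managed, Unified };

// Hypothetical stand-in for the symbol being privatized.
struct PrivateSymbol {
  std::string name;
  std::optional<CudaDataAttr> cudaAttr; // empty for ordinary host variables
};

// The pair of operations used to create and release the private copy.
struct AllocOps {
  std::string alloc;
  std::string free;
};

// Pick device-side allocation when the symbol carries any CUDA data
// attribute; otherwise keep the host heap allocation.
AllocOps selectPrivateAllocOps(const PrivateSymbol &sym) {
  if (sym.cudaAttr.has_value())
    return {"cuf.alloc", "cuf.free"};
  return {"fir.allocmem", "fir.freemem"};
}
```

The point of the sketch is that the choice is made once, per symbol, at privatization time, so the allocation and the matching deallocation always come from the same family.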
2026-03-13 16:18:14 +00:00
Tom Eccles
54b4bd510a
[flang][AliasAnalysis] Cray pointers/pointees might alias with anything (#170900)
The LOC intrinsic allows a Cray pointer to alias with ordinary variables
that have no other attribute. See the new test for an example.

This is not enabled by default. The functionality can be used with
`-mmlir -funsafe-cray-pointers`.

First part of the un-revert of #169544. That will handle TBAA.
2025-12-10 15:32:27 +00:00
Abid Qadeer
cfc56c982f
[flang][debug] Track dummy argument positions explicitly. (#167489)
CHARACTER dummy arguments were treated as local variables in debug info.
This happened because our method to get the argument number was not
robust. It relied on `DeclareOp` having a direct reference to arguments
which was not the case for character arguments. This is fixed by storing
source-level argument positions in `DeclareOp`.

Fixes #112886
2025-11-12 10:21:32 +00:00
jeanPerier
c9fb37c75f
[flang][FIR] add fir.assumed_size_extent to abstract assumed-size extent encoding (#164452)
This patch allows converting the FIR array representation to memref,
when possible, without hitting memref verifier issues.

The issue was that FIR arrays may be assumed size, in which case the
last dimension will not be known at runtime. Flang uses -1 to encode
this to fulfill Fortran 2023 standard requirements in 18.5.3 point 5
about CFI_cdesc_t.

When arrays are converted to memref, if this `-1` reaches memref
operations, it triggers verifier errors. This happens even if the
conversion is in code guarded so that it is not entered at runtime when
the array is assumed-size, because folders/verifiers do not take
reachability into account.

This follows up on the discussion in #163505.
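
As a rough illustration of why hiding the raw `-1` behind a named abstraction helps, here is a minimal standalone C++ sketch. It does not use FIR or MLIR; `kAssumedSizeExtent` and `elementCount` are hypothetical names introduced only for this example. Consumers ask a helper instead of multiplying extents directly, so the sentinel never leaks into arithmetic that expects non-negative sizes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Sentinel described in the commit message: -1 marks the unknown last extent
// of an assumed-size array. The constant name is hypothetical.
constexpr std::int64_t kAssumedSizeExtent = -1;

// Hypothetical helper: returns the element count, or nullopt when any extent
// is the assumed-size sentinel and the count is therefore unknown.
std::optional<std::int64_t>
elementCount(const std::vector<std::int64_t> &extents) {
  std::int64_t count = 1;
  for (std::int64_t extent : extents) {
    if (extent == kAssumedSizeExtent)
      return std::nullopt; // never let -1 take part in size arithmetic
    count *= extent;
  }
  return count;
}
```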
2025-10-22 11:46:18 +02:00
Eugene Epshteyn
832a342328
[flang] CDEFINED globals should have external linkage (#160167)
In Fortran::lower::defineGlobal(), don't change the linkage and don't
generate an initializer for CDEFINED globals.
2025-09-25 10:26:45 -04:00
Carlos Seo
170c0c5225
[Flang] Handle unused entry dummies before processing shape (#157732)
Check for unused entry dummy arrays with BaseBoxType, and call
genUnusedEntryPointBox() for them, before processing array shapes.

Fixes #132648
2025-09-17 10:52:33 -03:00
Valentin Clement (バレンタイン クレメン)
9fdf2c7105
[flang][cuda] Call runtime initialize for derived type with device components (#157914)
2025-09-10 18:19:56 +00:00
Valentin Clement (バレンタイン クレメン)
3e9802159b
[flang][cuda] Remove set_allocator_idx operation (#157747)
The allocator index is now set from the component genre (#157731), so an
operation to set it at a later point is no longer needed.
2025-09-09 14:20:19 -07:00
Leandro Lupori
3dc314b851
[flang] Fix lowering of unused dummy procedure pointers (#155649)
Fixes #126453
2025-09-08 08:39:07 -03:00
Slava Zakharin
83da8d08ff
[flang] Attach proper storage to [hl]fir.declare in lowering. (#155742)
As described in
https://discourse.llvm.org/t/rfc-flang-representation-for-objects-inside-physical-storage/88026,
`[hl]fir.declare` should carry information about the layout
of COMMON/EQUIVALENCE variables within the physical storage.

This patch modifies Flang lowering to attach this information.
2025-09-04 15:49:11 -07:00
Eugene Epshteyn
4b6a4aa522
[flang] Consolidate copy-in/copy-out determination in evaluate framework (#151408)
New implementation of `MayNeedCopy()` is used to consolidate
copy-in/copy-out checks.

`IsAssumedShape()` and `IsAssumedRank()` were simplified and are both
now in the `Fortran::semantics` namespace.

`preparePresentUserCallActualArgument()` in lowering was modified to use
`MayNeedCopyInOut()`.

Fixes https://github.com/llvm/llvm-project/issues/138471
2025-08-26 18:40:13 -04:00
Valentin Clement (バレンタイン クレメン)
eb0ddba26b
Reland "[flang][cuda] Set the allocator of derived type component after allocation" (#152418)
Reviewed in #152379
- Move the allocator index setup after the allocate statement; otherwise
the derived type descriptor is not yet allocated.
- Support arrays of derived type with device components
2025-08-06 21:49:55 -07:00
Valentin Clement (バレンタイン クレメン)
7d3134f6cc
Revert "[flang][cuda] Set the allocator of derived type component after allocation" (#152402)
Reverts llvm/llvm-project#152379

Buildbot failure
https://lab.llvm.org/buildbot/#/builders/207/builds/4905
2025-08-06 15:55:53 -07:00
Valentin Clement (バレンタイン クレメン)
d897355876
[flang][cuda] Set the allocator of derived type component after allocation (#152379)
- Move the allocator index setup after the allocate statement; otherwise
the derived type descriptor is not yet allocated.
- Support arrays of derived type with device components
2025-08-06 15:14:00 -07:00
Valentin Clement (バレンタイン クレメン)
9b195dc3ef
[flang][cuda] Generate cuf.allocate for descriptor with CUDA components (#152041)
The descriptors for derived types with CUDA components are allocated in
managed memory. The lowering was calling the standard runtime on the
allocate statement, where it should generate a `cuf.allocate` operation.
2025-08-04 16:51:11 -07:00
Valentin Clement (バレンタイン クレメン)
05b52ef909
[flang][cuda][NFC] Update to the new create APIs (#152050)
Some operation creations were updated in the flang directory, but not all.
Migrate the CUF ops to the new create APIs introduced in #147168.
2025-08-04 16:09:24 -07:00
Maksim Levental
a3a007ad5f
[mlir][NFC] update flang/Lower create APIs (8/n) (#149912)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-21 19:54:29 -04:00
Valentin Clement (バレンタイン クレメン)
6c63316ee1
[flang][cuda] Support device component in a pointer or allocatable derived-type (#149418)
2025-07-18 04:17:15 -07:00
Kazu Hirata
2a7328daca
[flang] Migrate away from ArrayRef(std::nullopt_t) (#149337)
ArrayRef(std::nullopt_t) has been deprecated.  This patch replaces
std::nullopt with {}.

A subsequent patch will address those places where we need to replace
std::nullopt with mlir::TypeRange{} or mlir::ValueRange{}.
2025-07-17 15:23:55 -07:00
Valentin Clement (バレンタイン クレメン)
8349bbd0b9
[flang][cuda] Exit early when there is no device components (#149005)
- Exit early when there are no device components
- Make the retrieval of the record type more robust
2025-07-16 10:08:11 -07:00
Valentin Clement (バレンタイン クレメン)
b4e2272271
[flang][cuda] Move cuf.set_allocator_idx after derived-type init (#148936)
Derived type initialization overwrites the component descriptor. Place
the `cuf.set_allocator_idx` after the initialization is performed.
2025-07-15 13:52:00 -07:00
Kazu Hirata
769bd90f8b
[flang] Fix a warning
This patch fixes:

  flang/lib/Lower/ConvertVariable.cpp:787:22: error: unused variable
  'fieldTy' [-Werror,-Wunused-variable]
2025-07-14 22:17:12 -07:00
Valentin Clement (バレンタイン クレメン)
90ef114a33
[flang][cuda] Add cuf.set_allocator_idx for device component (#148750)
2025-07-14 19:31:44 -07:00
Valentin Clement (バレンタイン クレメン)
659c8102f4
Reland [flang][cuda] Allocate derived-type with CUDA component in managed memory (#147416)
2025-07-07 17:40:04 -07:00
Valentin Clement
a5350785db
Revert "[flang][cuda] Allocate derived-type with CUDA component in managed memory (#146797)"
This reverts commit 925588cd001a91d592b99e6e7c6bee9514f5a26e.
2025-07-02 17:51:45 -07:00
Valentin Clement (バレンタイン クレメン)
925588cd00
[flang][cuda] Allocate derived-type with CUDA component in managed memory (#146797)
Similarly to descriptors for device data, put derived types holding device
descriptors in managed memory.
2025-07-02 16:02:08 -07:00
jeanPerier
faefe7cf7d
[flang] add option to generate runtime type info as external (#146071)
Reland #145901 with a fix for shared library builds.

So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.

Note that objects compiled with this option are not compatible with
object files compiled without it, because files compiled without it may
drop the rtti for types they define if it is not used in the compilation
unit, because of the linkonce_odr aspect.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file objects anyway).
2025-06-30 09:58:00 +02:00
jeanPerier
37e2d10499
Revert "[flang] add option to generate runtime type info as external" (#146064)
Reverts llvm/llvm-project#145901

Broke shared library builds because of the usage of
`skipExternalRttiDefinition` in Lowering.
2025-06-27 14:05:59 +02:00
jeanPerier
e816817bbb
[flang] add option to generate runtime type info as external (#145901)
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.

Note that objects compiled with this option are not compatible with
object files compiled without it, because files compiled without it may
drop the rtti for types they define if it is not used in the compilation
unit, because of the linkonce_odr aspect.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file objects anyway).
2025-06-27 13:00:29 +02:00
Kazu Hirata
938cdb30f1
[flang] Migrate away from std::nullopt (NFC) (#145928)
ArrayRef has a constructor that accepts std::nullopt.  This
constructor dates back to the days when we still had llvm::Optional.

Since the use of std::nullopt outside the context of std::optional is
something of an abuse and not intuitive to newcomers, I would like to
move away from the constructor and eventually remove it.

This patch replaces std::nullopt with {}.  There are a couple of
places where std::nullopt is replaced with TypeRange() to accommodate
perfect forwarding.
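
The distinction between the two replacements can be shown with a small standalone C++ sketch. `ViewLike`, `countOperands`, and `forwardToCount` are hypothetical stand-ins written for this example; they are not `llvm::ArrayRef`, `mlir::TypeRange`, or any real LLVM API. A braced `{}` works when the parameter type is known, but a forwarding template cannot deduce a type from `{}`, so an explicitly typed empty object is needed there.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, minimal stand-in for an ArrayRef-like non-owning view.
template <typename T> class ViewLike {
public:
  ViewLike() = default; // empty view; this is what `{}` selects
  ViewLike(const std::vector<T> &v) : data_(v.data()), size_(v.size()) {}
  const T *data() const { return data_; }
  std::size_t size() const { return size_; }

private:
  const T *data_ = nullptr;
  std::size_t size_ = 0;
};

// A callee with a known parameter type: `countOperands({})` compiles.
inline std::size_t countOperands(ViewLike<int> operands) {
  return operands.size();
}

// A perfect-forwarding wrapper: `forwardToCount({})` would NOT compile,
// because `{}` has no type for Args to deduce. Callers must spell out an
// explicitly typed empty view instead, e.g. `ViewLike<int>()`.
template <typename... Args> std::size_t forwardToCount(Args &&...args) {
  return countOperands(std::forward<Args>(args)...);
}
```

Under these assumptions, `countOperands({})` and `forwardToCount(ViewLike<int>())` both pass an empty view, which mirrors why some call sites in the patch use `{}` and others use `TypeRange()`.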
2025-06-26 12:41:49 -07:00
jeanPerier
13daf65656
[flang] handle common block used as BIND(C) module variables (#145669)
Support the odd case where a static object is declared both as a
common block and as a BIND(C) module variable name in different modules,
and both modules are used in the same compilation unit.

This is not standard, but happens when using MPI and MPI_F08 in the same
compilation unit, and at least both gfortran and ifx support this.

See added test case for an illustration.
2025-06-26 12:00:23 +02:00
Valentin Clement (バレンタイン クレメン)
d6d52a4abc
[flang][cuda] Do not generate cuf.alloc/cuf.free in device context (#141117)
`cuf.alloc` and `cuf.free` are converted to `fir.alloca` or deleted when
in device context during the CUFOpConversion pass. Do not generate them
in lowering to avoid confusion.
2025-05-22 13:30:34 -07:00
Tom Eccles
75e5643abf
[flang][OpenMP] share global variable initialization code (#138672)
Fixes #108136

In #108136 (the new testcase), flang was missing the length parameter
required for the variable length string when boxing the global variable.
The code that is initializing global variables for OpenMP did not
support types with length parameters.

Instead of duplicating this initialization logic in OpenMP, I decided to
use the exact same initialization as is used in the base language
because this will already be well tested and will be updated for any new
types. The difference for OpenMP is that the global variables will be
zero initialized instead of left undefined.

Previously `Fortran::lower::createGlobalInitialization` was used to
share a smaller amount of the logic with the base language lowering. I
think this bug has demonstrated that helper was too low level to be
helpful, and it was only used in OpenMP so I have made it static inside
of ConvertVariable.cpp.
2025-05-07 10:18:13 +01:00
Slava Zakharin
9aff19e7a3
[flang] Defined SafeTempArrayCopyAttrInterface for array repacking. (#134346)
This patch defines `fir::SafeTempArrayCopyAttrInterface` and the
corresponding OpenACC/OpenMP related attributes in FIR dialect. The
actual implementations are just placeholders right now, and array
repacking becomes a no-op if `-fopenacc/-fopenmp` is used for the
compilation.
2025-04-10 18:41:54 -07:00
Slava Zakharin
3f6ae3f0a8
[flang] Added driver options for arrays repacking. (#134002)
Added options:
  * -f[no-]repack-arrays
  * -f[no-]stack-repack-arrays
  * -frepack-arrays-contiguity=whole/innermost
2025-04-03 10:43:28 -07:00
Slava Zakharin
2c91f10362
[flang] Fixed repacking for TARGET and INTENT(OUT) (#131972)
TARGET dummy arrays can be accessed indirectly, so it is unsafe
to repack them.
INTENT(OUT) dummy arrays that require finalization on entry
to their subroutine must be copied-in by `fir.pack_arrays`.

In addition, based on my testing results, I think it will be useful
to document that `LOC` and `IS_CONTIGUOUS` will have different values
for the repacked arrays. I still need to decide where to document
this, so just added a note in the design doc for the time being.
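
The two rules stated in this message can be written as a small decision function. The sketch below is standalone C++ and purely illustrative: `DummyArrayInfo`, `RepackPlan`, and `planRepacking` are hypothetical names, and the function models only the two conditions named above, not the full set of checks that real lowering performs.

```cpp
#include <cassert>

// Hypothetical summary of the dummy-array properties named in the message.
struct DummyArrayInfo {
  bool isTarget = false;
  bool isIntentOut = false;
  bool needsFinalizationOnEntry = false;
};

// Hypothetical result: whether to repack, and whether this rule forces the
// original contents to be copied into the packed temporary.
struct RepackPlan {
  bool repack;
  bool forceCopyIn;
};

RepackPlan planRepacking(const DummyArrayInfo &info) {
  // TARGET dummies can be reached through pointers, so a packed temporary
  // could go out of sync with the original: do not repack.
  if (info.isTarget)
    return {false, false};
  // INTENT(OUT) dummies finalized on entry need their current contents in
  // the temporary so that finalization sees the real data.
  bool forceCopyIn = info.isIntentOut && info.needsFinalizationOnEntry;
  return {true, forceCopyIn};
}
```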
2025-03-19 17:12:32 -07:00
Slava Zakharin
fd0e20a64b
[flang] Generate fir.pack/unpack_array in Lowering. (#131704)
Basic generation of array repacking operations in Lowering.
2025-03-18 21:26:33 -07:00
Valentin Clement (バレンタイン クレメン)
4fde8c341f
[flang][cuda] Lower CUDA shared variable with cuf.shared_memory op (#131399)
Use `cuf.shared_memory` operation instead of `cuf.alloc` for CUDA shared
variable. These variables do not need free operations.
2025-03-16 17:44:56 -07:00
jeanPerier
3ff3b29dd6
[flang] lower remaining cases of pointer assignments inside forall (#130772)
Implement handling of `NULL()` RHS, polymorphic pointers, as well as
lower bounds or bounds remapping in pointer assignment inside FORALL.

In the end, these cases do not require updating hlfir.region_assign;
lowering can simply prepare the new descriptor for the LHS inside the
RHS region.

Looking more closely at the polymorphic cases, there is no need to call
the runtime: fir.rebox and fir.embox handle the dynamic type setting
correctly.

After this patch, the last remaining TODO is the allocatable assignment
inside FORALL, which, like some cases here, is more likely an accidental
feature given that FORALL was deprecated in F2003 at the same time that
allocatable components were added.
2025-03-14 10:51:46 +01:00
jeanPerier
356bf3fa2d
Reland " [flang] Rely on global initialization for simpler derived types" (#130290)
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct but has poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.

Note: this relands #114002 with the fix for the LLVM timeout regressions that have been seen. The fix is to use the added fir.copy to avoid aggregate load/store.

Co-authored-by: NimishMishra <42909663+NimishMishra@users.noreply.github.com>
2025-03-11 15:19:43 +01:00
Leandro Lupori
29f5d5bea9
[flang][OpenMP] Fix privatization of procedure pointers (#130336)
Fixes #121720
2025-03-11 09:38:40 -03:00
Tom Eccles
d31a7dde48
Revert " [flang] Rely on global initialization for simpler derived types" (#130278)
Reverts llvm/llvm-project#114002

This causes a regression building cam4_r from spec2017
2025-03-07 13:59:29 +00:00
Valentin Clement (バレンタイン クレメン)
2130285564
[flang][cuda] Make sure allocator id is set for pointer allocate (#129950)
2025-03-05 17:29:09 -08:00
NimishMishra
0ae1f0a310
[flang] Rely on global initialization for simpler derived types (#114002)
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct but has poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.
2025-03-05 05:44:51 -08:00
Valentin Clement (バレンタイン クレメン)
f3000d7d27
[flang][cuda] Do not trigger automatic deallocation in main (#128789)
Similar to the host flow, do not trigger automatic deallocation at the end
of the main program, since anything could happen, like a
cudaDeviceReset().
2025-02-25 17:25:04 -08:00
jeanPerier
22d9726593
[flang] do not finalize or initialize unused entry dummy (#125482)
Dummy arguments from other entry statements that are not live in the current entry have no backing storage; user code referring to them must not be reached. The compiler was generating initialization/destruction code for them when they were INTENT(OUT), causing undefined behavior.
2025-02-03 18:09:01 +01:00
Kiran Chandramohan
ce32625966
Reland "[Flang][Driver] Add a flag to control zero initialization" (#123606)
Reverts llvm/llvm-project#123330
2025-01-21 07:57:44 +00:00
Kiran Chandramohan
8a229f595a
Revert "Revert "Revert "[Flang][Driver] Add a flag to control zero initializa…" (#123330)
Reverts llvm/llvm-project#123097

Reverting due to buildbot failure
https://lab.llvm.org/buildbot/#/builders/89/builds/14577.
2025-01-17 12:27:58 +00:00
Kiran Chandramohan
8c63648117
Revert "Revert "[Flang][Driver] Add a flag to control zero initializa… (#123097)
…tion of global v…" (#123067)"

This reverts commit 44ba43aa2b740878d83a9d6f1d52a333c0d48c22.

Adds the flag to bbc as well.
2025-01-17 12:14:20 +00:00