142 Commits

Author SHA1 Message Date
parabola94
cc58ca5370
[flang/flang-rt] Add -isysroot flag only to tests really requiring (#152914)
The -isysroot flag was added to all tests, which made
Driver/darwin-version.f90 fail.

In fact, only a few tests of interoperability with C need the
-isysroot flag to search for headers and libraries. So the -isysroot flag
is now removed from the substitution `%flang`, and a new substitution
`%isysroot` has been introduced.

Moreover, Integration/iso-fortran-binding.cpp invokes clang++ via a
shell script, which makes it hard to add the -isysroot flag, so it has
been refactored.

Fixes #150765
2025-08-13 21:43:53 +00:00
Peter Klausler
022bd53b88
[flang][runtime][NFC] Add a comment to intrinsic assignment (#153260)
Add a comment explaining why intrinsic derived type assignment
unconditionally deallocates all allocated allocatable subobject
components of the left-hand side variable, so that I won't forget the
reasoning here the next time this comes into question.
2025-08-13 14:38:24 -07:00
Peter Klausler
925db844cb
[flang][runtime] Handle NAN(...) in namelist input (#153101)
The various per-type functions for list-directed (including namelist)
input editing all call a common function to detect whether the next
token of input is the name of a namelist item. This check simply
determines whether this next token looks like an identifier followed by
'=', '(', or '%', and this fails when the next item of input is a NAN
with parenthesized stuff afterwards. Make the check smarter so that it
ensures that any upcoming possible identifier is actually the name of an
item in the namelist group. (And that's tricky too when the group has an
array item named "nan" and the upcoming input is "nan("; see the
newly-added unit test case.)

Fixes https://github.com/llvm/llvm-project/issues/152538.
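
The smarter check described above can be sketched roughly as follows. This is only an illustration, not the runtime's actual code: the function name `NextTokenIsNamelistItem` and the use of a `std::set` of lower-case item names are assumptions made for the example, and the sketch does not attempt to resolve the tricky case of a group that really has an array item named "nan".

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <string_view>

// Returns true when `input` begins with an identifier that (1) is followed,
// after optional blanks, by '=', '(' or '%', and (2) names an item of the
// namelist group. `groupItems` is assumed to hold lower-case item names.
bool NextTokenIsNamelistItem(
    std::string_view input, const std::set<std::string> &groupItems) {
  std::size_t pos{0};
  std::string name;
  // Collect a Fortran-style identifier: a letter, then letters/digits/'_'.
  while (pos < input.size()) {
    unsigned char ch{static_cast<unsigned char>(input[pos])};
    bool ok{pos == 0 ? std::isalpha(ch) != 0
                     : (std::isalnum(ch) != 0 || ch == '_')};
    if (!ok) {
      break;
    }
    name += static_cast<char>(std::tolower(ch));
    ++pos;
  }
  if (name.empty()) {
    return false;
  }
  // Skip blanks between the identifier and the following character.
  while (pos < input.size() && input[pos] == ' ') {
    ++pos;
  }
  if (pos == input.size()) {
    return false;
  }
  const char next{input[pos]};
  bool looksLikeName{next == '=' || next == '(' || next == '%'};
  // The syntactic test alone would misclassify "nan(...)"; the group lookup
  // is what rejects it when no item of that name exists.
  return looksLikeName && groupItems.count(name) > 0;
}
```

With a group containing only `x` and `y`, the input `nan(123) 2.0` is no longer mistaken for the start of a new item, while `X = 1.0` still is.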

2025-08-13 14:37:41 -07:00
David Truby
f73a3028c2
[flang-rt] Use correct flang-rt build for flang-rt unit tests on Windows (#152318)
Currently flang-rt assumes that LLVM was always built with the dynamic
MSVC runtime. This may not be the case if the user has specified a
different runtime with -DCMAKE_MSVC_RUNTIME_LIBRARY. Since this flag is
implied by -DLLVM_ENABLE_RPMALLOC=On, which is used by the Windows
release script, this is causing that script to fail.

Fixes #151920
2025-08-07 13:09:35 +01:00
Eugene Epshteyn
cae7bebcaa
[flang-rt] Runtime implementation of extended intrinsic function SECNDS() (#152021)
Until the compiler part is fully hooked up via
https://github.com/llvm/llvm-project/pull/151878, tested this using
`external`:
```
external secnds
real s1, s2
s1 = secnds(0.0)
print *, "Seconds from midnight:", s1
call sleep(2)
s2 = secnds(s1)
print *, "Seconds from s1", s2
print *, "Seconds from midnight:", secnds(0.0)
end
```
2025-08-06 16:02:27 -04:00
Peter Klausler
fc9a080780
[flang][runtime] Handle empty NAMELIST value list (#151770)
InputNamelist() returns early if any value list read in by
InputDerivedType() or DescriptorIo<Input>() is empty, since they return
false. But an empty value list is okay, and the early return should
occur only on error.

Fixes https://github.com/llvm/llvm-project/issues/151756.
2025-08-05 13:40:11 -07:00
Peter Klausler
56051daaf0
[flang][runtime] Optimize Descriptor::FixedStride() (#151755)
Put the common cases on fast paths, and don't depend on IsContiguous()
in the general case path. Add a unit test, too.
2025-08-05 13:39:54 -07:00
Peter Klausler
effa35d240
[flang][runtime] Don't always accept a bare exponent letter (#151597)
For more accurate compatibility with other compilers' extensions, accept
a bare exponent letter as valid real input to a formatted READ statement
only in a fixed-width input field. So it works with (G1.0) editing, but
not (G)/(D)/(E)/(F) or list-directed input.

Fixes https://github.com/llvm/llvm-project/issues/151465.
2025-08-05 13:39:33 -07:00
Peter Klausler
aec90f2f27
[flang][runtime] Fix child input bugs under NAMELIST (#151571)
When NAMELIST input takes place on a derived type, we need to preserve
the type in the descriptor that is created for storage sequence
association. Further, the fact that any child list input is within the
context of a NAMELIST must be inherited so that input fields don't try
to consume later "variable=" strings.

Fixes https://github.com/llvm/llvm-project/issues/151222.
2025-08-05 13:39:08 -07:00
Connector Switch
8b7f81f2de
[NFC] Fix assignment typo. (#151864) 2025-08-03 22:32:00 +08:00
Peter Klausler
35cabd69e6
[flang] Support fixed-width input field truncation for LOGICAL (#151203)
As a common extension, we support the truncation of fixed-width fields
for non-list-directed input editing when a separator character (',' or
';' depending on DECIMAL='POINT' or 'COMMA' resp.) appears in the field.
This isn't working for L input editing; fix.

(The bug reports a failure with DC mode, but it doesn't work with a
comma either.)

Fixes https://github.com/llvm/llvm-project/issues/151178.
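
As a rough illustration of the extension described above (not the runtime's implementation), the effective field can be thought of as the fixed-width field cut short at the first separator. The helper name `EffectiveField` and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Returns the part of a fixed-width input field that should be edited.
// `record` is the remaining input, `width` is the field width from the edit
// descriptor, and `decimalComma` is true under DECIMAL='COMMA'.
std::string_view EffectiveField(
    std::string_view record, std::size_t width, bool decimalComma) {
  // substr() clamps the count to the end of the record.
  std::string_view field{record.substr(0, width)};
  // The separator is ',' under DECIMAL='POINT' and ';' under DECIMAL='COMMA'.
  const char separator{decimalComma ? ';' : ','};
  std::size_t at{field.find(separator)};
  return at == std::string_view::npos ? field : field.substr(0, at);
}
```

For example, reading `T,F` with a width of 5 yields the field `T` in the default mode, and `T;F` yields `T` under DECIMAL='COMMA'.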
2025-07-30 11:42:14 -07:00
Peter Klausler
0d6a67c1ad
[flang][runtime] Remove redundant initialization (#150984)
The assignment to mutableModes() in BeginIoStatement() is redundant,
since the mutableModes_ data member is initialized by the constructors
of the two classes that now have one. Remove the assignment to avoid
confusion.

Also restores the original OutputStatementState base class name after a
recent patch that needlessly changed it to something equivalent but less
readable.
2025-07-30 11:41:19 -07:00
Michael Kruse
34ca553d30
[Flang/Flang-RT] Fix OldUnit tests on Windows (#150734)
Flang and Flang-RT have two flavours of unittests: 
1. GTest unittests, using lit's `lit.formats.GoogleTest` format ending
with `Tests${CMAKE_EXECUTABLE_SUFFIX}`
2. "non-GTest" or "OldUnit" unittests, a plain executable ending with
`.test${CMAKE_EXECUTABLE_SUFFIX}`

Both executables are emitted into the same unittests/ subdirectory. When
running ...
1. `tests/Unit/lit.cfg.py` only considers executables ending with
`Tests` (or `Tests.exe` on Windows), hence skips the non-GTest tests.
2. `tests/NonGtestUnit/lit.cfg.py` considers all tests ending with
`.test` or `.exe`. On Windows, the GTest unittests also end with `.exe`.

In Flang-RT, `.exe` is considered an extension for non-GTest unittests,
which causes tests such as Flang's `RuntimeTests.exe` to be executed by
both test suites on Windows. This particular test includes a file write test, using
a hard-coded filename `ucsfile`. If the two instances are executed
concurrently, they might interfere with each other reading/writing
`ucsfile` which results in a flaky test.

This patch avoids the redundant execution by requiring the suffix
`.test.exe` on Windows. lit has to be modified because it uses
`os.path.splitext` to extract the extension, which would only recognize
the last component. It was changed from the original `endswith` in
c865abe747aa72192f02ebfdcabe730f2553e42f
for unknown reasons.

In Flang, `.exe` is not considered a suffix for non-GTest unittests and
hence they are not run at all. This is fixed by also adding `.test.exe`
as a valid suffix, as in Flang-RT.
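
The difference between the two matching strategies can be shown with a small sketch. lit itself is written in Python; the C++ below only mimics the behaviour described above, and `LastExtension` and `EndsWith` are hypothetical helpers, with `LastExtension` being a simplified stand-in for `os.path.splitext`.

```cpp
#include <cstddef>
#include <string_view>

// Simplified splitext-style lookup: only the text from the last '.' onwards
// counts as the extension.
std::string_view LastExtension(std::string_view name) {
  std::size_t dot{name.rfind('.')};
  return dot == std::string_view::npos ? std::string_view{} : name.substr(dot);
}

// endswith-style lookup: the whole multi-component suffix must match.
bool EndsWith(std::string_view name, std::string_view suffix) {
  return name.size() >= suffix.size() &&
      name.substr(name.size() - suffix.size()) == suffix;
}
```

`LastExtension` reports `.exe` for both `real.test.exe` and `RuntimeTests.exe`, so it cannot tell the two kinds of test apart, whereas `EndsWith(name, ".test.exe")` accepts only the former.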

Unfortunately, the `Evaluate/real.test.exe` test was failing on
Windows:
```
FAIL: flang-OldUnit :: Evaluate/real.test.exe (3592 of 3592)
******************** TEST 'flang-OldUnit :: Evaluate/real.test.exe' FAILED ********************
..\_src\flang\unittests\Evaluate\real.cpp:511: FAIL: FlagsToBits(prod.flags) == 0x18, not 0x10
        0 0x800001 * 0xbf7ffffe
..\_src\flang\unittests\Evaluate\real.cpp:511: FAIL: FlagsToBits(prod.flags) == 0x18, not 0x10
        0 0x800001 * 0x3f7ffffe
..\_src\flang\unittests\Evaluate\real.cpp:511: FAIL: FlagsToBits(prod.flags) == 0x18, not 0x10
        0 0x80800001 * 0xbf7ffffe
..\_src\flang\unittests\Evaluate\real.cpp:511: FAIL: FlagsToBits(prod.flags) == 0x18, not 0x10
        0 0x80800001 * 0x3f7ffffe
...
```
This is due to the `__x86_64__` macro not being set by Microsoft's
cl.exe and hence floating point status flags not being read out. The
equivalent macro for Microsoft's compiler is `_M_X64` (or `_M_AMD64`).
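
A portable target check along these lines might look like the sketch below. The names `kTargetIsX86_64` and `TargetIsX86_64` are made up for the example, and the actual guard used in the unit test may be written differently.

```cpp
// __x86_64__ is predefined by GCC and Clang when targeting x86-64;
// _M_X64 is the corresponding macro predefined by Microsoft's cl.exe.
#if defined(__x86_64__) || defined(_M_X64)
constexpr bool kTargetIsX86_64{true};
#else
constexpr bool kTargetIsX86_64{false};
#endif

// Runtime-visible wrapper, e.g. to decide whether hardware floating-point
// status flags should be read and compared in a test.
bool TargetIsX86_64() { return kTargetIsX86_64; }
```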
2025-07-26 23:47:36 +02:00
David Truby
c20a95a7dd
[flang-rt] Remove hard-coded dependency on compiler-rt path on Windows (#150244)
This fixes an issue where if the build folder is no longer present flang
cannot link anything on Windows because the path to compiler-rt in the
binary is hard-coded. Flang already links compiler-rt on Windows so it
isn't necessary for flang-rt to specify that it depends on compiler-rt
at all, other than for the unit tests, so instead we can move that logic
into the unit test compile lines.
2025-07-26 14:42:42 +01:00
Peter Klausler
f6e70c7d47
[flang][runtime] Handle ';' in fixed-width input field (#150512)
Formatted input of real values can handle a ',' field separator when one
appears in a fixed-width input field, but can't cope with a semicolon
under DECIMAL='COMMA'. Fix.

Fixes https://github.com/llvm/llvm-project/issues/150047.
2025-07-25 14:48:53 -07:00
Peter Klausler
918d6db329
[flang][runtime] Refine state associated with child I/O (#150461)
Child I/O state needs to carry a pointer to the original non-type-bound
defined I/O subroutine table, so that nested defined I/O can call those
defined I/O subroutines. It also needs to maintain a mutableModes
instance for the whole invocation of defined I/O, instead of having a
mutableModes local to list-directed child I/O, so that a top-level data
transfer statement with (say) DECIMAL='COMMA' propagates that setting
down to nested child I/O data transfers.

Fixes https://github.com/llvm/llvm-project/issues/149885.
2025-07-25 14:48:31 -07:00
Peter Klausler
f6a6cdd15c
[flang][runtime] Fix formatted input of NAN(...) (#149606)
Formatted real input is allowed to have parenthesized information after
"NAN". We don't interpret the contents, but we should at least scan the
information correctly.

Fixes https://github.com/llvm/llvm-project/issues/149533 and
https://github.com/llvm/llvm-project/issues/150035.
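
Scanning without interpreting could look roughly like this. The function name `ScanNaN` is hypothetical, and the handling of nested parentheses is an assumption of the sketch rather than something stated in the commit.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Returns the number of characters consumed by a leading "NAN" and an
// optional parenthesized suffix, or 0 if `input` does not start with a NaN
// or the parentheses are unbalanced. The parenthesized text is skipped, not
// interpreted.
std::size_t ScanNaN(std::string_view input) {
  constexpr std::string_view kNaN{"NAN"};
  if (input.size() < kNaN.size()) {
    return 0;
  }
  for (std::size_t j{0}; j < kNaN.size(); ++j) {
    if (std::toupper(static_cast<unsigned char>(input[j])) != kNaN[j]) {
      return 0;
    }
  }
  std::size_t pos{kNaN.size()};
  if (pos < input.size() && input[pos] == '(') {
    std::size_t depth{1};
    std::size_t p{pos + 1};
    // Advance to the matching ')' while tracking nesting depth.
    while (p < input.size() && depth > 0) {
      if (input[p] == '(') {
        ++depth;
      } else if (input[p] == ')') {
        --depth;
      }
      ++p;
    }
    if (depth != 0) {
      return 0;
    }
    pos = p;
  }
  return pos;
}
```

Given `NaN(0x7ff) 1.0`, the sketch consumes the ten characters of `NaN(0x7ff)` and leaves the rest of the record for the next item.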
2025-07-25 14:47:26 -07:00
Valentin Clement (バレンタイン クレメン)
283fd3f09a
[flang][cuda] Use get() to get raw pointer (#150205)
Fix issue reported in #150136. `createAllocatable` returns an OwningPtr.
Use `get()` to get the raw pointer as it is done in the
`flang-rt/unittests/Runtime/CUDA/Memory.cpp` tests.
2025-07-23 20:01:15 +09:00
Valentin Clement (バレンタイン クレメン)
d1ca984757
[flang][cuda] Fix unittest (#150136) 2025-07-23 09:08:38 +09:00
Peter Klausler
9e5b2fbe86
[flang][runtime] Preserve type when remapping monomorphic pointers (#149427)
Pointer remappings unconditionally update the element byte size and
derived type of the pointer's descriptor. This is okay when the pointer
is polymorphic, but not when a pointer is associated with an extended
type.

To communicate this monomorphic case to the runtime, add a new entry
point so as to not break forward binary compatibility.
2025-07-18 13:45:05 -07:00
Peter Klausler
680b8dd707
[flang][runtime] Handle spaces before ')' in alternative list-directe… (#149384)
…d complex input

List-directed reads of complex values that can't go through the usual
fast path (as in this bug's test case, which uses DECIMAL='COMMA')
didn't skip spaces before the closing right parenthesis correctly.

Fixes https://github.com/llvm/llvm-project/issues/149164.
2025-07-18 13:44:44 -07:00
Peter Klausler
97a8476068
[flang][runtime] Further work on speeding up work queue operations (#149189)
This patch avoids a trip through the work queue engine for cases on a
CPU where finalization and destruction actions during assignment were
handled without enqueueing another task.
2025-07-18 13:44:25 -07:00
Daniel Chen
4bf4e87576
Static_cast std::size_t to build flang_rt in 32-bit. (#149529) 2025-07-18 14:14:27 -04:00
Peter Klausler
bbcdad1f8e
[flang][runtime] MCLOCK library routine (#148960)
Add MCLOCK as an interface to std::clock().
2025-07-16 09:10:07 -07:00
Peter Klausler
52a46dc57f
[flang] Allow -fdefault-integer-8 with defined I/O (#148927)
Defined I/O subroutines have UNIT= and IOSTAT= dummy arguments that are
required to have type INTEGER with its default kind. When that default
kind is modified via -fdefault-integer-8, calls to defined I/O
subroutines from the runtime don't work.

Add a flag to the two data structures shared between the compiler and
the runtime support library to indicate that a defined I/O subroutine
was compiled under -fdefault-integer-8. This has been done in a
compatible manner, so that existing binaries are compatible with the new
library and new binaries are compatible with the old library, unless of
course -fdefault-integer-8 is used.

Fixes https://github.com/llvm/llvm-project/issues/148638.
2025-07-16 09:09:49 -07:00
Valentin Clement (バレンタイン クレメン)
9e9fdd433a
[flang][cuda] Fix definition of CUFSetAllocatorIndex (#148778) 2025-07-14 21:26:43 -07:00
Valentin Clement (バレンタイン クレメン)
2c6771889a
[flang][cuda] Introduce cuf.set_allocator_idx operation (#148717) 2025-07-14 17:23:18 -07:00
Peter Klausler
40ceaf1d99
[flang][runtime] Fix bad instance of std::optional in runtime (#148724)
The runtime needs to use common::optional, not std::optional.
2025-07-14 14:12:49 -07:00
Peter Klausler
2e53a68c09
[flang][runtime] Speed up initialization & destruction (#148087)
Rework derived type initialization in the runtime to just initialize the
first element of any array, and then memcpy it to the others, rather
than exercising the per-component paths for each element.

Rework derived type destruction in the runtime to detect and exploit a
fast path for allocatable components whose types themselves don't need
nested destruction.

Small tweaks were made in hot paths exposed by profiling in descriptor
operations and derived type assignment.
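
The initialization strategy can be illustrated with a minimal sketch. `Particle` is a made-up element type standing in for a derived type with default component initialization, and `InitializeArray` is a hypothetical helper, not a runtime entry point.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical element type with default component initialization.
struct Particle {
  double mass{1.0};
  int charge{-1};
};

// Initializes element 0 through the ordinary per-component path, then copies
// its bytes to elements 1..n-1 instead of repeating that path n times.
void InitializeArray(Particle *elements, std::size_t n) {
  static_assert(std::is_trivially_copyable<Particle>::value,
      "memcpy replication requires a trivially copyable element type");
  if (n == 0) {
    return;
  }
  elements[0] = Particle{};
  for (std::size_t j{1}; j < n; ++j) {
    std::memcpy(&elements[j], &elements[0], sizeof(Particle));
  }
}
```

The copy is only valid because every element starts out with the same bit pattern; components that need distinct per-element state would still require the slower path.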
2025-07-14 11:14:02 -07:00
Valentin Clement (バレンタイン クレメン)
aec3016b64
[flang][cuda] Use minor version in flang_rt.cuda lib name (#148085)
Add minor version in the lib name to be able to distinguish between
specific versions.
2025-07-11 15:49:34 -07:00
Valentin Clement (バレンタイン クレメン)
f642b63412
[flang][cuda] Update condition in descriptor data transfer (#148306)
When the two descriptors have the same number of elements and are
contiguous, the transfer can be done via pointers.
2025-07-11 15:32:04 -07:00
agozillon
75f81ded8f
[Flang][FlangRT][Runtime] Add RT_OFFLOAD_API_GROUP_BEGIN to missing symbols on AMDGPU (#147612)
After the recent move to work queues, linking in the Fortran runtime
built for offload on AMDGPU can, in certain cases, fail with missing
symbols. This PR tries to address this issue by encompassing more of
the library in RT_OFFLOAD_API_GROUP_BEGIN, which has the effect of
compiling these functions for AMDGPU, resolving the missing symbols.

This PR should address the following issue:
https://github.com/llvm/llvm-project/issues/145888
2025-07-10 13:19:58 +02:00
Tom Eccles
fe5d94d85d
[flang-rt] Match compiler-rt's default macos version (#147273)
Followup to https://github.com/llvm/llvm-project/pull/143508

This required adding another alternative implementation of time
intrinsics to match what is available in older MacOS.

With this change, flang can be used to build programs for older versions
of MacOS.

Co-authored-by: David Truby <david.truby@arm.com>
2025-07-09 13:26:44 +01:00
Michael Kruse
4be3e95284
[Flang-RT][Offload] Always use LLVM-built GTest (#143682)
The Offload and Flang-RT had the ability to compile GTest themselves.
But in bootstrapping builds, LLVM_LIBRARY_OUTPUT_INTDIR points to the
same location as the stage1 build. If both are building GTest, they
overwrite each other's `libllvm_gtest.a` and `libllvm_test_main.a` which
causes #143134.

This PR removes the ability for the Offload/Flang-RT runtimes to build
their own GTest and instead relies on the stage1 build of GTest. This
was already the case with LLVM_INSTALL_GTEST=ON configurations. For
LLVM_INSTALL_GTEST=OFF configurations, we now also export gtest into the
buildtree configuration. Ultimately, this reduces combinatorial
explosion of configurations in which unittests could be built
(LLVM_INSTALL_GTEST=ON, GTest built by Offload, GTest built by Flang-RT,
GTest built by Offload and also used by Flang-RT).

GTest and therefore Offload/Runtime unittests will not be available if
the runtimes are configured against an LLVM install tree. Since llvm-lit
isn't available in the install tree either, it doesn't matter.

Note that compiler-rt and libc also use GTest in non-default
configurations. libc also depends on LLVM's GTest build (and would
error out if unavailable), but compiler-rt builds it completely
differently.

Fixes #143134
2025-07-09 12:53:33 +02:00
Daniel Chen
b84696db74
Fix the type of offset that broke 32-bit flang-rt build to use uint64_t consistently (#147359)
A recent change to `flang-rt` introduced code like `std::size_t
offset{offset_};`.
It broke the 32-bit `flang-rt` build because `Component::offset_` is of
type `uint64_t` but `size_t` varies.
Clang complains:
```
error: non-constant-expression cannot be narrowed from type 'std::uint64_t' (aka 'unsigned long long') to 'std::size_t' (aka 'unsigned long') in initializer list [-Wc++11-narrowing]
  143 |   std::size_t offset{offset_};
      |                      ^~~~~~~

```

This patch uses `uint64_t` consistently for the offset.
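
The narrowing problem and two ways around it can be sketched as follows. The `Component` struct here is a simplified stand-in for the runtime class, and both helper functions are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the runtime's Component class.
struct Component {
  std::uint64_t offset_{0};
};

// Brace initialization rejects narrowing conversions, so
//   std::size_t offset{c.offset_};
// is ill-formed wherever std::size_t is narrower than 64 bits.

// Option 1: keep the same 64-bit type throughout.
std::uint64_t OffsetOf(const Component &c) {
  std::uint64_t offset{c.offset_};
  return offset;
}

// Option 2: convert explicitly where a std::size_t is really needed; the
// cast documents that truncation on 32-bit targets is intentional.
std::size_t OffsetAsSize(const Component &c) {
  return static_cast<std::size_t>(c.offset_);
}
```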
2025-07-08 10:01:43 -04:00
Peter Klausler
dccc0266f4
[flang][runtime] Allow INQUIRE(IOLENGTH=) in the presence of defined I/O (#144541)
When I/O list items include instances of derived types for which defined
I/O procedures exist, ignore them.

Fixes https://github.com/llvm/llvm-project/issues/144363.
2025-06-30 10:20:39 -07:00
Peter Klausler
2bf3ccabfa
[flang] Restructure runtime to avoid recursion (relanding) (#143993)
Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure these
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, initializeClone,
Finalize, and Destroy.

Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.

Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.

(Relanding this patch after reverting initial attempt due to some test
failures that needed some time to analyze and fix.)

Fixes https://github.com/llvm/llvm-project/issues/142481.
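
The general idea of replacing recursion with resumable tickets can be illustrated on a toy problem. Everything in this sketch (`Node`, `Ticket`, `SumValues`) is invented for the example and is far simpler than the runtime's work queue; it only shows how a ticket records its progress so that it can be suspended while nested work runs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical nested object: a value plus nested components.
struct Node {
  int value{0};
  std::vector<Node> components;
};

// A work ticket remembers which node it is processing and how far it got, so
// it can be suspended while a nested ticket runs and resumed afterwards.
struct Ticket {
  const Node *node{nullptr};
  std::size_t nextComponent{0};
};

// Visits every node without recursion and returns the sum of all values.
int SumValues(const Node &root) {
  int total{0};
  std::vector<Ticket> pending;
  Ticket first;
  first.node = &root;
  pending.push_back(first);
  while (!pending.empty()) {
    Ticket &ticket{pending.back()};
    if (ticket.nextComponent == 0) {
      total += ticket.node->value; // first time this ticket runs
    }
    if (ticket.nextComponent < ticket.node->components.size()) {
      // Suspend this ticket and enqueue work for the next component.
      Ticket nested;
      nested.node = &ticket.node->components[ticket.nextComponent++];
      pending.push_back(nested); // `ticket` is not used after this point
    } else {
      pending.pop_back(); // ticket complete
    }
  }
  return total;
}
```

Because the pending work lives in an explicit container rather than on the call stack, the stack depth no longer depends on how deeply the data is nested.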
2025-06-16 14:37:01 -07:00
Peter Klausler
65b06cd983
[flang][runtime] Check SOURCE= conformability on ALLOCATE (#144113)
The SOURCE= expression of an ALLOCATE statement, when present and not
scalar, must conform to the shape of the allocated objects. Check this
at runtime, and return a recoverable error, or crash, when appropriate.

Fixes https://github.com/llvm/llvm-project/issues/143900.
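
The conformability rule itself is simple to state in code. In this sketch a shape is modelled as a vector of extents, with an empty vector standing for a scalar; `Shape` and `SourceConforms` are hypothetical names, and error reporting is left out.

```cpp
#include <cstddef>
#include <vector>

// Extents of an array, one entry per dimension; empty means scalar.
using Shape = std::vector<std::size_t>;

// A scalar SOURCE= conforms to any allocated object; otherwise the rank and
// every extent must match.
bool SourceConforms(const Shape &allocated, const Shape &source) {
  return source.empty() || allocated == source;
}
```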
2025-06-16 14:36:35 -07:00
Valentin Clement (バレンタイン クレメン)
9992668404
[flang][cuda] Add runtime check for passing device arrays (#144003) 2025-06-12 20:47:58 -07:00
Peter Klausler
10f512f7bb
Revert runtime work queue patch, it breaks some tests that need investigation (#143713)
Revert "[flang][runtime] Another try to fix build failure"

This reverts commit 13869cac2b5051e453aa96ad71220d9d33404620.

Revert "[flang][runtime] Fix build bot flang-runtime-cuda-gcc errors
(#143650)"

This reverts commit d75e28477af0baa063a4d4cc7b3cf657cfadd758.

Revert "[flang][runtime] Replace recursion with iterative work queue
(#137727)"

This reverts commit 163c67ad3d1bf7af6590930d8f18700d65ad4564.
2025-06-11 07:55:06 -07:00
Peter Klausler
b512077c37
[flang][runtime] Another try to fix build failure (#143702)
Tweak accessibility to try to get code past whatever gcc is being used
by the flang-runtime-cuda-gcc build bot.
2025-06-11 06:34:46 -07:00
Peter Klausler
d75e28477a
[flang][runtime] Fix build bot flang-runtime-cuda-gcc errors (#143650)
Adjust default parent class accessibility to attempt to work around what
appears to be old GCC's interpretation.
2025-06-10 20:36:52 -07:00
Peter Klausler
163c67ad3d
[flang][runtime] Replace recursion with iterative work queue (#137727)
Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure these
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, initializeClone,
Finalize, and Destroy.

Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.

Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.

The effects of this restructuring on CPU performance are yet to be
measured.
2025-06-10 14:44:19 -07:00
Valentin Clement (バレンタイン クレメン)
9c54512c3e
[flang][cuda] Allocate the dst descriptor in data transfer (#143437)
In a test like: 

```
integer, allocatable, device :: da(:)
allocate(a(200))
a = 2
da = a ! da is not allocated before data transfer is initiated. Allocate it with a
```

The reference compiler will allocate the data for the `da` descriptor so
the data transfer can be done properly.
2025-06-10 09:43:30 -07:00
Peter Klausler
7b9518ae27
[flang][runtime] Accommodate change of type in assignment to allocatable (#141988)
When an assignment to a derived type allocatable requires
(re)allocation, its type may change to that of the right-hand side. The
code didn't update its derived type pointer, leading to the wrong type
being put into the descriptors created for elemental defined assignment
subroutine calls.

Fixes https://github.com/llvm/llvm-project/issues/141835.
2025-06-04 09:22:01 -07:00
Peter Klausler
4c6b60a639
[flang] Extension: allow char string edit descriptors in input formats (#140624)
FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid
format for input as well as for output. The character string edit
descriptor "J=" is interpreted as if it had been 2X on input, causing
two characters to be skipped over. The skipped characters don't have to
match the characters in the literal string. An optional warning is
emitted under control of the -pedantic option.
2025-05-28 13:58:22 -07:00
Valentin Clement (バレンタイン クレメン)
fc9ce037ef
[flang][rt] Enable Count and CountDim for device build (#141684) 2025-05-28 09:55:49 -07:00
Kajetan Puchalski
09a70b1e10
[flang-rt] Explicitly define the default ShallowCopy* templates (#141619)
Not explicitly defining the default case for ShallowCopy* functions does
not meet the requirements for gcc to actually instantiate the templates,
leading to build errors that show up with gcc but not with clang.

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2025-05-27 16:38:48 +01:00
Kajetan Puchalski
0d464009fe
[flang-rt] Fix usage of kNoAsyncId in assign.cpp (#141077)
Fix a leftover old variable name causing build bot errors.

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2025-05-22 15:49:03 +01:00
Kajetan Puchalski
c2892b0bdf
[flang-rt] Optimise ShallowCopy and use it in CopyInAssign (#140569)
Using Descriptor.Element<>() when iterating through a rank-1 array is
currently inefficient, because the generic implementation suitable for
arrays of any rank makes the compiler unable to perform optimisations
that would make the rank-1 case considerably faster.

This is currently done inside ShallowCopy, as well as by CopyInAssign,
where the implementation of elemental copies (inside Assign) is
equivalent to ShallowCopyDiscontiguousToDiscontiguous.

To address that, add a DescriptorIterator abstraction specialised for
arrays of various ranks, and use that throughout ShallowCopy to iterate
over the arrays.

Furthermore, depending on the pointer type passed to memcpy, the
optimiser can remove the memcpy calls from ShallowCopy altogether which
can result in substantial performance improvements on its own.
Specialise ShallowCopy for various element pointer types to make these
optimisations possible.

Finally, replace the call to Assign inside CopyInAssign with a call to
newly optimised ShallowCopy.

For the thornado-mini application, this reduces the runtime by 27.7%.
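
The benefit of a rank-1, type-specialised copy can be illustrated with a simplified sketch. `Rank1View` and `ShallowCopyRank1` are invented for this example and do not reflect the actual `DescriptorIterator` interface; the point is only that a fixed element type turns each `memcpy` into a constant-size copy that the optimiser can lower to a plain load and store.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical, simplified rank-1 view: base address, element count and the
// byte stride between consecutive elements.
struct Rank1View {
  char *base;
  std::size_t extent;
  std::ptrdiff_t byteStride;
};

// Copies `to.extent` elements of type T between two possibly discontiguous
// rank-1 arrays by bumping pointers instead of recomputing subscripts.
template <typename T>
void ShallowCopyRank1(const Rank1View &to, const Rank1View &from) {
  char *dst{to.base};
  const char *src{from.base};
  for (std::size_t j{0}; j < to.extent; ++j) {
    std::memcpy(dst, src, sizeof(T)); // size is a compile-time constant
    dst += to.byteStride;
    src += from.byteStride;
  }
}
```

A caller would pick the instantiation from the element size at run time, falling back to a generic byte-count copy for types that have no specialisation.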

---------

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2025-05-22 15:11:46 +01:00