Detect cmd.exe special status code 9009 that indicates "command not
found" condition. Crash the process if "command not found" detected when
CMDSTAT was not specified.
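The rule described above can be sketched as follows. This is only an illustration: the names `CommandOutcome`, `ClassifyCmdExeStatus`, and `ShouldCrash` are hypothetical and are not the flang-rt entry points; only the 9009 status code comes from the description.

```cpp
// Hypothetical names for illustration; not the flang-rt API.
enum class CommandOutcome { Success, NonZeroExit, CommandNotFound };

// cmd.exe reports 9009 when the command cannot be found.
constexpr int kCmdExeCommandNotFound = 9009;

// Map a raw cmd.exe exit code to one of the three outcomes.
CommandOutcome ClassifyCmdExeStatus(int exitCode) {
  if (exitCode == 0) {
    return CommandOutcome::Success;
  }
  if (exitCode == kCmdExeCommandNotFound) {
    return CommandOutcome::CommandNotFound;
  }
  return CommandOutcome::NonZeroExit;
}

// Terminate only for "command not found" when the caller gave no CMDSTAT.
bool ShouldCrash(int exitCode, bool hasCmdstat) {
  return !hasCmdstat &&
      ClassifyCmdExeStatus(exitCode) == CommandOutcome::CommandNotFound;
}
```

With this shape, an ordinary non-zero exit code is reported to the caller rather than treated as fatal.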
If a comment appears immediately after a logical value in a NAMELIST
file, the flang runtime returns IostatGenericError. No error occurs when
a space precedes the exclamation point. Add code to handle a comment
while parsing logical values.
Co-authored-by: John Otken <john.otken@hpe.com>
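A simplified sketch of the NAMELIST idea, assuming a logical value ends at a blank, tab, comma, slash, or `!`. The helper names and the exact terminator set are assumptions made for illustration; the real runtime parser handles more cases.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical terminator set for this sketch: blank, tab, comma, slash,
// and '!' (the start of a NAMELIST comment).
static bool EndsValue(char c) {
  return c == ' ' || c == '\t' || c == ',' || c == '/' || c == '!';
}

// Parses a logical value at the start of `text`. On success, `consumed` is
// set to the number of characters that belong to the value itself.
std::optional<bool> ParseNamelistLogical(
    std::string_view text, std::size_t &consumed) {
  std::size_t i = 0;
  if (i < text.size() && text[i] == '.') {
    ++i; // optional leading period, as in ".TRUE."
  }
  if (i == text.size()) {
    return std::nullopt;
  }
  const int letter = std::toupper(static_cast<unsigned char>(text[i]));
  if (letter != 'T' && letter != 'F') {
    return std::nullopt;
  }
  // Skip the rest of the value (e.g. "RUE.") and stop at a terminator,
  // so that "T!comment" ends the value right before the '!'.
  while (i < text.size() && !EndsValue(text[i])) {
    ++i;
  }
  consumed = i;
  return letter == 'T';
}
```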
Without CMDSTAT, EXECUTE_COMMAND_LINE() terminated the program at
runtime whenever the command returned a non-zero status code. For
example, EXECUTE_COMMAND_LINE('false') on Linux would cause "fatal
Fortran runtime error... : Command line execution failed with exit
code: 1." This is too strict: EXECUTE_COMMAND_LINE() successfully called
'false'; it is just that 'false' happened to return a non-zero status
code. ifx and gfortran don't initiate termination in such a case.
Changed the EXECUTE_COMMAND_LINE() implementation to behave similarly.
Also discovered during testing that when the output of a program that
uses EXECUTE_COMMAND_LINE(... WAIT=.false.) is piped to a file, the
resulting file has duplicated output lines. This was because fork()
also duplicates the parent's buffered output in the child.
Added a flush of all units and C stdio before calling fork().
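A POSIX-only sketch of the flush-before-fork pattern. `RunInChild` is a hypothetical helper, not the runtime's code: it only flushes C stdio (the runtime also flushes Fortran units), and it waits for the child for simplicity even though the duplication was observed with WAIT=.false.

```cpp
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `command` through /bin/sh in a child process
// and returns its exit status, or -1 on failure.
int RunInChild(const char *command) {
  // Flush every open C stdio output stream first. Otherwise the child
  // inherits a copy of the parent's unwritten buffers and may write them
  // a second time.
  std::fflush(nullptr);
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    execl("/bin/sh", "sh", "-c", command, static_cast<char *>(nullptr));
    _exit(127); // only reached if execl fails
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) {
    return -1;
  }
  // A non-zero exit status is reported to the caller, not treated as fatal.
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```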
The ISO Fortran standard requires that numeric output editing produce
the full word "Infinity", rather than my current "Inf", when the output
field is wide enough to hold it. Comply.
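A minimal sketch of the width rule, assuming a positive field width and no optional plus sign. `FormatInfinity` is a hypothetical helper, not the runtime's formatting routine.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: right-justify "Infinity" when it fits, fall back to
// "Inf", and fill the field with asterisks when even "Inf" does not fit.
// Assumes width > 0 (the w == 0 case is not modeled here).
std::string FormatInfinity(bool negative, int width) {
  const std::string sign = negative ? "-" : "";
  std::string text = sign + "Infinity";
  if (static_cast<int>(text.size()) > width) {
    text = sign + "Inf";
  }
  if (static_cast<int>(text.size()) > width) {
    return std::string(static_cast<std::size_t>(width > 0 ? width : 0), '*');
  }
  return std::string(static_cast<std::size_t>(width) - text.size(), ' ') + text;
}
```

For example, a width of 8 yields `Infinity` for positive infinity, but only `-Inf` (padded) for negative infinity, which needs 9 characters for the full word.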
This implements the TOKENIZE intrinsic per the Fortran 2023 Standard.
TOKENIZE is a more complicated addition to the flang intrinsics, as it
is the first subroutine that has multiple unique footprints. Intrinsic
functions have already addressed this challenge; however, subroutines
and functions are processed slightly differently, and the function code
was not a good 1:1 solution for the subroutines. To solve this, the
function code was used as an example to create error buffering within
the intrinsics Process and select the most appropriate error message for
a given subroutine footprint.
A simple FIR compile test was added to show the proper compilation of
each case. A thorough negative path test has also been added, ensuring
that all possible errors are reported as expected.
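For readers unfamiliar with the intrinsic, the splitting behavior of the TOKENS form can be sketched in C++ as below. This is an illustration only, under the assumption that every character of SET is a separator and that adjacent separators produce zero-length tokens; unlike the Fortran result, the tokens here are not blank-padded to a common length, and `Tokenize` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split `str` at every character found in `set`.
// Two adjacent separators yield an empty token between them.
std::vector<std::string> Tokenize(std::string_view str, std::string_view set) {
  std::vector<std::string> tokens;
  std::size_t start = 0;
  while (true) {
    const std::size_t pos = str.find_first_of(set, start);
    if (pos == std::string_view::npos) {
      tokens.emplace_back(str.substr(start)); // last token, possibly empty
      break;
    }
    tokens.emplace_back(str.substr(start, pos - start));
    start = pos + 1;
  }
  return tokens;
}
```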
Testing prior to commit:
= check-flang ==========================================
```
Testing Time: 139.51s
Total Discovered Tests: 4153
Unsupported : 77 (1.85%)
Passed : 4065 (97.88%)
Expectedly Failed: 11 (0.26%)
FLANG Container Test completed 2 minutes (160 s).
Total Time: 2 minutes (160 s)
Completed : Wed Feb 11 04:05:50 PM CST 2026
```
= check-flang-rt ==========================================
```
Testing Time: 1.55s
Total Discovered Tests: 258
Passed: 258 (100.00%)
FLANG Container Test completed 0 minutes (55 s).
Total Time: 0 minutes (56 s)
Completed : Wed Feb 11 04:08:32 PM CST 2026
```
= llvm-test-suite ==========================================
```
Testing Time: 1886.64s
Total Discovered Tests: 6926
Passed: 6926 (100.00%)
CCE SLES Container debug compile completed 31 minutes (1895 s).
CCE SLES Container debug install completed in 0 minutes (0 s).
Total Time: 31 minutes (1895 s)
Completed : Wed Feb 11 05:46:52 PM CST 2026
```
Additionally, (FYI) an executable test has been written and will be
added to the llvm-test-suite under a separate PR.
---------
Co-authored-by: Kevin Wyatt <kwyatt@hpe.com>
Summary:
This enables primarily `stop.cpp` and `descriptor.cpp`. Requires a
little bit of wrangling to get it to compile. Unlike the CUDA build,
this build uses an in-tree libc++ configured for the GPU. This is
configured without thread support, environment, or filesystem, and it is
not POSIX at all. So, no mutexes, pthreads, or get/setenv.
I tested stop, but I don't know if it's actually legal to exit from
OpenMP offloading.
Add specific lowering and entry point for cudaStreamDestroy. Since we
keep the associated stream for some allocations, we need to reset it
when the stream is destroyed so that we don't use it anymore.
This is a follow-up on #182635.
It was suggested to place `static_assert(std::is_trivially_destructible_v<A>)`
for the `OwningPtr` class. This cannot be done, because there are
non-trivially destructible types used with `OwningPtr` (e.g. lots of types
that inherit from `IoErrorHandler`, which is not trivially destructible).
This patch brings back the destructor call into `OwningPtr::delete_ptr`
just to be on the safe side (though, I do not think we had any memory
leaks even without the destructor call), and removes the cyclic
dependency for the `~ChildIo()` caused by `previous_` member.
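To illustrate why the destructor call matters for non-trivially destructible types, here is a sketch built on `std::unique_ptr` with a custom deleter. It is not the flang-rt `OwningPtr` class; `FreeingDeleter`, `OwningPtrSketch`, `MakeOwned`, and `Tracked` are hypothetical names, and plain `malloc`/`free` stand in for the runtime's allocator.

```cpp
#include <cstdlib>
#include <memory>
#include <new>
#include <utility>

// Hypothetical deleter: run the destructor explicitly, then release the
// raw storage. Skipping the destructor call would only be safe for
// trivially destructible types.
template <typename A> struct FreeingDeleter {
  void operator()(A *p) const {
    if (p) {
      p->~A();
      std::free(p);
    }
  }
};

template <typename A>
using OwningPtrSketch = std::unique_ptr<A, FreeingDeleter<A>>;

// Construct an A in malloc'ed storage (assumes A is not over-aligned).
template <typename A, typename... Args>
OwningPtrSketch<A> MakeOwned(Args &&...args) {
  void *storage = std::malloc(sizeof(A));
  if (!storage) {
    return nullptr;
  }
  return OwningPtrSketch<A>{new (storage) A(std::forward<Args>(args)...)};
}

// Example type with a non-trivial destructor, used to observe the call.
struct Tracked {
  static inline int destroyed = 0;
  ~Tracked() { ++destroyed; }
};
```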
ASYNCHRONOUS="YES" is not permitted for either a parent or child data
transfer statement in ISO Fortran (F'2023 12.6.4.8.3 p19). Not that it
matters much -- we don't support true asynchronous I/O anyway -- but
someday we might, and in the meantime it's nice to be able to pass tests
that check conformance.
Add an environment variable (FORT_NO_EMPTY_ALLOCATION) that, when set to
1, changes the behavior of an ALLOCATE statement so that it will fail on
an empty allocation rather than its default behavior of allocating one
byte.
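A sketch of the switch described above. Only the variable name `FORT_NO_EMPTY_ALLOCATION` and the one-byte default come from the description; the helper names, the exact-match test against "1", and the use of `malloc` are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: true when the environment variable is set to "1".
bool NoEmptyAllocationRequested() {
  const char *value = std::getenv("FORT_NO_EMPTY_ALLOCATION");
  return value != nullptr && std::strcmp(value, "1") == 0;
}

// Hypothetical helper: returns nullptr to signal a failed allocation.
void *AllocateBytes(std::size_t bytes, bool noEmptyAllocation) {
  if (bytes == 0) {
    if (noEmptyAllocation) {
      return nullptr; // empty allocation is reported as a failure
    }
    bytes = 1; // default behavior: allocate one byte instead
  }
  return std::malloc(bytes);
}
```

A caller would pass `NoEmptyAllocationRequested()` as the second argument, typically reading the environment once at startup.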
Summary:
Expands on the previous support to enable formatted output, characters,
and checking basic iostat. We intentionally do not handle cases where
the descriptor is non-null as this is a non-trivial class that cannot
easily be shepherded across the wire.
This fixes the test on macOS. Without this change the SDK sysroot is not
set and so the library path is incorrect and the 'System' library cannot
be found.
Test with https://github.com/llvm/llvm-project/pull/182501 so that the
sysroot variable is correctly set.
Assisted-by: Codex
Summary:
This PR provides the minimal support for Fortran I/O coming from a GPU
in OpenMP offloading. We use the same support the `libc` uses for its
printing through the RPC server. The helper functions `rpc::dispatch`
and `rpc::invoke` help make this mostly automatic.
Because Fortran I/O is not reentrant, the vast majority of complexity
comes from needing to stitch together calls from the GPU until they can
be executed all at once. This is needed not only because of the
limitations of recursive I/O, but also because the output would
otherwise be interleaved by the GPU's lock-step execution.
As such, the return values from the intermediate functions are
meaningless, all returning true. The final value is correct however. For
cookies we create a context pointer on the server to chain these
together.
Works on both my AMD and NVIDIA GPUs.
```fortran
program hello_gpu
implicit none
!$omp target teams num_teams(1)
!$omp parallel num_threads(2)
! Print strings
print *, "Hello from GPU"
!$omp end parallel
!$omp end target teams
end program hello_gpu
```
```console
> flang hello.f90 -O2 -fopenmp --offload-arch=gfx1030
> ./a.out
Hello from GPU
Hello from GPU
> flang hello.f90 -O2 -fopenmp --offload-arch=sm_89
> ./a.out
Hello from GPU
Hello from GPU
```
The unit test `Reductions.InfSums` defines a test array descriptor with
shape 2x3 (i.e. 6 elements), but only provides values for 2 elements.
The result is a read of likely uninitialized memory when accessing the
additional 4 elements. In most cases the additional values get gobbled
up by the infinity, but if one happens to be NaN or the negated
infinity, the result becomes NaN and the test fails.
Fix by reducing the shape of the test array to 2. Fixes the flakiness of
the test on the flang-x86_64-windows buildbot.
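The arithmetic behind the flakiness can be shown with a plain summation loop. `SumAll` is a hypothetical stand-in for the reduction under test, not the runtime routine.

```cpp
#include <cmath>
#include <initializer_list>

// Hypothetical stand-in for the reduction: a straightforward sum.
float SumAll(std::initializer_list<float> values) {
  float sum = 0.0f;
  for (float v : values) {
    sum += v;
  }
  return sum;
}
```

`SumAll({INFINITY, 1.0f})` stays infinite because a finite stray value is absorbed, whereas `SumAll({INFINITY, -INFINITY})` and any sum that includes a NaN produce NaN, which is what an unlucky uninitialized element triggered.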
Implement `F_C_STRING` to convert a Fortran string to a C
null-terminated string. Documented in F2023 Standard: 18.2.3.9
`F_C_STRING (STRING [, ASIS])`.
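The expected result can be modeled in C++ as follows, assuming that trailing blanks are trimmed unless ASIS is true and that a null character is always appended. `FCString` is a hypothetical model of the semantics, not the runtime entry point.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical model of F_C_STRING(STRING [, ASIS]).
std::string FCString(std::string_view fortranString, bool asis = false) {
  std::string_view body = fortranString;
  if (!asis) {
    // Drop trailing blanks, as TRIM would.
    const std::size_t last = body.find_last_not_of(' ');
    body = (last == std::string_view::npos) ? std::string_view{}
                                            : body.substr(0, last + 1);
  }
  std::string result(body);
  result.push_back('\0'); // explicit C null terminator as the last element
  return result;
}
```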
When the actual argument associated with the VALUES= dummy argument of
the intrinsic subroutine DATE_AND_TIME has fewer than eight elements, we
crash with an internal error in the runtime.
With this patch, the compiler now checks the size of the vector at
compilation time, when it is known, and gracefully copes with a short
vector at execution time otherwise, without crashing.
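The execution-time part of the fix amounts to never writing past the end of the caller's array. A sketch with a hypothetical helper, `StoreDateTimeValues`, which is not the runtime function:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical helper: copy at most as many of the eight date/time fields
// as the caller's VALUES array can hold, and return the number stored.
std::size_t StoreDateTimeValues(const std::array<int, 8> &fields, int *values,
    std::size_t valuesCount) {
  const std::size_t n = std::min(valuesCount, fields.size());
  std::copy_n(fields.begin(), n, values);
  return n;
}
```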
During the check for availability of `strerror_r`, the host include file is used. This doesn't matter for AMDGPU since it actually performs the link step during `check_cxx_symbol_exists`. But for NVPTX, due to `-c`, it doesn't link and then incorrectly assumes that the symbol exists.
For now, removing `io-error.cpp` from the list of GPU sources is the most sensible option since it's unused.
When the I/O runtime fails to allocate storage for an I/O unit, the
error message cites a source location in the runtime library, not the
user program. Thread instances of Terminator through to the code that
attempts the allocation so that the failure has a source position in the
user's program.
Reapply #152189 and #174963, which were reverted because they broke
publish-sphinx-docs and publish-doxygen-docs.
The build mode has been deprecated in #136314 and was supposed to be
removed in the LLVM 21 release (#136314).
OpenMP currently supports 4 build modes:
* `cmake <llvm-project>/llvm -DLLVM_ENABLE_PROJECTS=openmp`
* `cmake <llvm-project>/llvm -DLLVM_ENABLE_RUNTIMES=openmp` (bootstrapping build)
* `cmake <llvm-project>/openmp` (standalone build)
* `cmake <llvm-project>/runtimes -DLLVM_ENABLE_RUNTIMES=openmp` (runtimes default/standalone build)
Each build mode increases the maintenance overhead, since all build
modes must continue working, and causes user confusion when they do not
(see #151117, #174126, #154117, ...). Let's finally remove it.
This PR proposes to add `Stop` and `ErrorStop` PRIF call procedures to the MIF
dialect. If the `-fcoarray` flag is passed, then all calls to `STOP` and `ERROR
STOP` will use those of PRIF in flang-rt. These procedures are registered
during initialization (mif::InitOp).
---------
Co-authored-by: Dan Bonachea <dobonachea@lbl.gov>
#175373 introduces a divide-by-zero to create an infinity value, which
is undefined behavior in C++ (C++20 [expr.mul] §4), even when done at
compile-time. This causes flaky test failures on the
flang-x86_64-windows buildbot. Visual Studio 2026 refuses to compile it
with error C2124.
Use `std::numeric_limits<float>::infinity()` instead.
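A minimal sketch of the replacement. `PositiveInfinity` is a hypothetical wrapper used only to make the point testable.

```cpp
#include <limits>

// Well-defined way to obtain +infinity. Writing `1.0f / 0.0f` instead is
// undefined behavior in C++, even in a constant expression.
static_assert(std::numeric_limits<float>::has_infinity,
    "this sketch assumes the float type can represent infinity");

constexpr float PositiveInfinity() {
  return std::numeric_limits<float>::infinity();
}
```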
Derived-type components that have the `ALLOCATABLE` or `POINTER`
attribute as well as the CUDA `MANAGED` or `UNIFIED` attribute need to
have a specific allocator index set in the descriptor so the allocation
is done correctly. Without this, the allocation is done in host memory
and will trigger illegal read or write if the component is used on the
device. The correct allocator index was set some time ago for the
`DEVICE` attribute, but the `MANAGED` and `UNIFIED` attributes need the
same mechanism.
Since `Component::Genre` has quite a bit of room, I opted to add
specific genres for allocatable and pointer components with the managed
or unified attribute.
@klausler Let me know if you would prefer another solution. I was
thinking about a separate field but I wanted to avoid wasting some
bytes.
Reapply #152189 which was reverted because it broke publish-sphinx-docs.
The build mode has been deprecated in #136314. According to the
deprecation message, it was supposed to be removed in the LLVM 21
release. Each build mode increased the maintenance overhead when
failing, such as in #151117.
The less accurate clock was being adjusted for twice: once in
`GetSystemClockCountRate` and again in `ConvertTimevalToCount`.
Also adding missing `static` specifiers I noticed whilst reading the
file. I don't know of a way of meaningfully testing this in the
repository, but the code in the ticket now produces the correct result.
Fixes #176505
Corrected various spelling mistakes such as 'occurred', 'receiver',
'initialized', 'length', and others in comments, variable names,
function names, and documentation throughout the project. These
changes improve code readability and maintain consistency in naming
and documentation.
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
Summary:
We're starting to provide the GPU version of the Fortran runtime with
the GPU cross-build semantics. This does not support tests right now but
will attempt to build the unit tests and fail to find gtest for the GPU.
Disable this for now so it can build.
Summary:
Because we link the `cxx` target directly we do not need to use this
flag, that's also why we pass `-nostdinc++` which makes this an unused
command line flag, hence the warning.
There are six instances of Kahan's extended precision summation
algorithm in flang/flang-rt, and they share a bug: the calculation of
the correction value produces a NaN due to the subtraction Inf-Inf after
the accumulation saturates to Inf. This leads to the surprising NaN
result from SUM([Inf, 0.]).
This bug doesn't affect run-time calculation of SUM when optimization is
enabled -- lowering emits an open-coded SUM that lacks Kahan summation
-- but it does affect compilation-time folding and -O0 runtime results.
Fix the one instance of Kahan summation in the runtime, and consolidate
the other five instances in Evaluate into one new member function, also
corrected.
Fixes https://github.com/llvm/llvm-project/issues/89528.
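The failure mode and one possible guard can be sketched as below. This is not necessarily how the patch fixes it: `KahanSum` is a hypothetical function, and resetting the correction once the running sum is no longer finite is an assumption chosen for illustration.

```cpp
#include <cmath>
#include <initializer_list>

// Kahan (compensated) summation with a guard for non-finite sums.
double KahanSum(std::initializer_list<double> values) {
  double sum = 0.0;
  double correction = 0.0;
  for (double x : values) {
    const double y = x - correction;
    const double t = sum + y;
    if (std::isfinite(t)) {
      correction = (t - sum) - y; // the usual compensation term
    } else {
      // Without this branch, (t - sum) - y evaluates Inf - Inf = NaN once
      // the sum has saturated, and the NaN poisons every later addend.
      correction = 0.0;
    }
    sum = t;
  }
  return sum;
}
```

With the guard, `KahanSum({INFINITY, 0.0})` stays infinite, while a genuinely undefined sum such as `{INFINITY, -INFINITY}` still yields NaN.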
When building the flang-rt project with the g++ compiler on a Linux
x86_64 machine, the compiler gives the following warning:
```
llvm-project/flang-rt/lib/runtime/extensions.cpp:455:26: warning: left shift count is negative [-Wshift-count-negative]
455 | mask = ~(unsigned)0u << ((8 - digits) * 4 + 1);
| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
```
For the full discussion, see:
https://github.com/llvm/llvm-project/pull/173955
Co-authored-by: liao jun <liaojun@ultrarisc.com>
The build mode has been deprecated in #136314. According to the
deprecation message, it was supposed to be removed in the LLVM 21
release. Each build mode increased the maintenance overhead when
failing, such as in #151117.
Let's remove it in LLVM 22.
When building with `FLANG_RUNTIME_F128_MATH_LIB=libquadmath`, the tests
```
flang-rt :: Driver/ctofortran.f90
flang-rt :: Driver/exec.f90
```
also depend on `libflang_rt.quadmath.a`. Add a dependency to ensure it
is built with `ninja check-flang-rt`.