Introduce common infrastructure for runtimes that determines compiler
resource path locations. These variables introduced are:
* RUNTIMES_OUTPUT_RESOURCE_DIR
* RUNTIMES_INSTALL_RESOURCE_PATH
That contain the location for the compiler resource path (typically
`lib/clang/<version>`) in the build tree and the install tree (the
latter relative to CMAKE_INSTALL_PREFIX).
Additionally, define
* RUNTIMES_OUTPUT_RESOURCE_LIB_DIR
* RUNTIMES_INSTALL_RESOURCE_LIB_PATH
as for the location of clang/flang version-locked libraries (typically
`lib${LLVM_LIBDIR_SUFFIX}/<targer-triple>`, but also depends on `APPLE`
and `LLVM_ENABLE_PER_TARGET_RUNTIME_DIR`). This code is moved from
flang-rt and initially becomes its only user.
Refactored out of #171610 as requested
[here](https://github.com/llvm/llvm-project/pull/171610#discussion_r2687382481).
Extracted `get_runtimes_target_libdir_common` from compiler-rt as
requested
[here](https://github.com/llvm/llvm-project/pull/171610#discussion_r2689565634).
Added TODO comments to all runtimes as requested
[here](https://github.com/llvm/llvm-project/pull/171610#issuecomment-3789598635).
Summary:
This PR primarily changes using `LLVM_RUNTIMES_TARGET` to
`LLVM_DEFAULT_TARGET_TRIPLE`. The reason is that the default target
triple is the true cross-compiling architecture we are using, while the
runtimes_target can contain multilib strings like `+debug` or similar.
Additionally add the proper path handling to the OpenMP / Offload
libraries.
If a project depends on the Flang runtime and on libc++, linking fails
because `std::__libcpp_verbose_abort` is defined in both libraries.
Avoid that duplicate definition by defining `_LIBCPP_VERBOSE_ABORT`
before including any C++ headers and by renaming that symbol in the
Flang runtime to `flang_rt_verbose_abort`.
The function that is modified was originally introduced in D158957 to
solve an undefined symbol error when linking pure-Fortran projects with
the Flang runtime.
Providing a definition for that symbol in the Flang runtime might work
correctly for ELF or Mach-O if that symbol has weak linkage in libc++.
But at least for COFF, this now causes multiple-definition errors for
projects that are linking to the Flang runtime and to libc++.
The linker errors before this change for Windows/MinGW using
Clang+Flang+lld look like this:
```
ld.lld: error: duplicate symbol: std::__1::__libcpp_verbose_abort(char const*, ...)
>>> defined at libflang_rt.runtime.a(io-api-minimal.cpp.obj)
>>> defined at libc++.dll.a(libc++.dll)
```
Summary:
This PR provides the minimal support for Fortran I/O coming from a GPU
in OpenMP offloading. We use the same support the `libc` uses for its
printing through the RPC server. The helper functions `rpc::dispatch`
and `rpc::invoke` help make this mostly automatic.
Becaus Fortran I/O is not reentrant, the vast majority of complexity
comes from needing to stitch together calls from the GPU until they can
be executed all at once. This is needed not only because of the
limitations of recursive I/O, but without this the output would all be
interleaved because of the GPU's lock-step execution.
As such, the return values from the intermediate functions are
meaningless, all returning true. The final value is correct however. For
cookies we create a context pointer on the server to chain these
together.
Works on both my AMD and NVIDIA GPUs.
```fortran
program hello_gpu
implicit none
!$omp target teams num_teams(1)
!$omp parallel num_threads(2)
! Print strings
print *, "Hello from GPU"
!$omp end parallel
!$omp end target teams
end program hello_gpu
```
```console
> flang hello.f90 -O2 -fopenmp --offload-arch=gfx1030
> ./a.out
Hello from GPU
Hello from GPU
> flang hello.f90 -O2 -fopenmp --offload-arch=sm_89
> ./a.out
Hello from GPU
Hello from GPU
```
Reapplication of #137828, changes:
* Workaround CMAKE_Fortran_PREPROCESS_SOURCE issue for CMake < 2.24: The
issue is that `try_compile` does not forward manually-defined compiler
flang variables to the test build environment; instead of just a
negative test result, it aborts the configuration step itself. To be
fair, manually defining these variables is deprecated since at least
CMake 3.6.
* Missing flang cmd line flags for CMake < 3.28 `-target=`, `-O2`, `-O3`
* It is now possible to set FLANG_RT_ENABLED_STATIC=OFF and
FLANG_RT_ENABLE_SHARED=OFF at the same and is the default for amdgpu and
nvptx targets. In this mode, only the .mod files are compiled --
necessary for module files in
lib/clang/22/finclude/flang/(nvptx64-nvidia-cuda|amdgpu-amd-amdhsa)/*.mod
to be available.
* For compiling omp_lib.mod for nvptx and amdgpu, the module build
functionality must be hoisted out if openmp's runtime/ directory which
is only included for host targets. This PR now requires #169909.
Move building the .mod files from openmp/flang to openmp/flang-rt using
a shared mechanism. Motivations to do so are:
1. Most modules are target-dependent and need to be re-compiled for each
target separately, which is something the LLVM_ENABLE_RUNTIMES system
already does. Prime example is `iso_c_binding.mod` which encodes the
target's ABI. Constants such as [`c_long_double` also have different
values](d748c81218/flang-rt/lib/runtime/iso_c_binding.f90 (L77-L81)).
Most other modules have `#ifdef`-enclosed code as well. For instance
this caused offload targets nvptx64-nvidia-cuda/amdgpu-amd-amdhsa to use
the modules files compiled for the host which may contrain uses of the
types REAL(10) or REAL(16) not available for nvptx/amdgpu.
#146876#128015#129742#158790
3. CMake has support for Fortran that we should use. Among other things,
it automatically determines module dependencies so there is no need to
hardcode them in the CMakeLists.txt.
4. It allows using Fortran itself to implement Flang-RT. Currently, only
`iso_fortran_env_impl.f90` emits object files that are needed by Fortran
applications (#89403). The workaround of #95388 could be reverted (PR
#169525).
If using Flang for cross-compilation or target-offloading, flang-rt must
now be compiled for each target not only for the library, but also to
get the target-specific module files. For instance in a bootstrapping
runtime build, this can be done by adding:
`-DLLVM_RUNTIME_TARGETS=default;nvptx64-nvidia-cuda;amdgpu-amd-amdhsa`.
Some new dependencies come into play:
* openmp depends on flang-rt for building `lib_omp.mod` and
`lib_omp_kinds.mod`. Currently, if flang-rt is not found then the
modules are not built.
* check-flang depends on flang-rt: If not found, the majority of tests
are disabled. If not building in a bootstrpping build, the location of
the module files can be pointed to using
`-DFLANG_INTRINSIC_MODULES_DIR=<path>`, e.g. in a flang-standalone
build. Alternatively, the test needing any of the intrinsic modules
could be marked with `REQUIRES: flangrt-modules`.
* check-flang depends on openmp: Not a change; tests requiring
`lib_omp.mod` and `lib_omp_kinds.mod` those are already marked with
`openmp_runtime`.
As intrinsic are now specific to the target, their location is moved
from `include/flang` to `<resource-dir>/finclude/flang/<triple>`. The
mechnism to compute the location have been moved from flang-rt
(previously used to compute the location of `libflang_rt.*.a`) to common
locations in `cmake/GetToolchainDirs.cmake` and
`runtimes/CMakeLists.txt` so they can be used by both, openmp and
flang-rt. Potentially the mechnism could also be shared by other
libraries such as compiler-rt.
`finclude` was chosen because `gfortran` uses it as well and avoids
misuse such as `#include <flang/iso_c_binding.mod>`. The search location
is now determined by `ToolChain` in the driver, instead of by the
frontend. Another subdirectory `flang` avoids accidental inclusion of
gfortran-modules which due to compression would result in
user-unfriendly errors. Now the driver adds `-fintrinsic-module-path`
for that location to the frontend call (Just like gfortran does).
`-fintrinsic-module-path` had to be fixed for this because ironically it
was only added to `searchDirectories`, but not
`intrinsicModuleDirectories_`. Since the driver determines the location,
tests invoking `flang -fc1` and `bbc` must also be passed the location
by llvm-lit. This works like llvm-lit does for finding the include dirs
for Clang using `-print-file-name=...`.
Move building the .mod files from openmp/flang to openmp/flang-rt using
a shared mechanism. Motivations to do so are:
1. Most modules are target-dependent and need to be re-compiled for each
target separately, which is something the LLVM_ENABLE_RUNTIMES system
already does. Prime example is `iso_c_binding.mod` which encodes the
target's ABI. Most other modules have `#ifdef`-enclosed code as well.
2. CMake has support for Fortran that we should use. Among other things,
it automatically determines module dependencies so there is no need to
hardcode them in the CMakeLists.txt.
3. It allows using Fortran itself to implement Flang-RT. Currently, only
`iso_fortran_env_impl.f90` emits object files that are needed by Fortran
applications (#89403). The workaround of #95388 could be reverted.
Some new dependencies come into play:
* openmp depends on flang-rt for building `lib_omp.mod` and
`lib_omp_kinds.mod`. Currently, if flang-rt is not found then the
modules are not built.
* check-flang depends on flang-rt: If not found, the majority of tests
are disabled. If not building in a bootstrpping build, the location of
the module files can be pointed to using
`-DFLANG_INTRINSIC_MODULES_DIR=<path>`, e.g. in a flang-standalone
build. Alternatively, the test needing any of the intrinsic modules
could be marked with `REQUIRES: flangrt-modules`.
* check-flang depends on openmp: Not a change; tests requiring
`lib_omp.mod` and `lib_omp_kinds.mod` those are already marked with
`openmp_runtime`.
As intrinsic are now specific to the target, their location is moved
from `include/flang` to `<resource-dir>/finclude/flang/<triple>`. The
mechnism to compute the location have been moved from flang-rt
(previously used to compute the location of `libflang_rt.*.a`) to common
locations in `cmake/GetToolchainDirs.cmake` and
`runtimes/CMakeLists.txt` so they can be used by both, openmp and
flang-rt. Potentially the mechnism could also be shared by other
libraries such as compiler-rt.
`finclude` was chosen because `gfortran` uses it as well and avoids
misuse such as `#include <flang/iso_c_binding.mod>`. The search location
is now determined by `ToolChain` in the driver, instead of by the
frontend. Now the driver adds `-fintrinsic-module-path` for that
location to the frontend call (Just like gfortran does).
`-fintrinsic-module-path` had to be fixed for this because ironically it
was only added to `searchDirectories`, but not
`intrinsicModuleDirectories_`. Since the driver determines the location,
tests invoking `flang -fc1` and `bbc` must also be passed the location
by llvm-lit. This works like llvm-lit does for finding the include dirs
for Clang using `-print-file-name=...`.
Define the _GLIBCXX_THROW_OR_ABORT macro to not use its _EXC argument. _EXC may contain an expression constructing an std::exception object which is non-inline and therefore require a link dependency on the libstdc++ runtime. In -fno-exceptions builds it is typically optimized away when appearing in unreachable code, but is still present when compiling with -O0 when compiling with Clang.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
Clang on Darwin enables non-POSIX extensions by default.
This causes some macros to leak, such as HUGE from <math.h>, which
causes some conflicts with Flang symbols (but not with Flang-RT, for
now).
It also causes some Flang-RT extensions to be disabled, such as FDATE,
that checks for _POSIX_C_SOURCE. Setting _POSIX_C_SOURCE avoids these
issues. This is already being done in Flang, but it was not ported to
Flang-RT.
This also fixes check-flang-rt on Darwin, as NoArgv.FdateNotSupported
is broken since the flang runtime was moved to flang-rt.
Fixes#82036
If the macro was previously defined with `-Wp,-D` then a later `-U` is
*not* going to take effect, the `-Wp` flag takes precedence.
Instead use `-Wp,-U`. This works regardless of whether the original
definition was provided via `-D` or `-Wp,-D`.
Also make sure these flags only get passed to the C++ compiler -- they
are only relevant there, and flang does not support `-Wp`.
Since GCC 15.1, libstdc++ enabled assertions/hardening by default in
non-optimized (-O0) builds [1]. That is, _GLIBCXX_ASSERTIONS is defined
in the libstdc++ headers itself so defining/undefining it on the
compiler command line no longer has an effect in non-optimized builds.
As the commit message[2] suggests, define _GLIBCXX_NO_ASSERTIONS
instead.
For libstdc++ headers before 15.1, -U_GLIBCXX_ASSERTIONS still has to be
on the command line as well.
Defining _GLIBCXX_NO_ASSERTIONS was previously proposed in #152223
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112808
[2] 361d230fd7
This fixes an issue where if the build folder is no longer present flang
cannot link anything on Windows because the path to compiler-rt in the
binary is hard-coded. Flang already links compiler-rt on Windows so it
isn't necessary for flang-rt to specify that it depends on compiler-rt
at all, other than for the unit tests, so instead we can move that logic
into the unit test compile lines.
Followup to https://github.com/llvm/llvm-project/pull/143508
This required adding another alternative implementation of time
intrinsics to match what is available in older MacOS.
With this change, flang can be used to build programs for older versions
of MacOS.
Co-authored-by: David Truby <david.truby@arm.com>
Summary:
This patch adds initial support for compiling `flang-rt` directly for
the GPU. The method used here matches what's already done for `libc` and
`libc++` for the GPU and builds off of those projects.
Mainly this requires setting up some flags and setting the sources that
currently work. This will deposit the resulting library in the
appropriate directory. These files are then intended to be linked via
`-Xoffload-linker` support in the offloading driver.
```
lib/clang/21/lib/nvptx64-nvidia-cuda/libflang_rt.runtime.a
lib/clang/21/lib/amdgcn-amd-amdhsa/libflang_rt.runtime.a
```
This is obviously missing a lot of functions, mainly the `io` support.
Most of what we cannot support is due to using POSIX things that just
don't make sense on the GPU. Stuff like `pthreads` or `sema`.
Getting unit tests to run on this will also be a challenge. We could run
tests the same way we do with `libc`, but the problem there is that the
`libc` test suite is freestanding while `gtest` currently doesn't
compile on the GPU bcause it uses a lot of weird stuff. If the unit
tests were simply `int main` then it would work.
I don't understand the actual runtime code very well, I'd appreciate
some guidance on how to actually support Fortran IO from this interface.
As I understand it, Fortran IO requires a stack-like operation, which
conflicts with the SIMT model GPUs use. Worst case scenario we could
burn some LDS to keep a stack, or serialize it somehow since we can
always just iterate over all the active lanes.
Building this right now looks like this, which depends on the arguments
added in https://github.com/llvm/llvm-project/pull/131695.
```
-DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES=compiler-rt;libc;libcxx;libcxxabi;flang-rt \
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=compiler-rt;libc;libcxx;libcxxabi;flang-rt \
-DRUNTIMES_nvptx64-nvidia-cuda_FLANG_RT_LIBC_PROVIDER=llvm \
-DRUNTIMES_nvptx64-nvidia-cuda_FLANG_RT_LIBCXX_PROVIDER=llvm \
-DRUNTIMES_amdgcn-amd-amdhsa_FLANG_RT_LIBC_PROVIDER=llvm \
-DRUNTIMES_amdgcn-amd-amdhsa_FLANG_RT_LIBCXX_PROVIDER=llvm
```
Summary:
This patch adds an interface that uses an in-tree build of LLVM's libc
and libc++.
This is done using the `-DFLANG_RT_LIBC_PROVIDER=llvm` and
`-DFLANG_RT_LIBCXX_PROVIDER=llvm` options. Using `libc` works in terms
of CMake, but the LLVM libc is not yet complete enough to compile all
the files.
This patch is to explicitly link the pthread library when building
shared flang-rt.
On AIX, for example, it needs to link in `libpthread.a` explicitly in
order to resolve the references to those `pthread_*` functions in
`include/flang-rt/runtime/lock.h`
Under non-Windows platforms, also create a dynamic library version of
the runtime. Build of either version of the library can be switched on
using FLANG_RT_ENABLE_STATIC=ON respectively FLANG_RT_ENABLE_SHARED=ON.
Default is to build only the static library, consistent with previous
behaviour. This is because the way the flang driver invokes the linker,
most linkers choose the dynamic library by default, if available.
Building the dynamic library therefore causes flang-built executables to
depend on `libflang_rt.so`, unless explicitly told otherwise.
Extract Flang's runtime library to use the LLVM_ENABLE_RUNTIME
mechanism. It will only become active when
`LLVM_ENABLE_RUNTIMES=flang-rt` is used, which also changes the
`FLANG_INCLUDE_RUNTIME` to `OFF` so the old runtime build rules do not
conflict. This also means that unless `LLVM_ENABLE_RUNTIMES=flang-rt` is
passed, nothing changes with the current build process.
Motivation:
* Consistency with LLVM's other runtime libraries (compiler-rt, libc,
libcxx, openmp offload, ...)
* Allows compiling the runtime for multiple targets at once using the
LLVM_RUNTIME_TARGETS configuration options
* Installs the runtime into the compiler's per-target resource directory
so it can be automatically found even when cross-compiling
Also see RFC discussion at
https://discourse.llvm.org/t/rfc-use-llvm-enable-runtimes-for-flangs-runtime/80826