Corrected various spelling mistakes such as 'occurred', 'receiver',
'initialized', 'length', and others in comments, variable names,
function names, and documentation throughout the project. These
changes improve code readability and maintain consistency in naming
and documentation.
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
The build mode has been deprecated in #136314. According to the
deprecation message, it was supposed to be removed in the LLVM 21
release. Each build mode increased the maintenance overhead when
failing, such as in #151117.
Let's remove it in LLVM 22.
Summary:
Recent OpenMP patches have added real support for virtual functions on
the device side. However, we don't provide some of the C++ ABI functions
that are emitted when using these. In practice these functions just call
`std::terminate` so we can just trap here. These are marked weak so they
will be overridden by a more correct implementation if one is defined and
will also not extract on their own from a static library.
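As a rough illustration of the pattern described above, the sketch below
defines two weak stubs that simply trap. It assumes the functions in
question are the Itanium C++ ABI hooks `__cxa_pure_virtual` and
`__cxa_deleted_virtual`; the commit message does not name them, so treat
the names as an assumption rather than the actual list.

```c
/* Sketch only: the function names are an assumption, not taken from the
 * commit message. Both are Itanium C++ ABI hooks that a C++ runtime
 * would normally route to std::terminate. */

/* Weak: a strong definition elsewhere in the link replaces this one. */
__attribute__((weak)) void __cxa_pure_virtual(void) {
  __builtin_trap(); /* std::terminate is not available here, so trap. */
}

__attribute__((weak)) void __cxa_deleted_virtual(void) {
  __builtin_trap();
}
```

Because the symbols are weak, a fuller C++ runtime linked into the same
image would take precedence over these fallbacks.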
These commits fix issues regarding storage of tool data within
libomptarget. Both libomp and libomptarget have been modified to
accommodate this. We differentiate between two cases depending on the
type of the target region:
- merged target regions (default, without `nowait` clause): behavior
remains unchanged, tool data is stored in the thread local
RegionInterface class within libomptarget.
- deferred target regions (using `nowait` clause): tool data is moved to
`ompt_task_info_t` struct within libomp, as `RegionInterface` is thread
local and its data is lost whenever another task is scheduled on the
thread, which happens with deferred target regions.
In the new implementation, `RegionInterface` receives pointers to
`ompt_task_info_t` within libomp which are handled transparently within
libomptarget. Thus, the problem of tool data getting lost when a thread
receives a new task is resolved: `target_data` and `target_task_data`
remain set.
Another issue was the value of `task_data` which is supposed to belong
to the generating task of the region according to the OpenMP standard,
but instead had been set to the `task_data` of the target task itself
until now.
Test cases have been added which check both of these fixes.
---------
Co-authored-by: Joachim <jenke@itc.rwth-aachen.de>
This is needed as a way to support older code that was expecting
unconditional attachment to happen for cases like:
```c
int *p;
int x;
#pragma omp target enter data map(p) // (A)
#pragma omp target enter data map(x) // (B)
p = &x;
// By default, this does NOT attach p and x
#pragma omp target enter data map(p[0:0]) // (C)
```
When the environment variable is set, such maps, where both the pointer
and the pointee already have corresponding copies on the device, but are
not attached to one another, will be attached as-if OpenMP 6.1 TR14's
`attach(always)` map-type-modifier was specified on `(C)`.
When llvm-symbolizer is not found on PATH, TSan uses the system's addr2line
instead. On Ubuntu 22.04 addr2line can't handle DWARF v5, which results
in failures in some libarcher tests.
This PR adds the directory of the just built LLVM binaries to PATH, to
make llvm-symbolizer available to TSan.
The changes were tested on an AArch64 machine, on which
task-taskgroup-unrelated.c was flaky. Moving the test code to a separate
function, executed 10 times, solved the issue.
Fixes #170138
Reapplication of #137828, changes:
* Workaround CMAKE_Fortran_PREPROCESS_SOURCE issue for CMake < 2.24: The
issue is that `try_compile` does not forward manually-defined compiler
flang variables to the test build environment; instead of just a
negative test result, it aborts the configuration step itself. To be
fair, manually defining these variables is deprecated since at least
CMake 3.6.
* Missing flang cmd line flags for CMake < 3.28 `-target=`, `-O2`, `-O3`
* It is now possible to set FLANG_RT_ENABLE_STATIC=OFF and
FLANG_RT_ENABLE_SHARED=OFF at the same time, which is the default for
amdgpu and
nvptx targets. In this mode, only the .mod files are compiled --
necessary for module files in
lib/clang/22/finclude/flang/(nvptx64-nvidia-cuda|amdgpu-amd-amdhsa)/*.mod
to be available.
* For compiling omp_lib.mod for nvptx and amdgpu, the module build
functionality must be hoisted out of openmp's runtime/ directory which
is only included for host targets. This PR now requires #169909.
Move building the .mod files from openmp/flang to openmp/flang-rt using
a shared mechanism. Motivations to do so are:
1. Most modules are target-dependent and need to be re-compiled for each
target separately, which is something the LLVM_ENABLE_RUNTIMES system
already does. Prime example is `iso_c_binding.mod` which encodes the
target's ABI. Constants such as [`c_long_double` also have different
values](d748c81218/flang-rt/lib/runtime/iso_c_binding.f90 (L77-L81)).
Most other modules have `#ifdef`-enclosed code as well. For instance
this caused offload targets nvptx64-nvidia-cuda/amdgpu-amd-amdhsa to use
the module files compiled for the host which may contain uses of the
types REAL(10) or REAL(16) not available for nvptx/amdgpu.
#146876 #128015 #129742 #158790
2. CMake has support for Fortran that we should use. Among other things,
it automatically determines module dependencies so there is no need to
hardcode them in the CMakeLists.txt.
3. It allows using Fortran itself to implement Flang-RT. Currently, only
`iso_fortran_env_impl.f90` emits object files that are needed by Fortran
applications (#89403). The workaround of #95388 could be reverted (PR
#169525).
If using Flang for cross-compilation or target-offloading, flang-rt must
now be compiled for each target not only for the library, but also to
get the target-specific module files. For instance in a bootstrapping
runtime build, this can be done by adding:
`-DLLVM_RUNTIME_TARGETS=default;nvptx64-nvidia-cuda;amdgpu-amd-amdhsa`.
Some new dependencies come into play:
* openmp depends on flang-rt for building `lib_omp.mod` and
`lib_omp_kinds.mod`. Currently, if flang-rt is not found then the
modules are not built.
* check-flang depends on flang-rt: If not found, the majority of tests
are disabled. If not building in a bootstrapping build, the location of
the module files can be pointed to using
`-DFLANG_INTRINSIC_MODULES_DIR=<path>`, e.g. in a flang-standalone
build. Alternatively, the test needing any of the intrinsic modules
could be marked with `REQUIRES: flangrt-modules`.
* check-flang depends on openmp: Not a change; tests requiring
`lib_omp.mod` and `lib_omp_kinds.mod` are already marked with
`openmp_runtime`.
As intrinsic modules are now specific to the target, their location is
moved from `include/flang` to `<resource-dir>/finclude/flang/<triple>`.
The mechanism to compute the location has been moved from flang-rt
(previously used to compute the location of `libflang_rt.*.a`) to common
locations in `cmake/GetToolchainDirs.cmake` and
`runtimes/CMakeLists.txt` so it can be used by both openmp and
flang-rt. Potentially the mechanism could also be shared by other
libraries such as compiler-rt.
`finclude` was chosen because `gfortran` uses it as well and it avoids
misuse such as `#include <flang/iso_c_binding.mod>`. The search location
is now determined by `ToolChain` in the driver, instead of by the
frontend. Another subdirectory `flang` avoids accidental inclusion of
gfortran-modules which due to compression would result in
user-unfriendly errors. Now the driver adds `-fintrinsic-module-path`
for that location to the frontend call (just like gfortran does).
`-fintrinsic-module-path` had to be fixed for this because ironically it
was only added to `searchDirectories`, but not
`intrinsicModuleDirectories_`. Since the driver determines the location,
tests invoking `flang -fc1` and `bbc` must also be passed the location
by llvm-lit. This works like llvm-lit does for finding the include dirs
for Clang using `-print-file-name=...`.
Extracted out of #169638. The motivation is that we want to build
Fortran module files for device triples (amdgpu-amd-amdhsa and
nvptx64-nvidia-cuda) as well, but the runtimes/ directory is only
included for host devices.
This PR itself should not change anything, including that omp_lib.mod is
only built for the host triple. Some dependencies used for building
omp_lib.mod are hoisted out of runtimes/ as well. IMHO they all make
sense to hoist, e.g. LIBOMP_VERSION_MAJOR/LIBOMP_VERSION_MINOR should be
usable in the entire OpenMP subproject.
When a critical construct has finished, it will trigger a
critical-released event. If a tool is attached, and the `mutex_released`
callback was registered, the tool will receive an event containing the
`codeptr_ra`, the return address of the callback invocation.
All the way back in 82e94a593433f36734e2d34898d353a2ecb65b8b, this
`codeptr_ra` was implemented by calling `__ompt_load_return_address`
with a fixed global thread id of `0`. However, this approach results in
a race condition, and can yield incorrect results to the tool.
`__ompt_load_return_address(0)` points to the current return address of
the thread 0 in `__kmp_threads`. This thread may already execute some
other construct. A tool might therefore receive the return address of
e.g. some `libomp` internals, or other parts of the user code.
Additionally, a call to `__ompt_load_return_address` resets the
`th.ompt_thread_info.return_address` to `NULL`, therefore also affecting
the return address of thread 0. Another dispatched event, e.g.
parallel-begin might therefore not transfer any `codeptr_ra`.
To fix this, replace the fixed thread id by the `global_tid`, which is
stored just before dispatching the `mutex_released` callback.
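The difference between the two lookups can be sketched with a small
single-threaded model. Everything below (`thread_info_t`, `threads`,
`store_return_address`, `load_return_address`, and the two
`critical_released_*` helpers) is a hypothetical stand-in for the libomp
internals, kept only close enough to show the read-and-reset behavior
described above.

```c
#include <stddef.h>

#define MAX_THREADS 4

/* Hypothetical stand-in for the per-thread OMPT info in libomp. */
typedef struct {
  void *return_address;
} thread_info_t;

static thread_info_t threads[MAX_THREADS];

static void store_return_address(int gtid, void *addr) {
  threads[gtid].return_address = addr;
}

/* Models the described behavior: read the slot, then reset it to NULL. */
static void *load_return_address(int gtid) {
  void *addr = threads[gtid].return_address;
  threads[gtid].return_address = NULL;
  return addr;
}

/* Old behavior: always consults (and clears) thread 0's slot. */
static void *critical_released_old(int global_tid) {
  (void)global_tid;
  return load_return_address(0);
}

/* Fixed behavior: consults the slot of the releasing thread. */
static void *critical_released_fixed(int global_tid) {
  return load_return_address(global_tid);
}
```

In this model, a release on thread 1 through the old path hands the tool
thread 0's address and wipes thread 0's slot, while the fixed path only
touches the slot of the releasing thread.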
Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
Move building the .mod files from openmp/flang to openmp/flang-rt using
a shared mechanism. Motivations to do so are:
1. Most modules are target-dependent and need to be re-compiled for each
target separately, which is something the LLVM_ENABLE_RUNTIMES system
already does. Prime example is `iso_c_binding.mod` which encodes the
target's ABI. Most other modules have `#ifdef`-enclosed code as well.
2. CMake has support for Fortran that we should use. Among other things,
it automatically determines module dependencies so there is no need to
hardcode them in the CMakeLists.txt.
3. It allows using Fortran itself to implement Flang-RT. Currently, only
`iso_fortran_env_impl.f90` emits object files that are needed by Fortran
applications (#89403). The workaround of #95388 could be reverted.
Some new dependencies come into play:
* openmp depends on flang-rt for building `lib_omp.mod` and
`lib_omp_kinds.mod`. Currently, if flang-rt is not found then the
modules are not built.
* check-flang depends on flang-rt: If not found, the majority of tests
are disabled. If not building in a bootstrapping build, the location of
the module files can be pointed to using
`-DFLANG_INTRINSIC_MODULES_DIR=<path>`, e.g. in a flang-standalone
build. Alternatively, the test needing any of the intrinsic modules
could be marked with `REQUIRES: flangrt-modules`.
* check-flang depends on openmp: Not a change; tests requiring
`lib_omp.mod` and `lib_omp_kinds.mod` are already marked with
`openmp_runtime`.
As intrinsic modules are now specific to the target, their location is
moved from `include/flang` to `<resource-dir>/finclude/flang/<triple>`.
The mechanism to compute the location has been moved from flang-rt
(previously used to compute the location of `libflang_rt.*.a`) to common
locations in `cmake/GetToolchainDirs.cmake` and
`runtimes/CMakeLists.txt` so it can be used by both openmp and
flang-rt. Potentially the mechanism could also be shared by other
libraries such as compiler-rt.
`finclude` was chosen because `gfortran` uses it as well and it avoids
misuse such as `#include <flang/iso_c_binding.mod>`. The search location
is now determined by `ToolChain` in the driver, instead of by the
frontend. Now the driver adds `-fintrinsic-module-path` for that
location to the frontend call (just like gfortran does).
`-fintrinsic-module-path` had to be fixed for this because ironically it
was only added to `searchDirectories`, but not
`intrinsicModuleDirectories_`. Since the driver determines the location,
tests invoking `flang -fc1` and `bbc` must also be passed the location
by llvm-lit. This works like llvm-lit does for finding the include dirs
for Clang using `-print-file-name=...`.
Clang is adding support for the new OpenMP `transparent` clause on
`task` and `taskloop` directives.
The parsing and semantic handling for this clause is introduced in
https://github.com/llvm/llvm-project/pull/166810 .
To allow the compiler to communicate this clause to the `OpenMP`
runtime, a dedicated bit in `kmp_tasking_flags` is required.
This patch adds a new compiler-reserved bit `transparent` to the
`kmp_tasking_flags` structure.
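For readers unfamiliar with how such a flag travels from compiler to
runtime, here is a minimal sketch of a bitfield carrying a `transparent`
bit. The struct `sketch_tasking_flags_t`, its field order, and both
helpers are hypothetical; the real `kmp_tasking_flags` layout and the bit
position chosen by the patch are not reproduced here.

```c
/* Hypothetical layout: NOT the real kmp_tasking_flags field order. */
typedef struct {
  unsigned tiedness : 1;    /* task is tied (1) or untied (0) */
  unsigned final : 1;       /* task is final */
  unsigned transparent : 1; /* set by the compiler for the new clause */
  unsigned reserved : 29;   /* remaining bits, unused in this sketch */
} sketch_tasking_flags_t;

/* What compiler-generated code would conceptually do before the call
 * that allocates the task. */
static sketch_tasking_flags_t make_task_flags(int tied, int transparent) {
  sketch_tasking_flags_t flags = {0};
  flags.tiedness = tied ? 1u : 0u;
  flags.transparent = transparent ? 1u : 0u;
  return flags;
}

/* What the runtime would conceptually check when handling the task. */
static int task_is_transparent(sketch_tasking_flags_t flags) {
  return flags.transparent != 0;
}
```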
Post-commit fix of #164794 reported at
https://github.com/llvm/llvm-project/pull/164794#issuecomment-3536253493
`LLVM_LIBRARY_OUTPUT_INTDIR` and `LLVM_RUNTIME_OUTPUT_INTDIR` are used by
`AddLLVM.cmake` as output directories. Unless we are in a
bootstrapping build, they must not point to directories found by
`find_package(LLVM)`, which may be read-only directories. MLIR for
instance sets these variables to its own build output
directory, so should the runtimes.
Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so
just use that. Simplifies code, can be added back if we start providing
alternate forms but I don't think there's a single use-case that would
justify it yet.
On AIX, it generates libomp for both static and dynamic. There is no
need to create symbolic links to libomp.so.
---------
Co-authored-by: Xing Xue <xingxue@outlook.com>
When OMPT is enabled, the stack pointer was not saved to the frame pointer
register immediately after storing the frame pointer to the stack.
Therefore the frame pointers did not constitute a proper chain.
Fixes [#163352](https://github.com/llvm/llvm-project/issues/163352)
Implementation files using the Intel syntax explicitly specify it.
Do the same for the few files using AT&T syntax.
This also enables building LLVM with `-mllvm -x86-asm-syntax=intel` in one's Clang config files
(i.e. a global preference for Intel syntax).
No functional change intended.
Add equality op which checks 'Kind'
- For now this seems more reasonable than defaulting to true
Chose to keep toString and equality unit tests separate
The registration of this callback handler was disabled for some reason.
Local testing did not bring up any issues when I enabled it.
Side effect is: Silences current warning about unused function.
Add support for the standalone OpenMP tile construct:
```f90
!$omp tile sizes(...)
DO i = 1, 100
...
```
This is complementary to #143715 which added support for the tile
construct as part of another loop-associated construct such as
worksharing-loop, distribute, etc.
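To make the effect of tiling concrete, the following C sketch writes out
by hand the iteration order that a two-dimensional tiling (in C syntax,
`#pragma omp tile sizes(2, 2)`) is meant to produce. It is an
illustration of the transformation only: the function `tiled_visit` is
hypothetical, and the code the compiler actually generates is not
described in this commit.

```c
/* Visits an n x m iteration space in ti x tj tiles and records the
 * linearized index (i * m + j) of each visited point in `order`.
 * Returns the number of points visited. Hand-written equivalent of a
 * tiled loop nest; not compiler output. */
static int tiled_visit(int n, int m, int ti, int tj, int *order) {
  int count = 0;
  for (int i0 = 0; i0 < n; i0 += ti)       /* loop over tile rows */
    for (int j0 = 0; j0 < m; j0 += tj)     /* loop over tile columns */
      for (int i = i0; i < i0 + ti && i < n; ++i)   /* inside a tile */
        for (int j = j0; j < j0 + tj && j < m; ++j)
          order[count++] = i * m + j;
  return count;
}
```

The `&& i < n` and `&& j < m` guards handle partial tiles at the edges
when the sizes do not divide the trip counts evenly.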
Summary:
Forgot to port this option's old handling from offload. It's now way
easier since they're built in the same CMake project. Also delete the
leftover directory that's not used anymore, don't know how that was
still there.
This change implements the fuse directive, `#pragma omp fuse`, as specified in OpenMP 6.0, along with the `looprange` clause in clang.
This change also adds minimal stubs so flang keeps compiling (a full implementation in flang of this directive is still pending).
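As a rough picture of what fusion means for a sequence of two loops, the sketch below compares the loops run separately with a hand-fused version. It assumes the fused loop runs for the larger of the two trip counts and guards each body by its own bound; the helper names `run_separately` and `run_fused` are hypothetical, and this is not the code clang emits.

```c
/* The original loop sequence, i.e. what would sit under
 * `#pragma omp fuse` in user code. */
static void run_separately(int n1, int n2, int *a, int *b) {
  for (int i = 0; i < n1; ++i)
    a[i] = 2 * i;
  for (int j = 0; j < n2; ++j)
    b[j] = j + 10;
}

/* Hand-fused equivalent (sketch): one loop over max(n1, n2), with each
 * original body guarded by its own trip count. */
static void run_fused(int n1, int n2, int *a, int *b) {
  int n = n1 > n2 ? n1 : n2;
  for (int k = 0; k < n; ++k) {
    if (k < n1)
      a[k] = 2 * k;
    if (k < n2)
      b[k] = k + 10;
  }
}
```

Both versions leave `a` and `b` with the same contents; only the interleaving of the two bodies differs.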
---------
Co-authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
Enable the generation of no-loop kernels for Fortran OpenMP code. target
teams distribute parallel do pragmas can be promoted to no-loop kernels
if the user adds the -fopenmp-assume-teams-oversubscription and
-fopenmp-assume-threads-oversubscription flags.
If the OpenMP kernel contains reduction or num_teams clauses, it is not
promoted to no-loop mode.
The global OpenMP device RTL oversubscription flags no longer force
no-loop code generation for Fortran.
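The idea behind a no-loop kernel can be modelled on the host. The sketch
below is a conceptual simulation, assuming that the oversubscription
flags guarantee at least one thread per iteration; all names
(`kernel_with_loop`, `kernel_no_loop`, `launch_with_loop`,
`launch_no_loop`) are hypothetical and none of this is generated code.

```c
/* Conventional kernel: each thread strides over the iteration space,
 * so it works for any thread count. */
static void kernel_with_loop(int tid, int nthreads, int n, int *out) {
  for (int i = tid; i < n; i += nthreads)
    out[i] += 1;
}

/* No-loop kernel: valid only if nthreads >= n (the assumption the
 * oversubscription flags express). Each thread handles at most one
 * iteration, so the loop disappears. */
static void kernel_no_loop(int tid, int n, int *out) {
  if (tid < n)
    out[tid] += 1;
}

/* Host-side stand-ins for a kernel launch: run every "thread" in turn. */
static void launch_with_loop(int nthreads, int n, int *out) {
  for (int tid = 0; tid < nthreads; ++tid)
    kernel_with_loop(tid, nthreads, n, out);
}

static void launch_no_loop(int nthreads, int n, int *out) {
  for (int tid = 0; tid < nthreads; ++tid)
    kernel_no_loop(tid, n, out);
}
```

With fewer threads than iterations the no-loop form would silently skip
work, which is why it depends on the user asserting oversubscription.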
Only enable Fortran tests when either the test compiler is set
explicitly, or in a runtimes bootstrapping build. A system-installed
Flang either may not exist, or may be too old to compile our tests.
Fixes buildbot failure after landing #150722:
https://lab.llvm.org/buildbot/#/builders/10/builds/13905
In addition to existing C/C++ tests, add Fortran-based tests. Fortran
tests will only run if a Fortran compiler is found. The first test is
for the unroll construct added in #144785.
Summary:
I made the GPU flags accept more of the default LLVM warnings, which
triggered some new cases. Clean those up and fix some other ones while
I'm at it.
Summary:
The AMDGPU hack can be removed, and we no longer need to skip 90% of the
`HandleLLVMOptions` if we work around NVPTX earlier. Simplifies the
interface by removing duplicated logic and keeps the GPU targets from
being weirdly divergent on some flags.