1574 Commits

Author SHA1 Message Date
c8ef
b57b3f6425
[NFC] Simple typo correction. (#114548) 2024-11-02 00:40:57 +08:00
Ye Luo
eccdb24894
[OpenMP] Create versioned libgomp softlinks (#112973)
Add libgomp.1.dylib for MacOS and libgomp.so.1 for Linux

Linkers on Mac and Linux pick up versioned libgomp dynamic library
files. The existing softlinks (libgomp.dylib for MacOS and libgomp.so
for Linux) are insufficient. This helps alleviate the issue of mixing
libgomp and libomp at runtime.
2024-10-25 13:19:58 -05:00
Shilei Tian
5d07162bba
[OpenMP] Fix the test issue when libomp is built as a static library (#113522) 2024-10-24 12:52:17 -04:00
Luke Drummond
b55c52c047 Revert "Renormalize line endings whitespace only after dccebddb3b80"
This reverts commit 9d98acb196a40fee5229afeb08f95fd36d41c10a.
2024-10-18 21:16:50 +01:00
Josep Pinot
af1e9c81f4
[OpenMP] Fix missing gtid argument in __kmp_print_tdg_dot function (#111986)
This patch modifies the signature of the `__kmp_print_tdg_dot` function
in `kmp_tasking.cpp` to include the global thread ID (gtid) as an
argument. The gtid is now correctly passed to the function.

- Updated the function declaration to accept the gtid parameter.
- Modified all calls to `__kmp_print_tdg_dot` to pass the correct gtid
value.

This change addresses issues encountered when compiling with
`OMPX_TASKGRAPH` enabled. No functional changes are expected beyond
successful compilation.
2024-10-17 10:01:28 -04:00
Luke Drummond
9d98acb196 Renormalize line endings whitespace only after dccebddb3b80
Line ending policies were changed in the parent, dccebddb3b80. To make
it easier to resolve downstream merge conflicts after line-ending
policies are adjusted this is a separate whitespace-only commit. If you
have merge conflicts as a result, you can simply `git add --renormalize
-u && git merge --continue` or `git add --renormalize -u && git rebase
--continue` - depending on your workflow.
2024-10-17 14:49:26 +01:00
Nikita Popov
4722c6b87c
[openmp] Use core_siblings_list if physical_package_id not available (#111831)
On powerpc, physical_package_id may not be available. Currently, this
causes openmp to fall back to flat topology and various affinity tests
fail.

Fix this by parsing core_siblings_list to determine which cpus belong
to the same socket. This matches what the testing code does. The code to
parse the CPU list format thankfully already exists.

Fixes https://github.com/llvm/llvm-project/issues/111809.
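
As a rough illustration of the CPU list format mentioned above (for example `0-3,8-11`), the following sketch parses such a string into a bitmask and checks whether two CPUs appear in the same list. The names `parse_cpu_list` and `same_socket` are hypothetical, the 64-CPU limit is a simplification, and this is not the parser used by libomp.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a sysfs-style CPU list such as "0-3,8-11" into a 64-bit mask.
 * Returns 0 on success, -1 on malformed input or a CPU id >= 64.
 * Hypothetical helper for illustration only. */
static int parse_cpu_list(const char *s, uint64_t *mask) {
  assert(s != NULL && mask != NULL);
  *mask = 0;
  while (*s != '\0' && *s != '\n') {
    char *end;
    long lo = strtol(s, &end, 10);
    if (end == s || lo < 0)
      return -1;
    long hi = lo;
    if (*end == '-') { /* a range "lo-hi" */
      const char *p = end + 1;
      hi = strtol(p, &end, 10);
      if (end == p || hi < lo)
        return -1;
    }
    if (hi >= 64)
      return -1;
    for (long c = lo; c <= hi; ++c)
      *mask |= (uint64_t)1 << c;
    if (*end == ',')
      ++end; /* move on to the next entry */
    else if (*end != '\0' && *end != '\n')
      return -1;
    s = end;
  }
  return 0;
}

/* Two CPUs are treated as sharing a socket if both are in the same list. */
static int same_socket(uint64_t siblings_mask, int cpu_a, int cpu_b) {
  return ((siblings_mask >> cpu_a) & 1) && ((siblings_mask >> cpu_b) & 1);
}
```

Given the contents of one CPU's `core_siblings_list`, `parse_cpu_list` yields the set of CPUs on that socket, and `same_socket` can then group CPUs without needing `physical_package_id`.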
2024-10-14 09:23:41 +02:00
Xing Xue
c62e61acb4
[libomp][AIX] Use SO version "1" for AIX libomp (#111384)
For `libomp` on AIX, we build shared object `libomp.so` first and then
archive it into libomp.a. This patch changes to use SO version `1` and
name the shared object `libomp.so.1` so that it is consistent with the
naming of other shared objects in AIX libraries, e.g., `libc++.so.1` in
`libc++.a`. With this change, the change made in commit
bde51d9b0d473447ea12fb14924f14ea167eec85 to ensure only `libomp.a` is
published on AIX is no longer necessary and is removed.
2024-10-08 06:04:13 -04:00
Xing Xue
bde51d9b0d
[libomp][AIX] Ensure only libomp.a is published on AIX (#109016)
For `libomp` on AIX, we build shared object `libomp.so` first and then
archive it into `libomp.a`. Due to a CMake for AIX problem, the install
step also tries to publish `libomp.so`. While we use a script to build
`libomp.a` out-of-tree for Clang and avoided the problem, this chokes
the in-tree build for Flang. The issue will be reported to CMake but
before a fixed CMake is available, this patch ensures only `libomp.a` is
published.
2024-09-18 16:12:39 -04:00
Brad Smith
37e109c6f8
[OpenMP] Support setting POSIX thread name on *BSD's and Solaris (#106489) 2024-08-31 16:53:33 -04:00
Hansang Bae
9e0ee0e104
[OpenMP] Add support for pause with omp_pause_stop_tool (#97100)
This patch adds support for pause resource with a new enumerator
omp_pause_stop_tool. The expected behavior of this enumerator is
* omp_pause_resource: not allowed
* omp_pause_resource_all: equivalent to omp_pause_hard
2024-08-15 11:44:50 -05:00
Hansang Bae
5989709047
[OpenMP] Miscellaneous small code improvements (#95603)
Removes a few uninitialized variables, possible resource leaks, and
redundant code.
2024-08-15 10:42:22 -05:00
HighW4y2H3ll
0160d817c2
[OpenMP] Rename worker threads for improved debuggability (#102065)
Rename the worker threads "openmp_worker"

---------

Co-authored-by: h2h <h2h@meta.com>
Co-authored-by: Matthias Braun <matze@braunis.de>
2024-08-13 22:20:11 -07:00
Tulio Magno Quites Machado Filho
0aa22dcd2f
[OpenMP][AArch64] Fix branch protection in microtasks (#102317)
Start __kmp_invoke_microtask with PACBTI in order to identify the
function as a valid branch target. Before returning, SP is
authenticated.
Also add the BTI and PAC markers to z_Linux_asm.S.

With this patch, libomp.so can now be generated with DT_AARCH64_BTI_PLT
when built with -mbranch-protection=standard.

The implementation is based on the code available in compiler-rt.
2024-08-13 15:34:41 -03:00
Alexandre Ganea
20baa9a9ec [openmp][runtime] Silence warnings
This fixes several of those when building with MSVC on Windows:
```
[3625/7617] Building CXX object
projects\openmp\runtime\src\CMakeFiles\omp.dir\kmp_affinity.cpp.obj
C:\src\git\llvm-project\openmp\runtime\src\kmp_affinity.cpp(2637):
warning C4062: enumerator 'KMP_HW_UNKNOWN' in switch of enum 'kmp_hw_t'
is not handled
C:\src\git\llvm-project\openmp\runtime\src\kmp.h(628): note: see
declaration of 'kmp_hw_t'
```
2024-08-11 19:01:12 -04:00
arsnyder16
f7b2c2e49f
[openmp][WebAssembly] Allow openmp to compile and run under emscripten toolchain (#95169)
* Separate wasi and emscripten as they have different constraints and
abilities
* Emscripten mimics Linux/POSIX by statically linking the musl runtime.
This allows nearly all KMP_OS_LINUX code paths to work correctly. There
are only a few places that need to be adjusted related to dynamic
linking (dl_open)
* Internally link openmp globals
* With CommonLinkage it is needed to emit them in an assembly file, now
they are defined and used within each compilation unit
* With ExternalLinkage they suffer from duplicate symbols during linking
for unnamed globals like reduction/critical
   * Interestingly this aligns with the TODO comment above this code
2024-08-07 13:00:37 -05:00
Jonathan Peyton
916a91578f
[OpenMP] Assign thread ids in the cpuinfo topology method (#91013)
On non-hyperthreaded machines, the thread id is not always explicit in
the /proc/cpuinfo file. This patch adds a check to ensure the thread ids
are put in.
2024-07-29 09:52:02 -05:00
Jonathan Peyton
77ff969e5d
[OpenMP] Add topology and affinity changes for Meteor Lake (#91012)
These are Intel-specific changes for the CPUID leaf 31 method for
detecting machine topology.

* Cleanup known levels usage in x2apicid topology algorithm
Change to be a constant mask of all Intel topology type values.

* Take unknown ids into account when sorting them
If a hardware id is unknown, then put it further down the hardware thread
list so it will take last priority when assigning to threads.

* Have sub ids printed out for hardware thread dump

* Add caches to topology
New `kmp_cache_ids_t` class helps create cache ids which are then put
into the topology table after regular topology type ids have been put
in.

* Allow empty masks in place list creation
Have enumeration information and place list generation take into account
that certain hardware threads may be lacking certain layers

* Allow different procs to have different number of topology levels
Accommodates possible situation where CPUID.1F has different depth for
different hardware threads. Each hardware thread has a topology
description which is just a small set of its topology levels. These
descriptions are tracked to see if the topology is uniform or not.

* Replace regular ids with logical ids
Instead of keeping the original sub ids that the x2apicid topology
detection algorithm gives, change each id to its logical id which is a
number: [0, num_items - 1]. This makes inserting new layers into the
topology significantly simpler.

* Insert caches into topology
This change takes into account that most topologies are uniform and
therefore can use the quicker method of inserting caches as equivalent
layers into the topology.
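
Two of the ideas above, sorting unknown ids last and replacing raw ids with logical ids in `[0, num_items - 1]`, can be sketched in a few lines of C. This is only an illustration under assumptions: `UNKNOWN_ID`, `cmp_ids`, and `to_logical_ids` are hypothetical names, known ids are assumed to be non-negative, and the real runtime works on its own topology tables rather than a flat array.

```c
#include <assert.h>
#include <stdlib.h>

#define UNKNOWN_ID (-1) /* hypothetical marker for an unknown hardware id */

/* Order ids ascending, but place UNKNOWN_ID after every known id. */
static int cmp_ids(const void *a, const void *b) {
  int x = *(const int *)a, y = *(const int *)b;
  if (x == y)
    return 0;
  if (x == UNKNOWN_ID)
    return 1;
  if (y == UNKNOWN_ID)
    return -1;
  return x < y ? -1 : 1;
}

/* Rewrite each known id in ids[0..n) to its logical id in
 * [0, num_items - 1], where num_items is the number of distinct known ids.
 * Unknown ids are left untouched.
 * Returns num_items, or -1 on allocation failure. */
static int to_logical_ids(int *ids, size_t n) {
  assert(ids != NULL || n == 0);
  if (n == 0)
    return 0;
  int *sorted = malloc(n * sizeof *sorted);
  if (sorted == NULL)
    return -1;
  for (size_t i = 0; i < n; ++i)
    sorted[i] = ids[i];
  qsort(sorted, n, sizeof *sorted, cmp_ids);

  /* Compact the sorted copy down to the distinct known ids. */
  size_t num = 0;
  for (size_t i = 0; i < n && sorted[i] != UNKNOWN_ID; ++i)
    if (num == 0 || sorted[num - 1] != sorted[i])
      sorted[num++] = sorted[i];

  /* The logical id of a value is its rank among the distinct known ids. */
  for (size_t i = 0; i < n; ++i) {
    if (ids[i] == UNKNOWN_ID)
      continue;
    for (size_t r = 0; r < num; ++r) {
      if (sorted[r] == ids[i]) {
        ids[i] = (int)r;
        break;
      }
    }
  }
  free(sorted);
  return (int)num;
}
```

With dense logical ids, a new layer can be described by a plain index range instead of whatever sparse values the detection algorithm happened to produce.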
2024-07-29 09:51:42 -05:00
Jonathan Peyton
2e57e63666
[OpenMP][libomp] Fix tasking debug assert (#95823)
The debug assert is meant to check that the index is valid, which means
the runtime needs to check against the size of the array instead of the
number of threads. A free()-ed thread put back in the thread pool may
index into anywhere inside the task team's available array from 0 to
tt_max_threads potentially.

Fixes: #94260
2024-07-24 12:25:21 -05:00
Michael Kruse
5c93a94f5a
[Clang][OpenMP] Add interchange directive (#93022)
Add the interchange directive which will be introduced in the upcoming
OpenMP 6.0 specification. A preview has been published in [Technical
Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).
2024-07-19 09:24:40 +02:00
Michael Kruse
80865c01e1
[Clang][OpenMP] Add reverse directive (#92916)
Add the reverse directive which will be introduced in the upcoming
OpenMP 6.0 specification. A preview has been published in [Technical
Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).

---------

Co-authored-by: Alexey Bataev <a.bataev@outlook.com>
2024-07-18 10:35:32 +02:00
Hansang Bae
7a72856af8
[OpenMP] Use new OMPT state and sync kinds for barrier events (#95602)
This change makes the runtime use new OMPT state and sync kinds
introduced in OpenMP 5.1 in place of the deprecated implicit state and
sync kinds. Events from implicit barriers use different enumerators for
workshare, parallel, and teams.
2024-07-16 09:52:20 -05:00
Alexandre Ganea
be26e54542 [openmp] Silence warning when building the x64 Windows LLVM release package
This fixes:
```
MASM : warning A4018:invalid command-line option : -U_GLIBCXX_ASSERTIONS
```
2024-07-05 21:16:04 -04:00
Hansang Bae
d4f3d24e7f
[OpenMP] Add ompt_start_tool declaration in omp-tools.h (#97099)
The function ompt_start_tool is a globally-visible C function according
to the specification.
2024-07-03 12:59:34 -05:00
Joachim
a707d0883b
[OpenMP][OMPT] Indicate loop schedule for worksharing-loop events (#97429)
Use more specific values from `ompt_work_t` to allow the tool to identify
the schedule of a worksharing-loop. With this patch, the runtime will
report the schedule chosen by the runtime rather than necessarily the
schedule literally requested by the clause.
E.g., for guided + just one iteration per thread, the runtime would
choose and report static.

Fixes issue #63904
2024-07-03 09:33:19 +02:00
Gheorghe-Teodor Bercea
f0567702aa
[OpenMP] Add missing export for dynamic tracking patch (#97419)
Add missing export for OpenMP non-offloading builds.
2024-07-02 10:09:27 -04:00
dhruvachak
946f5d111d
[OpenMP] [OMPT] Callback registration should not depend on the device init callback. (#96371)
Even if the device init callback is not registered, a tool should be
allowed to register other callbacks.
2024-07-01 10:07:05 -07:00
Gheorghe-Teodor Bercea
1a478a69bc
[OpenMP][offload] Fix dynamic schedule tracking (#97065)
This patch fixes the dynamic schedule tracking.
2024-07-01 10:23:11 -04:00
Terry Wilmarth
ac9f06c2a8
[OpenMP] Fix test omp_parallel_num_threads_list.c to require fewer threads. (#96916)
Original test case used too many threads for some environments. This update
reduces to a max of 36 threads.
2024-06-28 00:17:11 +03:00
Terry Wilmarth
d30b082fd4
[OpenMP] Add num_threads clause list format and strict modifier support (#85466)
Add support to the runtime for 6.0 spec features that allow num_threads
clause to take a list, and also make use of the strict modifier.
Provides new compiler interface functions for these features.
2024-06-24 15:39:18 -04:00
Jonathan Peyton
88dae3d5d0
[OpenMP][libomp] Remove Perl in favor of Python (#95307)
* Removes all Perl scripts and modules
* Adds Python3 scripts which mimic the behavior of the Perl scripts
* Removes Perl from CMake; Adds Python3 requirement to CMake
* The check-instruction-set.pl script is Knights Corner specific. The
script is removed and not replicated with a corresponding Python3
script.

Relevant Discourse:

https://discourse.llvm.org/t/error-compiling-clang-with-offloading-support/79223/4

Fixes: https://github.com/llvm/llvm-project/issues/62289
2024-06-20 12:54:49 -05:00
Joachim
2464f1cef3
[OpenMP][OMPT] Add missing callbacks for asynchronous target tasks (#93472)
- The first hidden-helper-thread did not trigger thread-begin
- The "detaching" from a target-task when waiting for completion failed
to call task-switch
- Target tasks identified themselves as explicit tasks

Co-authored-by: Kaloyan Ignatov <kaloyan.ignatov@rwth-aachen.de>
2024-06-04 17:07:15 +02:00
Shilei Tian
b448efb8ea
Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)" (#94139) 2024-06-03 11:17:36 -04:00
Joseph Huber
df9701bfee
[OpenMP] Fix multiply installing libomp.so (#93685)
Summary:
The `add_llvm_library` interface handles installing the llvm libraries,
however we want to do our own handling. Otherwise, this will install
into the `./lib` location instead of the `./lib/<target>` one.
2024-05-29 08:57:16 -05:00
Shilei Tian
cf9eeb67e5 Revert "Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)""
This reverts commit 7b4865582299294455bc816358fd88a9c6e5e0be.
2024-05-26 01:04:39 -04:00
Shilei Tian
7b48655822 Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)"
This reverts commit 9b31cc71d66064dfaf2afabf4a835211321bb4a0.
2024-05-26 00:57:50 -04:00
Michael Kruse
8bdc577667
[openmp] Revise IDE folder structure (#89750)
Update the folder titles for targets in the monorepository that have not
been taken care of for some time. These are the folders that targets are
organized into in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.

 * Ensure that every target is in a folder
 * Use a folder hierarchy with each LLVM subproject as a top-level folder
 * Use consistent folder names between subprojects
 * When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property` calls, but these are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
2024-05-25 17:34:28 +02:00
Joseph Huber
9b31cc71d6 Revert "[OpenMP][OMPX] Add shfl_down_sync (#93311)"
This reverts commit 098c6dfa8157681699a71fce9e3d94515e66311f.
This reverts commit 8c718a3a91df4ab68dc3f1ca3887ea730c9aed84.
This reverts commit 4fb02de9d490d0773441aa30124bb4d1272230d3.
2024-05-24 19:07:53 -05:00
Shilei Tian
098c6dfa81 [NFC][OpenMP][OMPX] Remove ';' that is outside of a function 2024-05-24 14:21:54 -04:00
Shilei Tian
8c718a3a91 [OpenMP][OMPX] No default argument for C API 2024-05-24 14:15:50 -04:00
Shilei Tian
4fb02de9d4
[OpenMP][OMPX] Add shfl_down_sync (#93311) 2024-05-24 14:00:43 -04:00
Shilei Tian
7eeec8e6d1
[OpenMP][OMPX] Add ballot_sync (#91297)
This patch adds the support for `ballot_sync` in ompx.
2024-05-24 09:54:54 -04:00
Michael Kruse
9120562dfc
[Clang][OpenMP] Enable tile/unroll on iterator- and foreach-loops (#91459)
OpenMP loop transformation did not work on a for-loop using an iterator
or range-based for-loops. The first reason is that it combined the
iterator's type for generated loops with the type of `NumIterations` as
generated for any `OMPLoopBasedDirective` which is an integer. Fixed by
basing all generated loop variables on `NumIterations`.

Second, C++11 range-based for-loops include syntactic sugar that needs
to be executed before the loop. This additional code is now added to the
construct's Pre-Init lists.

Third, C++20 added an initializer statement to range-based for-loops
which is also added to the pre-init statement. PreInits used to be a
`DeclStmt` which made it difficult to add arbitrary statements from
`CXXRangeForStmt`'s syntactic sugar, especially the for-loop's init
statement which does not need to be a declaration. Change it to be a
general `Stmt` that can be a `CompoundStmt` to hold arbitrary Stmts,
including DeclStmts. This also avoids the `PointerUnion` workaround used
by `checkTransformableLoopNest`.

End-to-end tests are added to verify the expected number and order of
loop execution and evaluations of expressions (such as iterator
dereference). The order and number of evaluations of expressions in
canonical loops is explicitly undefined by OpenMP but checked here for
clarification and for changes to be noticed.
2024-05-22 14:30:31 +02:00
Joseph Huber
f60c699d37 [OpenMP] Fix intermediate header locations for OpenMP
Summary:
A previous patch moved the code here and accidentally overwrote the
include path that the LSP interface used. This caused incorrect errors
when using clangd with the offload project. This patch removes the
unnecessary header and makes sure we include the correct folder.
2024-05-15 20:45:19 -05:00
Michael Kruse
b0b6c16b47
[Clang][OpenMP][Tile] Allow non-constant tile sizes. (#91345)
Allow non-constants in the `sizes` clause such as
```
#pragma omp tile sizes(a)
for (int i = 0; i < n; ++i)
```
This is permitted since tile was introduced in [OpenMP
5.1](https://www.openmp.org/spec-html/5.1/openmpsu53.html#x78-860002.11.9).

It is possible to sneak in negative numbers at runtime as in
```
int a = -1;
#pragma omp tile sizes(a)
```
Even though it is not well-formed, it should still result in every loop
iteration being executed exactly once, an invariant of the tile
construct that we should ensure. `ParseOpenMPExprListClause` is
extracted-out to be reused by the `permutation` clause of the
`interchange` construct. Some care was put into ensuring correct behavior
in template contexts.
2024-05-13 16:10:58 +02:00
Xing Xue
561b6ab96e
[OpenMP][AIX] Implement __kmp_get_load_balance() for AIX (#91520)
AIX has the `/proc` filesystem where `/proc/<pid>/lwp/<tid>/lwpsinfo` has
the thread state in binary, similar to Linux's
`/proc/<pid>/task/<tid>/stat` where the state is in ASCII. However, the
definition of state info `R` in `lwpsinfo` is `runnable`. In Linux,
state `R` means the thread is `running`. Therefore, `lwpsinfo` is not
ideal for our purpose of getting the current load of the system. This
patch uses `perfstat_cpu()` in AIX system library `libperfstat.a` to
obtain the number of threads currently running on logical CPUs.
2024-05-10 09:23:02 -04:00
chandan singh
2a57657d55
[OpenMP] [Flang] Resolved Issue llvm#76121: Implemented Check for Unhandled Arguments in __kmpc_fork_call_if (#82221)
Root cause: Segmentation fault is caused by null pointer dereference
inside the __kmpc_fork_call_if function at
https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/z_Linux_asm.S#L1186
__kmpc_fork_call_if is missing a case to handle argc=0.

Fix: Added a check inside the __kmp_invoke_microtask function to handle
the case when argc is 0.

---------

Co-authored-by: Singh <chasingh@amd.com>
2024-05-09 11:11:04 +05:30
Jonathan Peyton
73bb8d9d92
[OpenMP] Fix child processes to use affinity_none (#91391)
When a child process is forked with OpenMP already initialized, the
child process resets its affinity mask and sets proc-bind-var to false
so that the entire original affinity mask is used. This patch corrects
an issue with the affinity initialization code setting affinity to
compact instead of none for this special case of forked children.

The test meant to catch this only tested an explicit setting of
KMP_AFFINITY=none. Add a test run with no KMP_AFFINITY setting.

Fixes: #91098
2024-05-08 09:23:50 -05:00
Jonathan Peyton
41ca9104ac
[OpenMP] Fix task state and taskteams for serial teams (#86859)
* Serial teams now use a stack (similar to dispatch buffers)
* Serial teams always use `t_task_team[0]` as the task team and the
second pointer is a next pointer for the stack

`t_task_team[1]` is interpreted as a stack of task teams where each
level is a nested level

```
 inner serial team                   outer serial team
[ t_task_team[0] ] -> (task_team)    [ t_task_team[0] ] -> (task_team)
[ next           ] ----------------> [ next           ] -> ...
```

* Remove the task state memo stack from thread structure.
* Instead of a thread-private stack, use team structure to store
th_task_state of the primary thread. When coming out of a parallel,
restore the primary thread's task state. The new field in the team
structure doesn't cause sizeof(team) to change and is in the cache line
which is only read/written by the primary thread.
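
The stack of nested serial levels described above can be pictured as a singly linked list in which each level holds its task team plus a link to the enclosing level. The sketch below uses hypothetical types (`task_team_t`, `level_t`) and helpers (`push_level`, `pop_level`); the runtime itself reuses the two `t_task_team` slots rather than a dedicated struct.

```c
#include <assert.h>
#include <stddef.h>

typedef struct task_team {
  int id; /* placeholder payload for the sketch */
} task_team_t;

/* One nesting level of a serial team. */
typedef struct level {
  task_team_t *task_team; /* plays the role of t_task_team[0] */
  struct level *next;     /* plays the role of t_task_team[1] */
} level_t;

/* Entering a nested serial region: the new level becomes the top. */
static void push_level(level_t **top, level_t *inner) {
  assert(top != NULL && inner != NULL);
  inner->next = *top;
  *top = inner;
}

/* Leaving a nested serial region: restore the enclosing level.
 * Returns the level that was removed, or NULL if the stack was empty. */
static level_t *pop_level(level_t **top) {
  assert(top != NULL);
  level_t *inner = *top;
  if (inner != NULL) {
    *top = inner->next;
    inner->next = NULL;
  }
  return inner;
}
```

Pushing an outer level and then an inner one leaves the inner task team on top; popping once makes the outer task team current again.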

Fixes: #50602
Fixes: #69368
Fixes: #69733
Fixes: #79416
2024-05-07 08:41:51 -05:00
Shilei Tian
02ce8227ac [NFC][OpenMP][OMPX] Move declare variant up 2024-05-06 23:46:18 -04:00