* Replace HBWMALLOC API with more general MEMKIND API, new functions
and variables added.
* Have libmemkind.so loaded when accessible.
* Redirect memspaces to default one except for high bandwidth which
is processed separately.
* Ignore some allocator traits e.g., sync_hint, access, pinned, while
others are processed normally e.g., alignment, pool_size, fallback,
fb_data, partition.
* Add tests for memory management
Patch by Andrey Churbanov
Differential Revision: https://reviews.llvm.org/D59783
llvm-svn: 357929
This patch cleans up the bookkeeping code for the load balancing dynamic mode.
When a thread is moved to or from the thread pool, the th_active_in_pool flag
and the __kmp_thread_pool_active_nth global counter are both updated. This
removes the need for the corrective code in the main wait loop. Another global
counter, __kmp_thread_pool_nth, was removed completely, as it was only used for
debugging, but was not under KMP_DEBUG.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D59508
llvm-svn: 357927
Debug dump on large machine shows when many OpenMP threads (401 in total)
sleep on a barrier, one of the innermost nesting levels sleeps
on a child's b_arrived flag whose value is equal to 4 and is equal to
checker value. i.e., (1) sleep bit is 0, and (2) done_check() would
return true if called.
It is unclear how this might happen. It could be Windows Server 2016's
error of EnterCriticalSection / LeaveCriticalSection, or
error of WaitForSingleObject / SetEvent / ResetEvent, or
error in the library which is very difficult to find.
As a workaround, change INFINITE wait to timed wait, so that each
thread awakens each 5 seconds (the timeout was chosen arbitrary to not
disturb other threads much), check flag condition under the lock, and
either go to sleep again or stop sleeping as a result of the check.
Patch by Andrey Churbanov
Differential Revision: https://reviews.llvm.org/D59793
llvm-svn: 357722
The distribute clause needs an explicit push of a timer. The teams
clause needs a timer added and also, similarly to parallel, exchanged
with the serial timer when encountered so that serial regions are
counted properly.
Differential Revision: https://reviews.llvm.org/D59801
llvm-svn: 357621
Summary:
[Split off from D59451 to get this fix in separately]
While building the 8.0 releases on FreeBSD, I encountered the following
warnings in openmp quite a few times:
```
In file included from projects/openmp/runtime/src/kmp_settings.cpp:27:
projects/openmp/runtime/src/kmp_wrapper_getpid.h:35:2: warning: #warning is a language extension [-Wpedantic]
#warning No gettid found, use getpid instead
^
projects/openmp/runtime/src/kmp_wrapper_getpid.h:35:2: warning: No gettid found, use getpid instead [-W#warnings]
2 warnings generated.
```
I added a gettid wrapper that uses FreeBSD's pthread_getthreadid_np(3)
function for this.
Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim
Reviewed By: jlpeyton
Subscribers: jfb, jdoerfert, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D59735
llvm-svn: 356934
The following GCC intrinsics are not available on MIPS32:
__sync_fetch_and_add_8
__sync_fetch_and_and_8
__sync_fetch_and_or_8
__sync_val_compare_and_swap_8
Replace these with appropriate libatomic implementation.
Patch by Miodrag Dinic.
Differential Revision: https://reviews.llvm.org/D45691
llvm-svn: 355687
This change makes the runtime decide the intended use of each barrier
invocation, for the OMPT synchronization tool callbacks. The OpenMP 5.0
specification defines four possible barrier kinds -- implicit, explicit,
implementation, and just normal barrier.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D58247
llvm-svn: 355140
Nest-var, OMP_NESTED, omp_set_nested()., and omp_get_nested() have been
deprecated in the 5.0 spec. Initial nesting info is now derived from
OMP_MAX_ACTIVE_LEVELS, OMP_NUM_THREADS, and OMP_PROC_BIND.
This patch deprecates the internal ICV that corresponds to nest-var, and
replaces it with the max-active-levels-var ICV to determine nesting. The
change still allows for use of OMP_NESTED (according to 5.0 changes),
omp_get_nested, and omp_set_nested, which have had deprecation messages
added to them. The change allows certain settings of OMP_NUM_THREADS,
OMP_PROC_BIND, and OMP_MAX_ACTIVE_LEVELS to turn on nesting, but
OMP_NESTED=0 will still force nesting to be off.
The runtime now prints informative messages about deprecation of
OMP_NESTED, omp_set_nested(), and omp_get_nested(), when those
environment variables or routines are used. It also prints deprecated
message in output for KMP_SETTINGS and OMP_DISPLAY_ENV for OMP_NESTED.
This patch also fixes OMP_DISPLAY_ENV output for OMP_TARGET_OFFLOAD.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D58408
llvm-svn: 355138
This patch cleans up the yielding code and makes it optional. An
environment variable, KMP_USE_YIELD, was added. Yielding is still
on by default (KMP_USE_YIELD=1), but can be turned off completely
(KMP_USE_YIELD=0), or turned on only when oversubscription is detected
(KMP_USE_YIELD=2). Note that oversubscription cannot always be detected
by the runtime (for example, when the runtime is initialized and the
process forks, oversubscription cannot be detected currently over
multiple instances of the runtime).
Because yielding can be controlled by user now, the library mode
settings (from KMP_LIBRARY) for throughput and turnaround have been
adjusted by altering blocktime, unless that was also explicitly set.
In the original code, there were a number of places where a double yield
might have been done under oversubscription. This version checks
oversubscription and if that's not going to yield, then it does
the spin check.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D58148
llvm-svn: 355120
This patch adds the new 5.0 API function omp_get_supported_active_levels().
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D58211
llvm-svn: 354368
Remove fatal error messages from the cancellation API for GOMP
Add __kmp_barrier_gomp_cancel() to implement cancellation of parallel regions.
This new function uses the linear barrier algorithm with a cancellable
nonsleepable wait loop.
Differential Revision: https://reviews.llvm.org/D57969
llvm-svn: 354367
The thread-limit-var and omp_get_thread_limit API was not perfectly handled for
teams construct. Now, when modified by thread_limit clause, omp_get_thread_limit
reports the correct value. In addition, the value is restored when leaving the
teams construct to what it was in the encountering context.
This is done partly by creating the notion of a Contention Group root (CG root)
that keeps track of the thread at the root of each separate CG, the
thread-limit-var associated with the CG, and associated counter of active
threads within the contention group.
thread-limits are passed from master to worker threads via an entry in the ICV
data structure. When a "contention group switch" occurs, a new CG root record is
made and passed from master to worker. A thread could potentially have several
CG root records if it encounters multiple nested teams constructs (but at the
moment the spec doesn't allow for nested teams, so the most one could have
currently is 2). The master of the teams masters gets the thread-limit clause
value stored to its local ICV structure, and the other teams masters copy it
from the master. The thread-limit is set from that ICV copy and restored to the
ICV copy when entering and leaving the teams construct.
This change also fixes a bug when the top-level teams construct team gets
reused, and OMP_DYNAMIC was true, which can cause the expected size of this team
to be smaller than what was actually allocated. The fix updates the size of the
team after its threads were reserved.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D56804
llvm-svn: 353747
Summary:
As @david2050 commented, changes introduced by https://reviews.llvm.org/D56397 break builds for older compilers
which don't support `__has(_cpp)_attribute`. This is a fix for the break.
Reviewers: protze.joachim, jlpeyton, AndreyChurbanov, Hahnfeld, david2050
Subscribers: openmp-commits, david2050
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D57851
llvm-svn: 353538
The three switch fallthrough generate a warning with -Wimplicit-fallthrough.
Two are documented as fallthrough, one is not, but I think the intention is to also fallthrough in kmp_tasking.cpp.
Not sure whether kmp.h is the best place to define the macro.
Reviewers: jlpeyton, AndreyChurbanov, Hahnfeld
Reviewed By: jlpeyton
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D56397
llvm-svn: 353052
Redo after revert by hans. The wrong include in one test is fixed.
Make sure that OMPT is enabled in runtime entry points that access internals
of the runtime. Else, return an appropiate value indicating an error or that
the data is not available.
Patch provided by @sconvent
Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze
Reviewed By: joachim.protze
Tags: #openmp, #ompt
Differential Revision: https://reviews.llvm.org/D47717
llvm-svn: 352611
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351648
and also the follow-up r351315.
The new test is failing on the buildbots.
> Make sure that OMPT is enabled in runtime entry points that access internals
> of the runtime. Else, return an appropiate value indicating an error or that
> the data is not available.
>
> Patch provided by @sconvent
>
> Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze
>
> Reviewed By: joachim.protze
>
> Tags: #openmp, #ompt
>
> Differential Revision: https://reviews.llvm.org/D47717
llvm-svn: 351431
Add omp_pause_resource and omp_pause_resource_all API and enum, plus stub for
internal implementation. Implemented callable helper function to do local pause,
and added basic functionality for hard and soft pause.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D55078
llvm-svn: 351372
The compiler warns about an unused variable/statement:
runtime/src/kmp_affinity.cpp:4958:18: warning: statement has no effect [-Wunused-value]
KA_TRACE(1000, ; {
^
runtime/src/kmp_debug.h:84:24: note: in definition of macro 'KA_TRACE'
__kmp_debug_printf x; \
^
Instead of the unused reference to this function, this patch now calls the function
with an empty string. The call to this function should have no effect.
Patch provided by joachim.protze
Reviewers: jlpeyton, hbae, AndreyChurbanov
Reviewed By: AndreyChurbanov
Tags: #openmp, #ompt
Differential Revision: https://reviews.llvm.org/D56775
llvm-svn: 351323
Make sure that OMPT is enabled in runtime entry points that access internals
of the runtime. Else, return an appropiate value indicating an error or that
the data is not available.
Patch provided by @sconvent
Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze
Reviewed By: joachim.protze
Tags: #openmp, #ompt
Differential Revision: https://reviews.llvm.org/D47717
llvm-svn: 351311
Using proc_bind clause on a nested #pragma omp parallel region
with KMP_AFFINITY set causes an assertion error. This assertion occurs because
the place-partition-var is not properly initialized in the nested master threads.
Trying to get an intuitive result with KMP_AFFINITY + proc_bind is difficult
because of how the KMP_AFFINITY gtid-to-place mapping occurs. This
patch creates an initial place list no matter what affinity mechanism is used.
For KMP_AFFINITY, the place-partition-var is initialized to all the places.
Differential Revision: https://reviews.llvm.org/D55795
llvm-svn: 351227
This change fixes the sanity issue reported in Bug 40042.
Lock function definitions for the three lock kinds were added
to disambiguate calls to the lock functions done directly and indirectly.
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40042
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D56103
llvm-svn: 351224
Make __ompt_implicit_task_end a static function and remove the inline part. Remove
pId variable that is unused. This fixes small regression in SPEC kdtree benchmark.
Also reformat some of __ompt_implicit_task_end.
Differential Revision: https://reviews.llvm.org/D55788
llvm-svn: 351221
The omp-tools.h file is generated from the OpenMP spec to ensure that the interface
is implemented as specified.
The other changes are necessary to update the interface implementation to the
final version as published in 5.0.
The omp-tools.h header was previously called ompt.h, currently a copy under this name
is installed for legacy tools.
Patch partially perpared by @sconvent
Reviewers: AndreyChurbanov, hbae, Hahnfeld
Reviewed By: hbae
Tags: #openmp, #ompt
Differential Revision: https://reviews.llvm.org/D55579
llvm-svn: 351197
Summary:
Two things:
1. Those two variables had the wrong sigdness, which was resulting in "sign mismatch in comparison" warning.
2. The whole `kmp_debugger.cpp` wasn't being built, or rather, it was being built as-if `USE_DEBUGGER` was off,
thus, nothing provided the definition of `__kmp_omp_debug_struct_info`, `__kmp_debugging`.
Makes sense, because `USE_DEBUGGER` is set in `kmp_config.h`, which is not included explicitly.
It is included by `kmp.h`, but that one is only included inside of the `#if USE_DEBUGGER` block..
I *think* this is the only source file with this issue,
everything else seem to `#include` either `kmp.h` or `kmp_config.h`.
The alternative solution would be to add `add_compile_options(-include kmp_config.h)` in CMake.
I did verify that `__kmp_omp_debug_struct_info` becomes available with this patch.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=38612 | PR38612 ]].
Reviewers: AndreyChurbanov, jlpeyton, Hahnfeld
Reviewed By: jlpeyton
Subscribers: guansong, jfb, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D55783
llvm-svn: 351019
Add omp_get_device_num() function for 5.0 which returns the number of the
device the current thread is running on. Currently, we are leaving it to the
compiler to handle this properly if it is called inside target.
Also, did some cleanup and updating of duplicate device API functions (in both
libomp and libomptarget) to make them into weak functions that check for the
symbol from libomptarget, and will call the version in libomptarget if it is
present. If any additional device API functions are implemented also in
libomptarget in the future, we should add the dlsym calls to the host functions.
Also, if the omp_target_* functions are to be implemented for the host (this has
been requested), they should attempt to call the libomptarget versions as well.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D55578
llvm-svn: 350352
This patch updates the implementation of the ompt_frame_t, ompt_wait_id_t
and ompt_state_t. The final version of the OpenMP 5.0 spec added the "t"
for these types.
Furthermore the structure for ompt_frame_t changed and allows to specify
that the reenter frame belongs to the runtime.
Patch partially prepared by Simon Convent
Reviewers: hbae
llvm-svn: 349458
Summary:
I have discovered this because i wanted to experiment with
building static libomp (with openmp-4.0 support only)
for debugging purposes.
There are three kinds of problems here:
1. `__kmp_compare_and_store_acq()` simply does not exist.
It was added in D47903 by @jlpeyton.
I'm guessing `__kmp_atomic_compare_store_acq()` was meant.
2. In `__kmp_is_ticket_lock_initialized()`,
`lck->lk.initialized` is `std::atomic<bool>`,
while `lck` is `kmp_ticket_lock_t *`.
Naturally, they can't be equality-compared.
Either, it should return the value read from `lck->lk.initialized`,
or do what `__kmp_is_queuing_lock_initialized()` does,
compare the passed pointer with the field in the struct
pointed by the pointer. I think the latter is correct-er choice here.
3. Tests were not versioned.
They assume that `LIBOMP_OMP_VERSION` is at the latest version.
This does not touch LIBOMP_OMP_VERSION=30. That is still broken.
Reviewers: jlpeyton, Hahnfeld, AndreyChurbanov
Reviewed By: AndreyChurbanov
Subscribers: guansong, jfb, openmp-commits, jlpeyton
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D55496
llvm-svn: 349260
The value returned by __kmp_now_nsec() can overflow 32-bit values causing
incorrect values to be returned. The overflow can end up causing a divide
by zero error because in __kmp_initialize_system_tick(), the value
(__kmp_now_nsec() - nsec) can end up being much larger than the numerator:
1e6 * (delay + (now - goal))
during a pathological timing where the current time calculated is much larger
than nsec. When this happens, the value of __kmp_ticks_per_msec is set to zero
which is then used as the denominator in the KMP_NOW_MSEC() macro leading to
the divide by zero error.
Differential Revision: https://reviews.llvm.org/D55300
llvm-svn: 349090
This patch adds the affinity format functionality introduced in OpenMP 5.0.
This patch adds: Two new environment variables:
OMP_DISPLAY_AFFINITY=TRUE|FALSE
OMP_AFFINITY_FORMAT=<string>
and Four new API:
1) omp_set_affinity_format()
2) omp_get_affinity_format()
3) omp_display_affinity()
4) omp_capture_affinity()
The affinity format functionality has two ICV's associated with it:
affinity-display-var (bool) and affinity-format-var (string).
The affinity-display-var enables/disables the functionality through the
envirable OMP_DISPLAY_AFFINITY. The affinity-format-var is a formatted
string with the special field types beginning with a '%' character
similar to printf
For example, the affinity-format-var could be:
"OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}"
The affinity-format-var is displayed by every thread implicitly at the beginning
of a parallel region when any thread's affinity has changed (including a brand
new thread being spawned), or explicitly using the omp_display_affinity() API.
The omp_capture_affinity() function can capture the affinity-format-var in a
char buffer. And omp_set|get_affinity_format() allow the user to set|get the
affinity-format-var explicitly at runtime. omp_capture_affinity() and
omp_get_affinity_format() both return the number of characters needed to hold
the entire string it tried to make (not including NULL character). If not
enough buffer space is available,
both these functions truncate their output.
Differential Revision: https://reviews.llvm.org/D55148
llvm-svn: 349089
Disable KMP_HAVE_QUAD when building via gcc on NetBSD system,
as the build fails due to unimplemented builtins:
.../kmp_atomic.cpp.o: In function `__kmpc_atomic_cmplx16_mul':
.../kmp_atomic.cpp:1332: undefined reference to `__multc3'
.../kmp_atomic.cpp.o: In function `__kmpc_atomic_cmplx16_div':
.../kmp_atomic.cpp:1334: undefined reference to `__divtc3'
...
Differential Revision: https://reviews.llvm.org/D55478
llvm-svn: 348886
Switch NetBSD from reading /proc (which is broken) to getloadavg()
(which is already used by Darwin). NetBSD discourages using procfs
in favor of system API calls.
Differential Revision: https://reviews.llvm.org/D55486
llvm-svn: 348885
Summary:
Use the sysctl(3) function to check whether an address is mapped
into the address space.
Reviewers: mgorny, joerg, #openmp
Reviewed By: mgorny
Subscribers: openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D55549
llvm-svn: 348874
Summary: _lwp_self() returns current Thread Id in a numeric version on NetBSD.
Reviewers: joerg, mgorny, #openmp
Reviewed By: mgorny
Subscribers: llvm-commits, openmp-commits, #openmp
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D55497
llvm-svn: 348873
Fix two build issues:
1) Recent commit 348756 accidentally included Unix clang compilers
to use immintrin.h when only clang-cl should be using it leading
to the following error:
openmp-llvm/runtime/src/kmp_lock.cpp:2035:25: error: always_
inline function '_xbegin' requires target feature 'rtm', but would be inlined into function
'__kmp_test_adaptive_lock_only' that is compiled without support for 'rtm'
kmp_uint32 status = _xbegin();
This patch changes the guard to use immintrin.h to only use clang-cl instead of all clang
2) gcc-8 gives a warning about multiline comment in kmp_runtime.cpp:
This patch just changes it to a two line comment
openmp-llvm/runtime/src/kmp_runtime.cpp:7697:8: warning: multi-line comment [-Wcomment]
#endif // KMP_OS_LINUX || KMP_OS_DRAGONFLY || KMP_OS_FREEBSD || KMP_OS_NETBSD \
llvm-svn: 348783
Summary: This patch permits OpenMP to build and work (with both gcc and clang) on OpenBSD. It mostly follows what was done for FreeBSD and NetBSD, except OpenBSD does not have pthread_getattr_np support, so it follows OS X in that one instance.
Reviewers: #openmp, krytarowski
Reviewed By: krytarowski
Subscribers: guansong, jfb, emaste, mgorny, krytarowski, #openmp
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D34280
llvm-svn: 348726
Summary:
Additions mostly follow FreeBSD and NetBSD and are not intrusive.
There is similar patch for OpenBSD: https://reviews.llvm.org/D34280
The -lm was being omitted due to -Wl,--as-needed in cmake rule, similar patch is in freebsd-ports/devel/llvm-devel port.
Simple OpenMP programs compile and work as expected:
$ clang-devel ~/omp_hello.c -fopenmp -I/usr/local/llvm-devel/include
$ LD_LIBRARY_PATH=/usr/local/llvm-devel/lib OMP_NUM_THREADS=100 ./a.out
The assertion in LLVMgold.so when -fopenmp was used together with -flto in 20170524 snapshot is no longer triggered on current svn-trunk and works fine as in llvm-4.0 with our local patches.
Reviewers: #openmp, krytarowski
Reviewed By: krytarowski
Subscribers: dexonsmith, jfb, krytarowski, guansong, gregrodgers, emaste, mgorny, mehdi_amini
Differential Revision: https://reviews.llvm.org/D35129
llvm-svn: 348725
There is a conflict between libomptarget and libomp concerning some of the
standard OpenMP device API which needs further intestigation.
llvm-svn: 347932
This patch adds __kmpc_omp_reg_task_with_affinity to register affinity
information for tasks. For now, the affinity information is not used,
and the function always succeeds. This also adds the kmp_task_affinity_info_t
structure to store the task affinity information.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D55026
llvm-svn: 347907