565 Commits

Author SHA1 Message Date
Joachim Protze
1dc2afdcaf [OMPT] Return appropiate values for ompt runtime entry points for non-OpenMP threads
When the current thread is not an (initialized) OpenMP thread, the runtime
entry points return values that correspond to "not available" or similar

Differential Revision: https://reviews.llvm.org/D41167

llvm-svn: 322620
2018-01-17 10:05:55 +00:00
Andrey Churbanov
5388acd3de Fixed libomp static build broken by the commit rL322202.
Patch by simone <simone@cs.utah.edu>.

Differential Revision: https://reviews.llvm.org/D41945

llvm-svn: 322282
2018-01-11 15:09:49 +00:00
Jonathan Peyton
79390ad709 Force HWLOC topology method for NUMA-specific topology
If user requested affinity with granularity=tile we need to either use HWLOC
or ignore the request. The change allows user to not specify
KMP_TOPOLOGY_METHOD=hwloc and choose it automatically instead.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D40905

llvm-svn: 322205
2018-01-10 18:31:49 +00:00
Jonathan Peyton
1800ecec70 Simplify __kmp_expand_threads
This change simplifies __kmp_expand_threads to take a single argument.
Previously, it allowed two arguments and had logic to decide on different
potential expansion sizes. However, no calls to __kmp_expand_threads in the
runtime make use of this extra logic. Thus the extra argument and logic is
removed here.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D41836

llvm-svn: 322204
2018-01-10 18:27:01 +00:00
Jonathan Peyton
bff8ded906 Minor code cleanup
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D41831

llvm-svn: 322203
2018-01-10 18:24:09 +00:00
Jonathan Peyton
eaa9e40c9a Improve stability of the runtime in parent/child processes
This change improves stability of the runtime when the application forks child
processes.  Acquiring/releasing __kmp_initz_lock and __kmp_forkjoin_lock in the
atfork handlers insures that the actual fork does not occur while those two
locks are held, and __kmp_itt_reset() reverts the itt's global state to the
initial state which also initializes the mutex stored in the global state.
Some missing initialization code was also inserted in the child's atfork handler.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D41462

llvm-svn: 322202
2018-01-10 18:21:48 +00:00
Joachim Protze
14b512e20c [OMPT] Fix ompt_task_data handling in implicit barriers
Changes to task_data in barrier-begin were not visible at barrier-end

Differential Revision: https://reviews.llvm.org/D41176

llvm-svn: 322178
2018-01-10 12:51:27 +00:00
Paul Osmialowski
6db41e608f Fix type mismatch in omp_control_tool() implementation that makes it run incorrectly on 32-bit machines.
Differential Revision: https://reviews.llvm.org/D41854

llvm-svn: 322068
2018-01-09 10:54:06 +00:00
Jonas Hahnfeld
3ffca790f6 Correct types of pointers to doacross_num_done
This field is defined as kmp_int32, so we should use neither
pointers to kmp_int64 nor 64 bit atomic instructions.
(Found while testing on a Raspberry Pi, 32 bit ARM)

Differential Revision: https://reviews.llvm.org/D41656

llvm-svn: 321964
2018-01-07 16:54:36 +00:00
Jonathan Peyton
97f4320086 Fix some comments and formatting in kmp_dispatch.cpp
llvm-svn: 321831
2018-01-04 23:05:26 +00:00
Jonathan Peyton
8c432f2d5e Fix trademarks found by scanner
llvm-svn: 321827
2018-01-04 22:56:47 +00:00
Joachim Protze
265fb584a5 [OMPT] Set and reset frame address when creating a task with dependences
As for normal task creation, the task frame addresses need to be stored
for the encountering task.

Differential Revision: https://reviews.llvm.org/D41165

llvm-svn: 321421
2017-12-24 07:30:23 +00:00
Joachim Protze
25aa3ec1c5 Remove unused positional argument for printf
The format string for hints only prints the second argument (string) and drops
the first argument (hint id). Depending on how you read the POSIX text for
printf, this could be valid. But for practical reason, i.e., unpacking the
va_list passed to printf based on the formating information, it makes sense
to fix the implementation and not pass the id for hint.

Failing testcases were:

misc_bugs/teams-reduction.c
ompt/parallel/not_enough_threads.c

Differential Revision: https://reviews.llvm.org/D41504

llvm-svn: 321361
2017-12-22 16:40:26 +00:00
Joachim Protze
f375f4b49a [OMPT] Add missing ompt_get_num_procs function
This function is defined in OpenMP-TR6 section 4.1.5.1.6
The functions was not implemented yet.

Since ompt-functions can only be called after the runtime was initialized and
has loaded a tool, it can assume the runtime to be initialized. In contrast
to omp_get_num_procs which needs to check whether the runtime is initialized.

Differential Revision: https://reviews.llvm.org/D40949

llvm-svn: 321269
2017-12-21 14:36:30 +00:00
Joachim Protze
f8d22f9db8 [OMPT] Fix return address handling in a few GOMP interface methods
This revision fixes failing testcases with parallel for loops and the gomp
interface. The return address needs to be stored at entry to runtime.
The storage is cleared on usage, so we need to update the storage before
calling again internal functions, that will trigger event callbacks.

Differential Revision: https://reviews.llvm.org/D41181

llvm-svn: 321265
2017-12-21 13:55:39 +00:00
Joachim Protze
4fe83593eb [OMPT] Handle null pointer in set_callback to improve performance
We use the bitmap ompt_enabled thoughout the runtime, to avoid loading the
vector of callback functions when testing if specific code should be executed.
Before invoking an event callback function, the pointer is tested for NULL.

This revision resets the corresponding bit in ompt_enabled to 0 if
NULL is passed in set_callback.

Differential Revision: https://reviews.llvm.org/D41171

llvm-svn: 321264
2017-12-21 13:55:34 +00:00
Paul Osmialowski
7634f7093a [AArch64] fix an issue with older /proc/cpuinfo layout
There are two /proc/cpuinfo layots in use for AArch64: old and new.
The old one has all 'processor : n' lines in one section, hence
checking for duplications does not make sense.

Differential Revision: https://reviews.llvm.org/D41000

llvm-svn: 320593
2017-12-13 16:12:24 +00:00
Jonas Hahnfeld
e628ab4c65 Use hyperbarrier by default on all architectures
All architectures except x86_64 used the linear barrier implementation
by default which doesn't give good performance for a larger number
of threads.

Improvements for PARALLEL overhead (EPCC) with this patch on a Power8
system (2 sockets x 10 cores x 8 threads, OMP_PLACES=cores)

 20 threads:  4.55us -> 3.49us
 40 threads:  8.84us -> 4.06us
 80 threads: 19.18us -> 4.74us
160 threads: 54.22us -> 6.73us

Differential Revision: https://reviews.llvm.org/D40358

llvm-svn: 320152
2017-12-08 15:07:07 +00:00
Jonas Hahnfeld
ce528acf0d Fix thread affinity on non-x86 Linux
To make thread affinity work according to the OpenMP spec, the
runtime needs information about the hardware topology. On Linux
the default way is to parse /proc/cpuinfo which contains this
information for x86 machines but (at least) not for AArch64 and
Power architectures.

Fortunately, there is a different code path which is able to get
that data from sysfs. The needed patch has landed in 2006 for
Linux 2.6.16 which is safe to assume nowadays (even RHEL 5 had
a kernel version derived from 2.6.18, and we are now at RHEL 7!).

Differential Revision: https://reviews.llvm.org/D40357

llvm-svn: 320151
2017-12-08 15:07:05 +00:00
Jonas Hahnfeld
86c307821c Add missing memory barrier for queuing locks
Otherwise I see hangs in the omp_single_copyprivate test when
compiling in release mode. With the debug assertions, I get a
failure `head > 0 && tail > 0`.

Differential Revision: https://reviews.llvm.org/D40722

llvm-svn: 320150
2017-12-08 15:07:02 +00:00
Jonathan Peyton
ebbcb43976 [OpenMP] Add entry for Intel Compiler 18
Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D40386

llvm-svn: 319961
2017-12-06 21:15:28 +00:00
Jonathan Peyton
125203e003 Eliminate double printing of verbose affinity settings
Redundant extra verbose output of binding to full mask in case
affinity=balanced or OMP_PLACES=<any> or OMP_PROC_BIND=<any>

Differential Revision: https://reviews.llvm.org/D40624

llvm-svn: 319960
2017-12-06 21:07:41 +00:00
Jonathan Peyton
ec5b87188d Trivial enum fix
This change is a trivial fix for enums that removes specification of "last" or
"upper" values, or other boundary values. This simplifies the code in places,
and results in never needing to update the "upper" values again.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D40804

llvm-svn: 319957
2017-12-06 21:02:15 +00:00
Jonas Hahnfeld
a4ca525c1b Fix PR30890: Reduction across teams hangs
__kmpc_reduce_nowait() correctly swapped the teams for reductions
in a teams construct. Apply the same logic to __kmpc_reduce() and
__kmpc_reduce_end().

Differential Revision: https://reviews.llvm.org/D40753

llvm-svn: 319788
2017-12-05 16:51:24 +00:00
Andrey Churbanov
a5868215b4 Extension of HWLOC topology discovery with NUMA nodes and tiles
Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40309

llvm-svn: 319422
2017-11-30 11:51:47 +00:00
Jonathan Peyton
ba55a7b958 Make kmp_r_sched_t into a union
This change makes kmp_r_sched_t type into a union for simpler
comparisons and assignments

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D40374

llvm-svn: 319379
2017-11-29 22:47:52 +00:00
Jonathan Peyton
62da55020b Fix aligned memory allocation in the stub library
kmp_aligned_malloc() always returned NULL on Windows (stub library only)
that may cause Fortran application crash.  With this change all memory
allocation functions were fixed to use aligned{m,re,rec}alloc() to
allocate/reallocate memory. To deallocate that memory _aligned_free() is
used in kmp_free().

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40296

llvm-svn: 319375
2017-11-29 22:29:38 +00:00
Jonathan Peyton
64249504b5 Warning is emitted when tiles are requested but cannot be used
Added two warnings:
1) Before building the topology map check if tiles are requested but the
   topo method is not hwloc;
2) After building the topology map check if tiles are requested but not
   detected by the library.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40340

llvm-svn: 319374
2017-11-29 22:27:18 +00:00
Jonathan Peyton
92ce4bcfd8 Fix types of Fortran array elements
Fortran array elements made default integer in OMP_GET_PLACE_PROC_IDS and
OMP_GET_PARTITION_PLACE_NUMS subroutines, otherwise call to them produces
incorrect result.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40356

llvm-svn: 319372
2017-11-29 22:23:44 +00:00
Jonas Hahnfeld
5af381acad [CMake] Refactor common settings and flags
These are needed by both libraries, so we can do that in a
common namespace and unify configuration parameters.
Also make sure that the user isn't requesting libomptarget
if the library cannot be built on the system. Issue an error
in that case.

Differential Revision: https://reviews.llvm.org/D40081

llvm-svn: 319342
2017-11-29 19:31:48 +00:00
Jonas Hahnfeld
3e921d3c52 [CMake] Disallow direct configuration
As a first step, this allows us to generalize the detection of
standalone builds and make it fully compatible when building in
llvm/runtimes/ which automatically sets OPENMP_STANDLONE_BUILD.

Differential Revision: https://reviews.llvm.org/D40080

llvm-svn: 319341
2017-11-29 19:31:43 +00:00
Jonas Hahnfeld
221e7bb1fc Fix for OMP doacross implementation on Power
Power has a weak consistency model so we need memory barriers to
make writes (both from runtime and from user code) available for
all threads.

Differential Revision: https://reviews.llvm.org/D40175

llvm-svn: 318848
2017-11-22 17:15:20 +00:00
Andrey Churbanov
58acafc424 Fixed OMP doacross implementation on 32-bit platforms.
Differential Revision: https://reviews.llvm.org/D40171

llvm-svn: 318658
2017-11-20 16:00:42 +00:00
Andrey Churbanov
a756cb240a Exclude untied tasks from checking of task scheduling constraint (TSC).
This can improve performance of tests with untied tasks.

Differential Revision: https://reviews.llvm.org/D39613

llvm-svn: 318388
2017-11-16 10:45:07 +00:00
Jonas Hahnfeld
d0ef19ef9b [OMPT] Provide initialization for Mac OS X
Traditionally, the library had a weak symbol for ompt_start_tool()
that served as fallback and disabled OMPT if called. Tools could
provide their own version and replace the default implementation
to register callbacks and lookup functions. This mechanism has
worked reasonably well on Linux systems where this interface was
initially developed.

On Darwin / Mac OS X the situation is a bit more complicated and
the weak symbol doesn't work out-of-the-box. In my tests, the
library with the tool needed to link against the OpenMP runtime
to make the process work. This would effectively mean that a tool
needed to choose a runtime library whereas one design goal of the
interface was to allow tools that are agnostic of the runtime.

The solution is to use dlsym() with the argument RTLD_DEFAULT so
that static implementations of ompt_start_tool() are found in the
main executable. This works because the linker on Mac OS X includes
all symbols of an executable in the global symbol table by default.
To use the same code path on Linux, the application would need to
be built with -Wl,--export-dynamic. To avoid this restriction, we
continue to use weak symbols on Linux systems as before.

Finally this patch extends the existing test to cover all possible
ways of initializing the tool as described by the standard. It
also fixes ompt_finalize() to not call omp_get_thread_num() when
the library is shut down which resulted in hangs on Darwin.
The changes have been tested on Linux to make sure that it passes
the current tests as well as the newly extended one.

Differential Revision: https://reviews.llvm.org/D39801

llvm-svn: 317980
2017-11-11 13:59:48 +00:00
Joachim Protze
91732475a6 [OMPT] Fix assertion for OpenMP code generated with outdated compilers
For up-to-date compilers, this assertion is reasonable, but it breaks
compatibility with the typical compiler installed on most systems.
This patch changes the default value to what we had when there was no
compiler support. A warning about the outdated compiler is printed during
runtime, when this point is reached.

Differential Revision: https://reviews.llvm.org/D39890

llvm-svn: 317928
2017-11-10 21:07:01 +00:00
Jonas Hahnfeld
e9b7c0a392 Add const to some variables to avoid const_casts
In these places the const attribute seems correct and doesn't
need any other change, so let's do it.

Differential Revision: https://reviews.llvm.org/D39756

llvm-svn: 317798
2017-11-09 15:52:29 +00:00
Jonas Hahnfeld
aeb40adabf Remove const from variables with dynamic memory
Allocated memory is typically not 'const' if it needs to be freed.
This patch removes around 50 wrong const attributes, modifies the
corresponding functions and finally gets rid of some const_casts.
These have especially been strange for __kmp_str_fname_free() that
added a 'const' to call __kmp_str_free() which removed it again.

Two minor cleanups that I performed in this process:
 * __kmp_tool_libraries now lives in kmp_settings.cpp as it is
   used nowhere else.
 * __kmp_msg_empty was removed as it was never used and Clang
   now complained that it was assigned a string literal that
   is 'const char *'.

Differential Revision: https://reviews.llvm.org/D39755

llvm-svn: 317797
2017-11-09 15:52:25 +00:00
Jonathan Peyton
40039ac98c Cleanup version symbol macros and attributes/declspecs
1) Get rid of xaliasify, xexpand and xversionify for KMP_EXPAND_NAME and
KMP_VERSION_SYMBOL. KMP_VERSION_SYMBOL is a combination of xaliasify and
xversionify.

2) Put all attribute and __declspec definitions in kmp_os.h

Differential Revision: https://reviews.llvm.org/D39516

llvm-svn: 317636
2017-11-07 23:32:13 +00:00
Jonas Hahnfeld
13dc13ef09 [OMPT] Improve cast that was lost on commit, NFC.
llvm-svn: 317480
2017-11-06 14:33:09 +00:00
Joachim Protze
cab9cdc2ad Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)
The TR6 document is expected to be publically released around November 15.
This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D39182

llvm-svn: 317436
2017-11-05 14:11:19 +00:00
Joachim Protze
c255ca70ce Rename fields of ompt_frame_t
This is part of the renaming of data types from OpenMP TR4 to TR6

Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D39326

llvm-svn: 317435
2017-11-05 14:11:10 +00:00
Jonas Hahnfeld
b71424fda5 Revert "Rename fields of ompt_frame_t"
This reverts commit r317338 which discarded some recent commits.

llvm-svn: 317347
2017-11-03 18:28:25 +00:00
Jonas Hahnfeld
f0a1c65fb0 Revert "Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)"
This reverts commit r317339 which discarded some recent commits.

llvm-svn: 317346
2017-11-03 18:28:19 +00:00
Joachim Protze
924cff0a39 Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)
The TR6 document is expected to be publically released around November 15.
This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D39182

llvm-svn: 317339
2017-11-03 17:09:00 +00:00
Joachim Protze
741572593f Rename fields of ompt_frame_t
This is part of the renaming of data types from OpenMP TR4 to TR6

Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D39326

llvm-svn: 317338
2017-11-03 17:08:40 +00:00
Jonathan Peyton
3d18a37ca9 [OpenMP] Fix race condition in omp_init_lock
This is a partial fix for bug 34050.

This prevents callers of omp_set_lock (which does not hold __kmp_global_lock)
from ever seeing an uninitialized version of __kmp_i_lock_table.table.

It does not solve a use-after-free race condition if omp_set_lock obtains a
pointer to __kmp_i_lock_table.table before it is updated and then attempts to
dereference afterwards. That race is far less likely and can be handled in a
separate patch.

The unit test usually segfaults on the current trunk revision. It passes with
the patch.

Patch by Adam Azarchs

Differential Revision: https://reviews.llvm.org/D39439

llvm-svn: 317115
2017-11-01 19:44:42 +00:00
Joachim Protze
82e94a5934 Update implementation of OMPT to the specification OpenMP 5.0 Preview 1 (TR4).
The code is tested to work with latest clang, GNU and Intel compiler. The implementation
is optimized for low overhead when no tool is attached shifting the cost to execution with
tool attached.

This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D38185

llvm-svn: 317085
2017-11-01 10:08:30 +00:00
Jonathan Peyton
5e6cb9022c Fix fatal error message displaying
Replacing call to __kmp_msg(kmp_ms_fatal,...) with __kmp_fatal(...) caused an
issue when incomplete message is displayed in case an error message is followed
by another message (e.g. by a hint messa)ge. This is because __kmp_fatal()
passes incomplete list of arguments to __kmp_msg().

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D39248

llvm-svn: 316623
2017-10-25 22:05:02 +00:00
Jonathan Peyton
dff0ee2f4e Disable threadprivate data cleanup if runtime is terminating
The problem is due to the runtime's threadprivate cleanup code which tries to
access data that was already destroyed by one of the root threads.
__kmp_init_gtid is used as a checker here since it is set to false before actual
resource cleanup is done in __kmp_cleanup().

Patch by Hansang Bae

llvm-svn: 316452
2017-10-24 16:10:09 +00:00