59 Commits

Author SHA1 Message Date
Andrey Churbanov
5ba90c7979 OpenMP RTL cleanup: eliminated warnings with -Wcast-qual, patch 2.
Changes are: all atomics now accept volatile pointers, which allowed many
type conversions to be simplified. Windows-specific code fixed correspondingly.

Differential Revision: https://reviews.llvm.org/D35417

llvm-svn: 308164
2017-07-17 09:03:14 +00:00
Andrey Churbanov
c47afcd9bb OpenMP RTL cleanup: eliminated warnings with -Wcast-qual.
Changes are: replaced C-style casts with const_cast and reinterpret_cast;
type of several counters changed to signed; type of parameters of 32-bit and
64-bit AND and OR intrinsics changed to unsigned; changed files formatted
using clang-format version 3.8.1.

Differential Revision: https://reviews.llvm.org/D34759

llvm-svn: 307020
2017-07-03 11:24:08 +00:00
Jonathan Peyton
642688b632 Fix minor formatting issues
Some code was restructured to move it under KMP_DEBUG.  The rest is
formatting changes to fix some things broken by clang-format

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D33744

llvm-svn: 304438
2017-06-01 16:46:36 +00:00
Jonathan Peyton
e3e2aaf68d Fix for KMP_AFFINITY=disabled and KMP_TOPOLOGY_METHOD=hwloc
With these settings, the create_hwloc_map() method was being called causing an
assert(). After some consideration, it was determined that disabling affinity
explicitly should just disable hwloc as well. i.e., KMP_AFFINITY overrides
KMP_TOPOLOGY_METHOD. This lets the user know that the Hwloc mechanism is being
ignored when KMP_AFFINITY=disabled.

Differential Revision: https://reviews.llvm.org/D33208

llvm-svn: 304344
2017-05-31 20:35:22 +00:00
Jonathan Peyton
586849918b Fix for KMP_AFFINITY=respect with multiple processor groups
An assert() was being tripped when KMP_AFFINITY=respect was used with multiple
processor groups. The __kmp_affinity_create_proc_group_map() function can now
create an address2os object that contains a single group; the restriction that
the process affinity mask must span multiple groups has been removed.

llvm-svn: 303101
2017-05-15 19:05:59 +00:00
Jonathan Peyton
3041982dd1 Clang-format and whitespace cleanup of source code
This patch contains the clang-format and cleanup of the entire code base. Some
of clang-format's changes made the code look worse in places. A best effort was
made to resolve the bulk of these problems, but many remain. Most of the
problems were mangling line-breaks and tabbing of comments.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D32659

llvm-svn: 302929
2017-05-12 18:01:32 +00:00
Jonathan Peyton
20e13d4a38 Fix Hwloc API Incompatibility
Older Hwloc libraries (< 1.10.0) offer neither the HWLOC_OBJ_NUMANODE nor the
HWLOC_OBJ_PACKAGE type. They are named HWLOC_OBJ_NODE and HWLOC_OBJ_SOCKET
instead. This patch just defines the newer names based on the older names when
using an older Hwloc.

Differential Revision: https://reviews.llvm.org/D32496

llvm-svn: 301349
2017-04-25 19:04:07 +00:00
Andrey Churbanov
4a9a89241b KMP_HW_SUBSET extended with NUMA support when HWLOC enabled
Differential Revision: https://reviews.llvm.org/D31600

llvm-svn: 300220
2017-04-13 17:15:07 +00:00
Jonathan Peyton
16fd8fec76 Fix incorrect initial value of __kmp_affinity_type.
Affinity initialization code expects __kmp_affinity_type to have the value
affinity_default by default, but the cleanup code does not properly set the
value back to affinity_default.  This may introduce some issues when multiple
roots are trying to initialize/uninitialize the runtime successively.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D31012

llvm-svn: 298313
2017-03-20 22:04:02 +00:00
Jonathan Peyton
3061e3e454 Printing OS thread id, when KMP_AFFINITY is set.
Patch by Vishakha Agrawal

Differential Revision: https://reviews.llvm.org/D28873

llvm-svn: 293315
2017-01-27 18:04:33 +00:00
Jonas Hahnfeld
c9a8a6c030 kmp_affinity: Fix check if specific bit is set
Clang 4.0 trunk warns:
warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses]

This points to a potential bug if the code really wants to check if the single
bit is not set: If for example (buf.edx >> 9) = 2 (has any bit set except the
least significant one), 'logical not' will return 0 which stays 0 after the
'bitwise and'.
To do this correctly we first need to evaluate the 'bitwise and'. In that case
it returns 2 & 1 = 0 which after the 'logical not' evaluates to 1.
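
The precedence issue described above can be reproduced in isolation. The following is a minimal sketch, not the code from kmp_affinity; the function names are hypothetical, and the buggy form is written with explicit parentheses only to show how the unparenthesized expression parses.

```cpp
#include <cassert>
#include <cstdint>

// How the original expression parses: '!' binds tighter than '&',
// so `!(edx >> 9) & 1` means `(!(edx >> 9)) & 1`.
constexpr bool bit_clear_buggy(std::uint32_t edx) {
  return (!(edx >> 9)) & 1;
}

// Corrected form: evaluate the bitwise AND first, then negate.
constexpr bool bit_clear_fixed(std::uint32_t edx) {
  return !((edx >> 9) & 1);
}

// Mirrors the example from the message: (edx >> 9) == 2, so bit 9 of edx
// is clear while a higher bit is set.
void demo() {
  const std::uint32_t edx = 2u << 9;
  assert(!bit_clear_buggy(edx)); // wrongly reports "bit is set"
  assert(bit_clear_fixed(edx));  // correctly reports "bit is clear"
}
```

Both forms agree when the shifted value is 0 or 1, which is likely why the bug could go unnoticed.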

Differential Revision: https://reviews.llvm.org/D28599

llvm-svn: 291764
2017-01-12 11:39:04 +00:00
Jonathan Peyton
1cdd87adfd Introduce dynamic affinity dispatch capabilities
This set of changes enables the affinity interface (either the preexisting
native operating system or HWLOC) to be dynamically set at runtime
initialization. The point of this change is that we were seeing performance
degradations when using HWLOC. This allows the user to use the old affinity
mechanisms which on large machines (>64 cores) makes a large difference in
initialization time.

These changes mostly move affinity code under a small class hierarchy:

KMPAffinity
  class Mask {}
KMPNativeAffinity : public KMPAffinity
  class Mask : public KMPAffinity::Mask
KMPHwlocAffinity : public KMPAffinity
  class Mask : public KMPAffinity::Mask

Since all interface functions (for both affinity and the mask implementation)
are virtual, the implementation can be chosen at runtime initialization.
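
As a rough illustration of choosing an implementation at runtime initialization through virtual functions, here is a minimal sketch. The class and method names below (Affinity, NativeAffinity, HwlocAffinity, name, pick_affinity) are simplified stand-ins assumed for this example; they are not the actual libomp interface.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical, simplified base interface; every operation is virtual.
class Affinity {
public:
  virtual ~Affinity() = default;
  virtual std::string name() const = 0;
};

// Stand-in for the native operating system mechanism.
class NativeAffinity : public Affinity {
public:
  std::string name() const override { return "native"; }
};

// Stand-in for the hwloc-backed mechanism.
class HwlocAffinity : public Affinity {
public:
  std::string name() const override { return "hwloc"; }
};

// The choice is made once, at initialization; callers only see Affinity.
std::unique_ptr<Affinity> pick_affinity(bool use_hwloc) {
  if (use_hwloc)
    return std::unique_ptr<Affinity>(new HwlocAffinity());
  return std::unique_ptr<Affinity>(new NativeAffinity());
}

void demo() {
  assert(pick_affinity(false)->name() == "native");
  assert(pick_affinity(true)->name() == "hwloc");
}
```

The trade-off is one virtual call per operation in exchange for not having to fix the mechanism at build time.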

Differential Revision: https://reviews.llvm.org/D26356

llvm-svn: 286890
2016-11-14 21:08:35 +00:00
Jonathan Peyton
7c465a5f41 Fix bitmask upper bounds check
Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we
use the get_max_proc() function which can vary based on the operating system.
For example on Windows with multiple processor groups, it might be the case that
the highest bit possible in the bitmask is not equal to the number of hardware
threads on the machine but something higher than that.

Differential Revision: https://reviews.llvm.org/D24206

llvm-svn: 281245
2016-09-12 19:02:53 +00:00
Jonathan Peyton
e6abe52905 Move function into cpp file under KMP_AFFINITY_SUPPORTED guard.
When affinity isn't supported, __kmp_affinity_compact doesn't exist.  The
problem is that in kmp_affinity.h there is a function which uses it without the
proper KMP_AFFINITY_SUPPORTED guard around it.  The compiler was smart enough to
ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on
it, but I think it is cleaner to have it under the proper guard.  Since the
function is only used in the kmp_affinity.cpp file and there aren't any plans to
have it elsewhere, I have moved it there.

llvm-svn: 280542
2016-09-02 20:54:58 +00:00
Jonathan Peyton
788c5d65e8 Replace a bad instance of __kmp_free() with KMP_CPU_FREE_ARRAY() macro.
llvm-svn: 280530
2016-09-02 19:37:12 +00:00
Andrey Churbanov
5bf494e73d Fixed x2APIC discovery for 256-processor architectures.
Mask for value read from ebx register returned by CPUID expanded to 0xFFFF.
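
A small sketch of why the mask width matters: CPUID leaf 0xB reports the number of logical processors at the queried level in EBX[15:0]. The narrower 0xFF mask used for comparison is an assumption about the earlier code, and the names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNarrowMask = 0xFF;  // assumed earlier mask (hypothetical)
constexpr std::uint32_t kWideMask = 0xFFFF;  // mask named in the commit message

// Extracts the logical-processor count from a raw EBX value.
constexpr std::uint32_t procs_at_level(std::uint32_t ebx, std::uint32_t mask) {
  return ebx & mask;
}

void demo() {
  // With 256 logical processors, an 8-bit mask wraps the count to zero.
  assert(procs_at_level(256u, kNarrowMask) == 0u);
  assert(procs_at_level(256u, kWideMask) == 256u);
}
```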

Differential Revision: https://reviews.llvm.org/D23203

llvm-svn: 277825
2016-08-05 15:59:11 +00:00
Paul Osmialowski
ecbe2ea002 Make balanced affinity work on AArch64.
This patch enables balanced affinity on machines that do not have
hardware threads and have cores clustered into packages. In fact, the
balancing algorithm could be generalized for any arrangement with
at least two levels of hierarchy (depth > 1).

Differential Revision: https://reviews.llvm.org/D22365

llvm-svn: 277212
2016-07-29 20:55:03 +00:00
Andrey Churbanov
cb28d6e3a0 D22136: Memory leaks fixed by adding missed __kmp_free() calls
llvm-svn: 274850
2016-07-08 14:40:20 +00:00
Jonathan Peyton
fd7cc42fed Improvements to process affinity mask setting
A couple improvements:
1) Add ability to limit fullMask size when KMP_HW_SUBSET limits resources.
2) Make KMP_HW_SUBSET work for affinity_none, and only limit fullMask in this case.

Patch by Andrey Churbanov.

Differential Revision: http://reviews.llvm.org/D21528

llvm-svn: 273278
2016-06-21 15:54:38 +00:00
Jonathan Peyton
bf35771bcc Change hwloc discovery algorithm to print topology only for accessible resources
Change hwloc discovery algorithm to print topology for only accessible
resources, and report uniformity correspondingly, similar to what other topology
discovery algorithms do. Fixes a minor inconsistency between the total topology
reported and the resources used for thread binding when hwloc is used.

Patch by Andrey Churbanov.

Differential Revision: http://reviews.llvm.org/D21389

llvm-svn: 272952
2016-06-16 20:31:19 +00:00
Jonathan Peyton
72a8498e08 Fixed missing memory cleanup in __kmp_affinity_create_hwloc_map()
Cleanup: fixed missing memory cleanup in a couple of corner cases, which could
leak memory.

Patch by Andrey Churbanov

Differential Revision: http://reviews.llvm.org/D21355

llvm-svn: 272946
2016-06-16 20:14:54 +00:00
Jonathan Peyton
b9d28fbeb3 Deprecate KMP_PLACE_THREADS and rename as KMP_HW_SUBSET
Deprecate KMP_PLACE_THREADS and rename it to KMP_HW_SUBSET due to confusion
about its purpose and function among users.  KMP_HW_SUBSET is an environment
variable which allows users to easily pick a subset of the hardware topology to
use.  e.g., KMP_HW_SUBSET=30c,2t means use 30 cores, 2 threads per core.
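
To make the example value concrete, here is a minimal sketch of parsing a string such as 30c,2t. It handles only the two forms shown above (cores and threads per core) and is not the runtime's parser; HwSubset and parse_hw_subset are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

struct HwSubset {
  int cores = 0;            // value followed by 'c'
  int threads_per_core = 0; // value followed by 't'
};

// Parses comma-separated "<number><unit>" items; returns false on bad input.
bool parse_hw_subset(const std::string &s, HwSubset &out) {
  if (s.empty())
    return false;
  std::size_t i = 0;
  while (i < s.size()) {
    if (!std::isdigit(static_cast<unsigned char>(s[i])))
      return false;
    int n = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])))
      n = n * 10 + (s[i++] - '0');
    if (i == s.size())
      return false; // number without a unit letter
    const char unit = s[i++];
    if (unit == 'c')
      out.cores = n;
    else if (unit == 't')
      out.threads_per_core = n;
    else
      return false;
    if (i < s.size()) {
      if (s[i] != ',')
        return false;
      ++i; // skip the delimiter
    }
  }
  return true;
}

void demo() {
  HwSubset hs;
  const bool ok = parse_hw_subset("30c,2t", hs);
  assert(ok);
  assert(hs.cores == 30 && hs.threads_per_core == 2);
  (void)ok;
}
```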

Patch by Andrey Churbanov

Differential Revision: http://reviews.llvm.org/D21340

llvm-svn: 272937
2016-06-16 18:53:48 +00:00
Jonathan Peyton
c5304aa3c4 Affinity mask processing improvements
Remove static specifier from var fullMask and remove kmp_get_fullMask() routine.
When iterating through procs in a mask, always check if proc is in fullMask
(this check was missing in a few places).

Patch by Brian Bliss.

Differential Revision: http://reviews.llvm.org/D21300

llvm-svn: 272589
2016-06-13 21:28:03 +00:00
Jonathan Peyton
202a24dd9b Hwloc refactoring patch
These changes remove the hwloc_topology_ignore_type function which doesn't exist
in the hwloc 2.0 API. In the existing code, the topology extracted from hwloc
has the cache levels stripped out and then assumes the final stripped topology
follows the typical three-level topology: packages -> cores -> HW threads.
But the code is doing unclean manipulations to determine at what level those
resources are located and also assumes too much about what hwloc is detecting
(there could be intermediate levels in between socket and core for instance).
This new way of extracting the topology doesn't strip out any hardware objects
that hwloc detects. It does not assume the three level topology, and instead
searches for the relevant three levels within the topology for each bit of
information using hwloc interface functions. i.e., the three level topology
subset that our affinity code is interested in is extracted from the hwloc
topology tree directly.

For example, the new __kmp_hwloc_get_nobjs_under_obj function gives the user the
number of cores under a socket reliably without worrying if there are unexpected
objects between the socket object and core object in the hwloc topology
structure. Also, now that all topology information is kept, there are also
possibilities of using the caches/numa nodes to determine more sophisticated
affinity settings in the future.

There is also some cleanup code added for the destruction of the
__kmp_hwloc_topology object.

Differential Revision: http://reviews.llvm.org/D21195

llvm-svn: 272565
2016-06-13 17:30:08 +00:00
Jonathan Peyton
8407f5b3bd Remove architecture dependent Hwloc DEBUG section
This debug section's functionality can be replicated using the environment
variable KMP_TOPOLOGY_METHOD with different values and KMP_AFFINITY=verbose

llvm-svn: 267472
2016-04-25 21:11:26 +00:00
Jonathan Peyton
1d5487c5d0 Fix buffer problem with printing long Hwloc affinity mask
This change has the hwloc_bitmap_list_snprintf() function use the entire buffer
to print the mask.  There is no need to shorten the buffer length by 7.  It only
needs to be shortened by one byte.
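
The effect of holding back too much of the buffer can be sketched with std::snprintf, which is used here as a stand-in for hwloc_bitmap_list_snprintf so that the example builds without hwloc (an assumption of this sketch). The helper name print_mask is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Formats `text` into `buf` while holding back `reserve` bytes of the buffer.
// Returns the number of characters actually stored (excluding the NUL).
std::size_t print_mask(char *buf, std::size_t buf_len, std::size_t reserve,
                       const char *text) {
  std::snprintf(buf, buf_len - reserve, "%s", text);
  return std::strlen(buf);
}

void demo() {
  char buf[16];
  const char *mask = "0-3,8-11,16-19"; // 14 characters
  // Holding back 7 bytes leaves room for only 8 characters plus the NUL.
  assert(print_mask(buf, sizeof buf, 7, mask) == 8);
  // Holding back 1 byte still fits the whole 14-character list.
  assert(print_mask(buf, sizeof buf, 1, mask) == 14);
}
```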

llvm-svn: 267470
2016-04-25 21:08:31 +00:00
Jonathan Peyton
3076fa4c35 New API for restoring current thread's affinity to init affinity of application
This new API, int kmp_set_thread_affinity_mask_initial(), is available for use
by other parallel runtime libraries inside a possibly OpenMP-registered thread.
This entry point restores the current thread's affinity mask to the affinity
mask of the application when it first began. If -1 is returned it can be assumed
that either the thread hasn't called affinity initialization or that the thread
isn't registered with the OpenMP library. If 0 is returned, then the call
was successful. Any return value greater than zero indicates an error occurred
when setting affinity.
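
A caller might interpret the return value along these lines. This is a minimal sketch that only classifies an integer result according to the description above; it does not call the runtime, the RestoreStatus enum and classify_restore_result helper are hypothetical, and treating values below -1 as errors is an assumption.

```cpp
#include <cassert>

enum class RestoreStatus {
  Ok,             // 0: affinity mask restored
  NotInitialized, // -1: affinity not initialized or thread not registered
  Error           // > 0: setting affinity failed
};

// Maps the documented return values onto a status.
RestoreStatus classify_restore_result(int rc) {
  if (rc == 0)
    return RestoreStatus::Ok;
  if (rc == -1)
    return RestoreStatus::NotInitialized;
  return RestoreStatus::Error; // assumption: any other value is an error
}

void demo() {
  assert(classify_restore_result(0) == RestoreStatus::Ok);
  assert(classify_restore_result(-1) == RestoreStatus::NotInitialized);
  assert(classify_restore_result(22) == RestoreStatus::Error);
}
```

In real use, the argument would be the value returned by kmp_set_thread_affinity_mask_initial().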

Differential Revision: http://reviews.llvm.org/D15867

llvm-svn: 257489
2016-01-12 17:21:55 +00:00
Jonathan Peyton
01dcf36bd5 Adding Hwloc library option for affinity mechanism
These changes allow libhwloc to be used as the topology discovery/affinity
mechanism for libomp.  It is supported on Unices. The code additions:
* Canonicalize KMP_CPU_* interface macros so bitmask operations are
  implementation independent and work with both hwloc bitmaps and libomp
  bitmaps.  So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and
  the like. These are all in kmp.h and appropriately placed.
* Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc
  interface to create a libomp address2os object which the rest of libomp knows
  how to handle already.
* To build, use -DLIBOMP_USE_HWLOC=on and
  -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake
  can't find the library or hwloc.h, then it will tell you and exit.

Differential Revision: http://reviews.llvm.org/D13991

llvm-svn: 254320
2015-11-30 20:02:59 +00:00
Jonathan Peyton
7dee82e729 Improvements to machine_hierarchy code for re-sizing
These changes include:
 1) Machine hierarchy now uses the base_num_threads field to indicate the 
    maximum number of threads the current hierarchy can handle without a resize.
 2) In __kmp_get_hierarchy, we need to get depth after any potential resize
    is done.
 3) Cleanup of hierarchy resize code to support 1 above.

Differential Revision: http://reviews.llvm.org/D14455

llvm-svn: 252475
2015-11-09 16:24:53 +00:00
Jonathan Peyton
6778c73243 Fix OMP_PLACES negation operator parsing (!place)
Just moved the *scan++ line up before the recursive call.  Otherwise,
infinite recursion occurs and leads to a segmentation fault.
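
The ordering issue can be shown with a tiny recursive-descent sketch. The grammar (place := '!' place | digits), the Place struct, and parse_place are hypothetical simplifications for illustration, not the OMP_PLACES parser itself.

```cpp
#include <cassert>
#include <cctype>

struct Place {
  int id;
  bool negated;
};

// Parses one place and advances `scan` past it.
Place parse_place(const char *&scan) {
  if (*scan == '!') {
    ++scan; // consume '!' BEFORE recursing
    // Without the increment above, this call would see the same '!' again
    // and recurse until the stack overflows.
    Place p = parse_place(scan);
    p.negated = !p.negated;
    return p;
  }
  int id = 0;
  while (std::isdigit(static_cast<unsigned char>(*scan)))
    id = id * 10 + (*scan++ - '0');
  return Place{id, false};
}

void demo() {
  const char *s = "!12";
  const Place p = parse_place(s);
  assert(p.id == 12 && p.negated);
  assert(*s == '\0'); // the whole input was consumed
}
```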

llvm-svn: 250729
2015-10-19 19:43:01 +00:00
Jonathan Peyton
dd4aa9b6b5 Added sockets to the syntax of KMP_PLACE_THREADS environment variable.
Added (optional) sockets to the syntax of the KMP_PLACE_THREADS environment variable.
Some limitations:
* The number of sockets and then optional offset should be specified first (before other parameters).
* The letter designation is mandatory for sockets and then for other parameters.
* If number of cores is specified first, then the number of sockets is defaulted to all sockets on the machine; also, the old syntax is partially supported if sockets are skipped.
* If number of threads per core is specified first, then the number of sockets and cores per socket are defaulted to all sockets and all cores per socket respectively.
* The number of cores per socket cannot be specified before sockets or after threads per core.
* The number of threads per core can be specified before or after core-offset (old syntax required it to be before core-offset);
* Parameters delimiter can be: empty, comma, lower-case x;
* Spaces are allowed around numbers, around letters, around delimiter.
Approximate shorthand specification:
KMP_PLACE_THREADS="[num_sockets(S|s)[[delim]offset(O|o)][delim]][num_cores_per_socket(C|c)[[delim]offset(O|o)][delim]][num_threads_per_core(T|t)]"

Differential Revision: http://reviews.llvm.org/D13175

llvm-svn: 249708
2015-10-08 17:55:54 +00:00
Jonathan Peyton
7edeef1bbf Fix memory corruption in Windows debug library
This patch adjusts the buffer size when reducing the buffer used for printing.
This solves the memory corruption in Windows debug library, and potential
memory corruption in other builds.

llvm-svn: 248588
2015-09-25 17:23:17 +00:00
Jonathan Peyton
df4d3dd659 Fix depth field bug and resize() function in hierarchical barrier
This is a follow up to the hierarchy cleanup patch.
Added some clarifying comments to hierarchy_info.
Fixed a bug with the depth field not being updated cleanly during a resize.
Fixed resize to first check capacity as determined by maxLevels before actually doing the full resize.

Differential Revision: http://reviews.llvm.org/D12562

llvm-svn: 247333
2015-09-10 20:34:32 +00:00
Jonathan Peyton
1707836b68 Cleanup of affinity hierarchy code.
Some of this is improvement to code suggested by Hal Finkel. Four changes here:
1. Cleanup of hierarchy code to handle all hierarchy cases whether affinity is available or not
2. Separated this and other classes and common functions out to a header file
3. Added a destructor-like fini function for the hierarchy (and call in __kmp_cleanup)
4. Remove some redundant code that is hopefully no longer needed

Differential Revision: http://reviews.llvm.org/D12449

llvm-svn: 247326
2015-09-10 19:22:07 +00:00
Jonathan Peyton
62f3840c9b Fix machine topology pruning.
This patch fixes a bug when eliminating layers in the machine topology (namely
cores, and threads). Before this patch, if a user specifies using only one 
thread per socket, then affinity is not set properly due to bad topology
pruning.

Differential Revision: http://reviews.llvm.org/D11158

llvm-svn: 245966
2015-08-25 18:44:41 +00:00
Jonathan Peyton
7f09a98ab1 Allow machine hierarchy expansion
This fix allows the machine hierarchy to be expanded in case it needs to handle 
more threads. It adds a resize function to accomplish this.

Differential Revision: http://reviews.llvm.org/D9900

llvm-svn: 240292
2015-06-22 15:59:18 +00:00
Jonathan Peyton
7be075335d Re-enable Visual Studio Builds.
I tried to compile with Visual Studio using CMake and found these two sections of code 
causing problems for Visual Studio.  The first one removes the use of variable length 
arrays by instead using KMP_ALLOCA().  The second part eliminates a redundant cpuid 
assembly call by using the already existing __kmp_x86_cpuid() call instead.

llvm-svn: 240290
2015-06-22 15:53:50 +00:00
Jonathan Peyton
663382950d Apply name change to src/* files.
These changes are mostly in comments, but there are a few
that aren't.  Change libiomp5 => libomp everywhere.  One internal
function name is changed in kmp_gsupport.c, and in kmp_i18n.c, the
static char[] variable 'name' is changed to "libomp".

llvm-svn: 238712
2015-06-01 02:37:28 +00:00
Jonathan Peyton
caf09fe022 Fix comment about balanced affinity
A while back, Hal mentioned fixing a comment concerning balanced affinity.
http://lists.cs.uiuc.edu/pipermail/openmp-dev/2014-December/000358.html
I forgot about fixing it until now, but now is better than never.

llvm-svn: 238378
2015-05-27 23:27:33 +00:00
Andrey Churbanov
aa1f2b6306 Improved how generation of the hierarchy used by the hierarchical barrier reacts to affinity set to none or disabled, no affinity available, or oversubscription. Cleanup actions based on review comments to follow: use meaningful names instead of numeric constants, e.g. enumerators.
llvm-svn: 234775
2015-04-13 18:51:59 +00:00
Andrey Churbanov
74bf17b8ff Replace some unsafe API calls with safe alternatives on Windows, prepare code for similar actions on other platforms - wrap unsafe API calls into macros.
llvm-svn: 233915
2015-04-02 13:27:08 +00:00
Andrey Churbanov
1362ae750f Eliminated the write to the depth field of the machine_hierarchy data structure in __kmp_get_hierarchy(), thus fixing a race condition. Each thread now uses a local variable.
llvm-svn: 233914
2015-04-02 13:18:50 +00:00
Andrey Churbanov
16a1432176 issuing of incorrect warning fixed
llvm-svn: 231779
2015-03-10 09:34:38 +00:00
Andrey Churbanov
1f037e495a cleanup: usages of mask size wrapped into macros
llvm-svn: 231775
2015-03-10 09:15:26 +00:00
Andrey Churbanov
128755741f changed unsigned types to signed - caused by comments of Hal Finkel on one of earlier patches
llvm-svn: 231773
2015-03-10 09:00:36 +00:00
Andrey Churbanov
e4b9213f80 minor change: comment improved
llvm-svn: 231381
2015-03-05 17:46:50 +00:00
Andrey Churbanov
b41e62b713 Fixed memory corruption problem.
llvm-svn: 228736
2015-02-10 20:10:21 +00:00
Andrey Churbanov
5cd50e3c0a enable environment variable KMP_PLACE_THREADS also for non-MIC architectures
llvm-svn: 227467
2015-01-29 17:14:58 +00:00
Andrey Churbanov
4b2f17a1d3 fixing typo in error message
llvm-svn: 227451
2015-01-29 15:49:22 +00:00
Andrey Churbanov
d9e775edfc Comments only: removing the Revision and Date svn variables from the top of all the source files.
llvm-svn: 227207
2015-01-27 17:13:53 +00:00