llvm-project

Author	SHA1	Message	Date
Shilei Tian	5d07162bba	[OpenMP] Fix the test issue when `libomp` is built as a static library (#113522 )	2024-10-24 12:52:17 -04:00
Nikita Popov	4722c6b87c	[openmp] Use core_siblings_list if physical_package_id not available (#111831 ) On powerpc, physical_package_id may not be available. Currently, this causes openmp to fall back to flat topology and various affinity tests fail. Fix this by parsing core_siblings_list to determine which cpus belong to the same socket. This matches what the testing code does. The code to parse the CPU list format thankfully already exists. Fixes https://github.com/llvm/llvm-project/issues/111809.	2024-10-14 09:23:41 +02:00
Hansang Bae	9e0ee0e104	[OpenMP] Add support for pause with omp_pause_stop_tool (#97100 ) This patch adds support for pause resource with a new enumerator omp_pause_stop_tool. The expected behavior of this enumerator is * omp_pause_resource: not allowed * omp_pause_resource_all: equivalent to omp_pause_hard	2024-08-15 11:44:50 -05:00
Jonathan Peyton	2e57e63666	[OpenMP][libomp] Fix tasking debug assert (#95823 ) The debug assert is meant to check that the index is valid, which means the runtime needs to check against the size of the array instead of the number of threads. A free()-ed thread put back in the thread pool may potentially index anywhere inside the task team's available array, from 0 to tt_max_threads. Fixes: #94260	2024-07-24 12:25:21 -05:00
Michael Kruse	5c93a94f5a	[Clang][OpenMP] Add interchange directive (#93022 ) Add the interchange directive which will be introduced in the upcoming OpenMP 6.0 specification. A preview has been published in [Technical Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).	2024-07-19 09:24:40 +02:00
Michael Kruse	80865c01e1	[Clang][OpenMP] Add reverse directive (#92916 ) Add the reverse directive which will be introduced in the upcoming OpenMP 6.0 specification. A preview has been published in [Technical Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf). --------- Co-authored-by: Alexey Bataev <a.bataev@outlook.com>	2024-07-18 10:35:32 +02:00
Hansang Bae	7a72856af8	[OpenMP] Use new OMPT state and sync kinds for barrier events (#95602 ) This change makes the runtime use new OMPT state and sync kinds introduced in OpenMP 5.1 in place of the deprecated implicit state and sync kinds. Events from implicit barriers use different enumerators for workshare, parallel, and teams.	2024-07-16 09:52:20 -05:00
Joachim	a707d0883b	[OpenMP][OMPT] Indicate loop schedule for worksharing-loop events (#97429 ) Use more specific values from `ompt_work_t` to allow the tool to identify the schedule of a worksharing-loop. With this patch, the runtime will report the schedule chosen by the runtime rather than necessarily the schedule literally requested by the clause. E.g., for guided + just one iteration per thread, the runtime would choose and report static. Fixes issue #63904	2024-07-03 09:33:19 +02:00
Terry Wilmarth	ac9f06c2a8	[OpenMP] Fix test omp_parallel_num_threads_list.c to require fewer threads. (#96916 ) Original test case used too many threads for some environments. This update reduces to a max of 36 threads.	2024-06-28 00:17:11 +03:00
Terry Wilmarth	d30b082fd4	[OpenMP] Add num_threads clause list format and strict modifier support (#85466 ) Add support to the runtime for 6.0 spec features that allow num_threads clause to take a list, and also make use of the strict modifier. Provides new compiler interface functions for these features.	2024-06-24 15:39:18 -04:00
Michael Kruse	9120562dfc	[Clang][OpenMP] Enable tile/unroll on iterator- and foreach-loops (#91459 ) OpenMP loop transformation did not work on a for-loop using an iterator or range-based for-loops. The first reason is that it combined the iterator's type for generated loops with the type of `NumIterations` as generated for any `OMPLoopBasedDirective` which is an integer. Fixed by basing all generated loop variables on `NumIterations`. Second, C++11 range-based for-loops include syntactic sugar that needs to be executed before the loop. This additional code is now added to the construct's Pre-Init lists. Third, C++20 added an initializer statement to range-based for-loops which is also added to the pre-init statement. PreInits used to be a `DeclStmt` which made it difficult to add arbitrary statements from `CXXRangeForStmt`'s syntactic sugar, especially the for-loop's init statement which does not need to be a declaration. Change it to be a general `Stmt` that can be a `CompoundStmt` to hold arbitrary Stmts, including DeclStmts. This also avoids the `PointerUnion` workaround used by `checkTransformableLoopNest`. End-to-end tests are added to verify the expected number and order of loop execution and evaluations of expressions (such as iterator dereference). The order and number of evaluations of expressions in canonical loops is explicitly undefined by OpenMP but checked here for clarification and for changes to be noticed.	2024-05-22 14:30:31 +02:00
Michael Kruse	b0b6c16b47	[Clang][OpenMP][Tile] Allow non-constant tile sizes. (#91345 ) Allow non-constants in the `sizes` clause such as ``` #pragma omp tile sizes(a) for (int i = 0; i < n; ++i) ``` This is permitted since tile was introduced in [OpenMP 5.1](https://www.openmp.org/spec-html/5.1/openmpsu53.html#x78-860002.11.9). It is possible to sneak in negative numbers at runtime as in ``` int a = -1; #pragma omp tile sizes(a) ``` Even though it is not well-formed, it should still result in every loop iteration being executed exactly once, an invariant of the tile construct that we should ensure. `ParseOpenMPExprListClause` is extracted to be reused by the `permutation` clause of the `interchange` construct. Some care was put into ensuring correct behavior in template contexts.	2024-05-13 16:10:58 +02:00
chandan singh	2a57657d55	[OpenMP] [Flang] Resolved Issue llvm#76121: Implemented Check for Unhandled Arguments in __kmpc_fork_call_if (#82221 ) Root cause: The segmentation fault is caused by a null pointer dereference inside the __kmpc_fork_call_if function at https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/z_Linux_asm.S#L1186. __kmpc_fork_call_if is missing a case to handle argc=0. Fix: Added a check inside the __kmp_invoke_microtask function to handle the case when argc is 0. --------- Co-authored-by: Singh <chasingh@amd.com>	2024-05-09 11:11:04 +05:30
Jonathan Peyton	73bb8d9d92	[OpenMP] Fix child processes to use affinity_none (#91391 ) When a child process is forked with OpenMP already initialized, the child process resets its affinity mask and sets proc-bind-var to false so that the entire original affinity mask is used. This patch corrects an issue with the affinity initialization code setting affinity to compact instead of none for this special case of forked children. The test trying to catch this only tested an explicit setting of KMP_AFFINITY=none. Add a test run with no KMP_AFFINITY setting. Fixes: #91098	2024-05-08 09:23:50 -05:00
Jonathan Peyton	41ca9104ac	[OpenMP] Fix task state and taskteams for serial teams (#86859 ) * Serial teams now use a stack (similar to dispatch buffers) * Serial teams always use `t_task_team[0]` as the task team and the second pointer is a next pointer for the stack `t_task_team[1]` is interpreted as a stack of task teams where each level is a nested level ``` inner serial team outer serial team [ t_task_team[0] ] -> (task_team) [ t_task_team[0] ] -> (task_team) [ next ] ----------------> [ next ] -> ... ``` * Remove the task state memo stack from thread structure. * Instead of a thread-private stack, use team structure to store th_task_state of the primary thread. When coming out of a parallel, restore the primary thread's task state. The new field in the team structure doesn't cause sizeof(team) to change and is in the cache line which is only read/written by the primary thread. Fixes: #50602 Fixes: #69368 Fixes: #69733 Fixes: #79416	2024-05-07 08:41:51 -05:00
Xing Xue	0a8cd1ed1f	[OpenMP] Use half of available logical processors for collapse tests (#88319 ) The new collapse test cases define `MAX_THREADS` to be 256 and use all available threads/logical processors on the system. This triples the testing time on an AIX machine that has 128 logical processors. This patch changes the tests to use half of the available logical processors to avoid oversubscribing, because there are other libomp tests running at the same time, including 2 other such collapse tests.	2024-04-19 09:08:31 -04:00
Xing Xue	22bba85d82	[OpenMP][test][AIX] Make 64 the max number of threads for capacity tests in AIX 32-bit (#88739 ) This patch makes 64 the max number of threads for 2 capacity tests in AIX 32-bit mode rather than `XFAIL`ing them.	2024-04-16 13:14:29 -04:00
Xing Xue	b3792ae42a	[OpenMP][AIX] Fix test config for AIX (#88272 ) This patch fixes the test config so that it works for `tasking/omp50_taskdep_depobj.c` which uses different flags to test with compiler's `omp.h`. * set test environment variable `OBJECT_MODE` to `64` if it is set explicitly to `64` in the AIX environment. `OBJECT_MODE` defaults to `32` and is recognized by AIX compilers and toolchain. In this way, we don't need to set `-m64` for all compiler flags for 64-bit mode * add option `-Wl,-bmaxdata` to 32-bit `test_openmp_flags` used by `tasking/omp50_taskdep_depobj.c`	2024-04-10 16:06:31 -04:00
Jonathan Peyton	eeaaf33fc2	[OpenMP] Unsupport absolute KMP_HW_SUBSET test for s390x (#87555 )	2024-04-04 13:54:40 -05:00
Jonathan Peyton	2ff3850ea1	[OpenMP] Add absolute KMP_HW_SUBSET functionality (#85326 ) Users can put a : in front of KMP_HW_SUBSET to indicate that the specified subset is an "absolute" subset. Currently, when a user puts KMP_HW_SUBSET=1t, this gets translated to KMP_HW_SUBSET="*s,*c,1t", where * means "use all of". If a user wants only one thread as the entire topology they can now do KMP_HW_SUBSET=:1t. Along with the absolute syntax is a fix for newer machines and making them easier to use with only the 3-level topology syntax. When a user puts KMP_HW_SUBSET=1s,4c,2t on a machine which actually has 4 layers, (say 1s,2m,3c,2t as the entire machine) the user gets an unexpected "too many resources asked" message because KMP_HW_SUBSET currently translates the "4c" value to mean 4 cores per module. To help users out, the runtime can assume that these newer layers, module in this case, should be ignored if they are not specified, but the topology should always take into account the sockets, cores, and threads layers.	2024-04-03 11:43:23 -05:00
Jonathan Peyton	4ea24946e3	[OpenMP] Fix nested parallel with tasking (#87309 ) When a nested parallel region ends, the runtime calls __kmp_join_call(). During this call, the primary thread of the nested parallel region will reset its tid (retval of omp_get_thread_num()) to what it was in the outer parallel region. A data race occurs with the current code when another worker thread from the nested inner parallel region tries to steal tasks from the primary thread's task deque. The worker thread reads the tid value directly from the primary thread's data structure and may read the wrong value. This change just uses the calculated victim_tid from execute_tasks() directly in the steal_task() routine rather than reading tid from the data structure. Fixes: #87307	2024-04-02 15:56:50 -05:00
nihui	c5bbdb6494	[OpenMP] arm64_32 port for Apple WatchOS (#87246 ) * detect `aarch64_32` with compiler defined macro `__ARM64_ARCH_8_32__` * reuse ARM `__kmp_unnamed_critical_addr` and add `KMP_PREFIX_UNDERSCORE` macro like AARCH64 * reuse AARCH64 `__kmp_invoke_microtask` Build log for watchos armv7k + arm64_32 and watchos simulator x86_64 + arm64: https://github.com/nihui/action-protobuf/actions/runs/8520684611/job/23337305030	2024-04-02 11:38:32 -04:00
Jonathan Peyton	038e66fe59	[OpenMP] Have hidden helper team allocate new OS threads only (#87119 ) The hidden helper team pre-allocates the gtid space [1, num_hidden_helpers] (inclusive). If regular host threads are allocated, then put back in the thread pool, then the hidden helper team is initialized, the hidden helper team tries to allocate the threads from the thread pool with gtids higher than [1, num_hidden_helpers]. Instead, have the hidden helper team fork OS threads so the correct gtid range is used for hidden helper threads. Fixes: #87117	2024-03-29 17:26:00 -05:00
Vadim Paretsky	7db4046322	[OpenMP] add loop collapse tests (#86243 ) This PR adds loop collapse tests ported from MSVC. --------- Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>	2024-03-26 16:41:31 -07:00
Xing Xue	d394f3a162	[OpenMP][AIX] Affinity implementation for AIX (#84984 ) This patch implements `affinity` for AIX, which is quite different from platforms such as Linux. - Setting CPU affinity through masks and related functions is not supported. System call `bindprocessor()` is used to bind a thread to one CPU per call. - There are no system routines to get the affinity info of a thread. The implementation of `get_system_affinity()` for AIX gets the mask of all available CPUs, to be used as the full mask only. - Topology is not available from the file system. It is obtained through system SRAD (Scheduler Resource Allocation Domain). This patch has run through the libomp LIT tests successfully with `affinity` enabled.	2024-03-22 15:25:08 -04:00
Brad Smith	c7de4a39d5	[OpenMP] Enable the affinity tests on FreeBSD, NetBSD and DragonFly (#85500 ) FreeBSD, NetBSD and DragonFly also have affinity support. So enable the tests there as well.	2024-03-19 13:29:19 -04:00
Vadim Paretsky	110141b378	[OpenMP] fix endianness dependent definitions in OMP headers for MSVC (#84540 ) MSVC does not define __BYTE_ORDER__ making the check for BigEndian erroneously evaluate to true and breaking the struct definitions in MSVC compiled builds correspondingly. The fix adds an additional check for whether __BYTE_ORDER__ is defined by the compiler to fix these. --------- Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>	2024-03-09 10:47:31 -08:00
vadikp-intel	fcd2d48325	[OpenMP] runtime support for efficient partitioning of collapsed triangular loops (#83939 ) This PR adds OMP runtime support for more efficient partitioning of certain types of collapsed loops that can be used by compilers that support loop collapsing (e.g. MSVC) to achieve better thread load balancing. In particular, this PR addresses double nested upper and lower isosceles triangular loops of the following types 1. lower triangular 'less_than' for (int i=0; i<N; i++) for (int j=0; j<i; j++) 2. lower triangular 'less_than_equal' for (int i=0; i<N; i++) for (int j=0; j<=i; j++) 3. upper triangular for (int i=0; i<N; i++) for (int j=i; j<N; j++) Includes tests for the three supported loop types. --------- Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>	2024-03-07 16:28:03 -08:00
Jonathan Peyton	0e0bee26e7	[OpenMP] Fix distributed barrier hang for OMP_WAIT_POLICY=passive (#83058 ) The resume thread logic inside __kmp_free_team() is faulty. Only checking b_go for sleep status doesn't wake up distributed barrier. Change to generic check for th_sleep_loc and calling __kmp_null_resume_wrapper(). Fixes: #80664	2024-02-27 14:15:48 -06:00
Xing Xue	a4dcfbcb78	[OpenMP][AIX] XFAIL capacity tests on AIX in 32-bit (#83014 ) This patch XFAILs two capacity tests on AIX in 32-bit because they run out of resources with `4 x omp_get_max_threads()` in 32-bit mode.	2024-02-26 13:13:05 -05:00
Martin Storsjö	4b9c089381	[OpenMP] [test] Skip the -mlong-double-80 test on MSVC ABI (#81115 ) Within the MSVC ABI, long doubles are the same as regular 64 bit doubles. This test case, which is compiled with -mlong-double-80, cannot work when libomp has been compiled without that flag, as -mlong-double-80 changes the calling convention for the tested functions.	2024-02-19 11:33:28 +02:00
Xing Xue	7a9b0e4acb	[OpenMP][test]Flip bit-fields in 'struct flags' for big-endian in test cases (#79895 ) This patch flips bit-fields in `struct flags` for big-endian in test cases to be consistent with the definition of the structure in libomp `kmp.h`.	2024-02-07 15:24:52 -05:00
Xing Xue	2edce427a8	[openmp][AIX]Initial changes for porting to AIX (#76841 ) This PR contains initial changes for building and testing libomp on AIX. More changes will follow. - `KMP_OS_AIX` is defined for the AIX platform - `KMP_ARCH_PPC` is defined for 32-bit PPC - `KMP_ARCH_PPC_XCOFF` and `KMP_ARCH_PPC64_XCOFF` are for 32- and 64-bit XCOFF object formats respectively - Assembly file `z_AIX_asm.S` is used for AIX specific assembly code and will be added in a separate PR - The target library is disabled because AIX does not have the device support - OMPT is temporarily disabled	2024-01-08 08:33:00 -05:00
Carlos Eduardo Seo	dcd7c8b7c9	[OpenMP][AArch64] Workaround for ompt/synchronization tests (#75848 ) ompt/synchronization/[masked.c | master.c] tests fail due to a wrong offset being calculated for the possible return addresses. PR #65936 fixes this for Darwin and the same has to be done for Linux. Updates #69627	2023-12-19 19:26:23 +01:00
Sandeep Kosuri	ecc080c07d	[OpenMP] return empty stmt for `nothing` (#74042 ) - The `nothing` directive was affecting the `if` block structure, which it should not. So return an empty statement instead of an error statement while parsing to avoid this.	2023-12-03 13:33:38 +05:30
Alex	d6f00654fb	[OpenMP][Runtime][test] Fix ompt task testcase fail randomly (#72337 ) Fixed #72231	2023-11-28 14:22:57 +01:00
Lixi Zhou	a3c0f705db	[NFC] fix failed ompt tests on M1 device (#65696 ) Fix the 2 failed ompt tests on M1 device found on #63194. ``` libomp :: ompt/synchronization/masked.c libomp :: ompt/synchronization/master.c ``` For the details of this fix, please check the original discussion in https://github.com/llvm/llvm-project/issues/63194#issuecomment-1710494689 Thanks @jprotze for the fix.	2023-11-24 23:40:14 +01:00
Joachim Jenke	f5e50b21da	[OpenMP] Optimized trivial multiple edges from task dependency graph From "3.1 Reducing the number of edges" of this [paper](https://hal.science/hal-04136674v1/) - Optimization (b) Task (dependency) nodes have a `successors` list built upon passed dependency. Given the following code, B will be added to A's successors list building the graph `A` -> `B` ``` // A # pragma omp task depend(out: x) {} // B # pragma omp task depend(in: x) {} ``` In the following code, B is currently added twice to A's successor list ``` // A # pragma omp task depend(out: x, y) {} // B # pragma omp task depend(in: x, y) {} ``` This patch removes such duplicates by checking the most recently inserted task in `A`'s successor list. Authored by: Romain Pereira (rpereira-dev) Differential Revision: https://reviews.llvm.org/D158544	2023-11-21 18:36:12 +01:00
Jonathan Peyton	5cc603cb22	[OpenMP] Add skewed iteration distribution on hybrid systems (#69946 ) This commit adds skewed distribution of iterations in nonmonotonic:dynamic schedule (static steal) for hybrid systems when thread affinity is assigned. Currently, it distributes the iterations at 60:40 ratio. Consider this loop with dynamic schedule type, for (int i = 0; i < 100; ++i). In a hybrid system with 20 hardware threads (16 CORE and 4 ATOM core), 88 iterations will be assigned to performance cores and 12 iterations will be assigned to efficient cores. Each thread with CORE core will process 5 iterations + extras and with ATOM core will process 3 iterations. Differential Revision: https://reviews.llvm.org/D152955	2023-11-08 10:19:37 -06:00
Neale Ferguson	1111ef0257	Add openmp support to System z (#66081 ) * openmp/README.rst - Add s390x to those platforms supported * openmp/libomptarget/plugins-nextgen/CMakeLists.txt - Add s390x subdirectory * openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt - Add s390x definitions * openmp/runtime/CMakeLists.txt - Add s390x to those platforms supported * openmp/runtime/cmake/LibompGetArchitecture.cmake - Define s390x ARCHITECTURE * openmp/runtime/cmake/LibompMicroTests.cmake - Add dependencies for System z (aka s390x) * openmp/runtime/cmake/LibompUtils.cmake - Add S390X to the mix * openmp/runtime/cmake/config-ix.cmake - Add s390x as a supported LIBOMP_ARCH * openmp/runtime/src/kmp_affinity.h - Define __NR_sched_[get|set]affinity for s390x * openmp/runtime/src/kmp_config.h.cmake - Define CACHE_LINE for s390x * openmp/runtime/src/kmp_os.h - Add KMP_ARCH_S390X to support checks * openmp/runtime/src/kmp_platform.h - Define KMP_ARCH_S390X * openmp/runtime/src/kmp_runtime.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/src/kmp_tasking.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h - Define ITT_ARCH_S390X * openmp/runtime/src/z_Linux_asm.S - Instantiate __kmp_invoke_microtask for s390x * openmp/runtime/src/z_Linux_util.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/test/ompt/callback.h - Define print_possible_return_addresses for s390x * openmp/runtime/tools/lib/Platform.pm - Return s390x as platform and host architecture * openmp/runtime/tools/lib/Uname.pm - Set hardware platform value for s390x	2023-11-03 12:42:55 +01:00
Ilya Leoshkevich	77c2b623ca	[OpenMP][Tests] Sync struct DEP with the runtime (#69982 ) struct DEP defined in multiple testcases must correspond to runtime's struct kmp_depend_info. The former defines flags as int, and the latter as kmp_uint8_t. This discrepancy goes unnoticed on little-endian systems, but breaks big-endian ones. Make flags in struct DEP unsigned char.	2023-10-24 19:40:08 +02:00
Kazushi Marukawa	e8679b93da	[OpenMP][test][VE] Limit the number of AFFINITY_MAX_CPUS for VE (#65872 ) Limit the number of AFFINITY_MAX_CPUS for VE because VE's sched_getaffinity doesn't work correctly with large sized mask buffer.	2023-09-12 23:45:56 +09:00
Kazushi Marukawa	f8efa65ca5	[OpenMP][test][VE] Change to use VE_LD_LIBRARY_PATH for VE (#65869 ) Change to use VE_LD_LIBRARY_PATH for VE instead of LD_LIBRARY_PATH. The VE is connected to the host, and test programs compiled for VE are invoked on the host and transferred to the VE. If programs are compiled for the host, we use LD_LIBRARY_PATH. Otherwise, we use VE_LD_LIBRARY_PATH.	2023-09-10 12:07:16 +09:00
Kazushi (Jam) Marukawa	18b6724355	[OpenMP][VE] Support OpenMP runtime on VE Support OpenMP runtime library on VE. This patch makes OpenMP compilable for VE architecture. Almost all tests run correctly on VE. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D159401	2023-09-10 08:29:53 +09:00
Martin Storsjö	c2019c416c	[OpenMP] [test] Fix target_thread_limit.cpp to not assume 4 or more cores Previously, the test ran a section with #pragma omp target thread_limit(4) and expected it to execute exactly 4 times, even though it would in practice execute min(cores, 4) times. Increment a counter and check that it executed 1-4 times. Differential Revision: https://reviews.llvm.org/D159311	2023-09-01 21:16:58 +03:00
Sandeep Kosuri	08bbff4aad	[OpenMP] Codegen support for thread_limit on target directive for host offloading - This patch adds support for thread_limit clause on target directive according to OpenMP 5.1 [2.14.5] - The idea is to create an outer task for target region, when there is a thread_limit clause, and manipulate the thread_limit of task instead. This way, thread_limit will be applied to all the relevant constructs enclosed by the target region. Differential Revision: https://reviews.llvm.org/D152054	2023-08-26 22:18:49 -05:00
Jonathan Peyton	b34c7d8c8e	[OpenMP] Introduce hybrid core attributes to OMP_PLACES and KMP_AFFINITY * Add KMP_CPU_EQUAL and KMP_CPU_ISEMPTY to affinity mask API * Add printout of leader to hardware thread dump * Allow OMP_PLACES to restrict fullMask This change fixes an issue with the OMP_PLACES=resource(#) syntax. Before this change, specifying the number of resources did NOT change the default number of threads created by the runtime. e.g., OMP_PLACES=cores(2) would still create __kmp_avail_proc number of threads. After this change, the fullMask and __kmp_avail_proc are modified if necessary so that the final place list dictates which resources are available and thus how many threads are created by default. * Introduce hybrid core attributes to OMP_PLACES and KMP_AFFINITY For OMP_PLACES, two new features are added: 1) OMP_PLACES=cores:<attribute> where <attribute> is either intel_atom, intel_core, or eff# where # is 0 - number of core efficiencies-1. This syntax also supports the optional (#) number selection of resources. 2) OMP_PLACES=core_types|core_effs where this setting will create the number of core_types (or core_effs|core_efficiencies). For KMP_AFFINITY, the granularity setting is expanded to include two new keywords: core_type, and core_eff (or core_efficiency). This will set the granularity to include all cores with a particular core type (or efficiency). e.g., KMP_AFFINITY=granularity=core_type,compact will create threads which can float across a single core type. Differential Revision: https://reviews.llvm.org/D154547	2023-07-31 13:55:32 -05:00
Joachim Jenke	81bc7cf609	[OpenMP][NFC] lit: Allow setting default environment variables for test Add a CHECK_OPENMP_ENV environment variable whose contents are passed as environment variables to the tests (make check-* targets). This provides a handy way to exercise various openmp code with different settings during development. For example, to change default barrier pattern: ``` $ env CHECK_OPENMP_ENV="KMP_FORKJOIN_BARRIER_PATTERN=hier,hier \ KMP_PLAIN_BARRIER_PATTERN=hier,hier \ KMP_REDUCTION_BARRIER_PATTERN=hier,hier" \ ninja check-openmp ``` Even with this, each test can set appropriate environment variables if needed as before. Also, this commit adds missing documentation about how to run tests in README. Patch provided by t-msn Differential Revision: https://reviews.llvm.org/D122645	2023-07-11 15:00:40 +02:00
Joachim Jenke	124d36e093	[OpenMP][OMPT] Change OMPT kind for OpenMP test lock functions The OpenMP specification mentions that omp_test_lock and omp_test_nest_lock dispatch OMPT callbacks with ompt_mutex_test_lock and ompt_mutex_test_nest_lock for their kind respectively. Previously, the values ompt_mutex_lock and ompt_mutex_nest_lock were used. This could cause issues in applications relying on the kind to correctly determine lock states. This commit changes the kind to the expected ones. Also update callback.h and OMPT tests to reflect this change. Patch prepared by Thyre Differential Review: https://reviews.llvm.org/D153028 Differential Review: https://reviews.llvm.org/D153031 Differential Review: https://reviews.llvm.org/D153032	2023-07-07 14:49:47 +02:00
Joachim Jenke	6ef16f2618	[OpenMP] Add OMPT support for omp_all_memory task dependence omp_all_memory currently has no representation in OMPT. Adding new dependency flags as suggested by omp-lang issue #3007. Differential Revision: https://reviews.llvm.org/D111788	2023-07-07 13:44:53 +02:00
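
The duplicate-edge check described in commit f5e50b21da (compare each new successor against the most recently inserted one) can be sketched in C as follows. This is a minimal illustration under stated assumptions, not libomp's code: `task_node`, `MAX_SUCCESSORS`, and `add_successor` are hypothetical names, and the fixed-size array stands in for whatever successor list the runtime actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed capacity, chosen only to keep the sketch small. */
#define MAX_SUCCESSORS 8

/* Hypothetical stand-in for a task dependency node; not libomp's type. */
typedef struct task_node {
  const char *name;
  struct task_node *successors[MAX_SUCCESSORS];
  size_t num_successors;
} task_node;

/* Record the edge pred -> succ unless succ was the last successor inserted.
   Returns true if a new edge was stored, false if it was a duplicate. */
static bool add_successor(task_node *pred, task_node *succ) {
  if (pred->num_successors > 0 &&
      pred->successors[pred->num_successors - 1] == succ)
    return false; /* same edge as the previous insertion: skip it */
  assert(pred->num_successors < MAX_SUCCESSORS);
  pred->successors[pred->num_successors++] = succ;
  return true;
}
```

With `depend(out: x, y)` on A and `depend(in: x, y)` on B, the dependence on `y` triggers a second `add_successor(&a, &b)` call, which returns `false`, so A keeps a single edge to B. Checking only the last entry keeps the test constant-time, at the cost of missing duplicates that are not adjacent in the list.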

428 Commits