llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	c5488c8dcc	[OpenMP] Properly set static thread limit (w/o analysis) We used to have two separate implementations to derive the number of threads used in a target region. This led us to sometimes miss user-provided thread bounds (num_threads or thread_limit) when we looked for "constant default values". If we might miss the presence of those bounds, we cannot set the thread_limit statically, since the runtime will try to honor user input rather than cap it at the "preferred default". This patch replaces the secondary implementation with the primary one, run in a mode that does not emit code but only looks for the presence, and potentially the upper bounds, of thread-limiting clauses. The runtime test would not pass without this rewrite: we missed some clauses, set the static limit on the device to the preferred value, and then violated that value at runtime. Fixes: https://github.com/llvm/llvm-project/issues/64845 Differential Revision: https://reviews.llvm.org/D158381	2023-08-23 11:12:03 -07:00
Itay Bookstein	782c59a4ee	[OpenMP] Prefix outlined and reduction func names with original func's name This patch prefixes omp outlined helpers and reduction funcs with the original function's name. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140722	2023-04-19 23:00:26 +03:00
Itay Bookstein	6fdd13e0ec	Revert "[OpenMP] Prefix outlined and reduction func names with original func's name" This reverts commit 029bfc311d4d7d3cd90be81bb08c046848796d02.	2023-04-19 19:08:49 +03:00
Itay Bookstein	029bfc311d	[OpenMP] Prefix outlined and reduction func names with original func's name This patch attempts to prefix omp outlined helpers and reduction funcs with the original function's name. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140722	2023-04-19 19:05:21 +03:00
Dhruva Chakrabarti	1c9ec74e3f	[Clang][OpenMP] Insert alloca for kernel args at function entry block instead of the launch point. If an inlined kernel is called in a loop, the launch point alloca would lead to increasing stack usage every time the kernel is invoked. This could make the application run out of stack space and crash. This problem is fixed by using the alloca insertion point while creating the alloca instruction. Fixes https://github.com/llvm/llvm-project/issues/60602 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D145820	2023-03-17 16:36:12 -04:00

5 Commits