llvm-project

Author	SHA1	Message	Date
CHANDRA GHALE	afbcf9529a	[OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (#134709 ) Codegen support for reduction over private variable with reduction clause. Section 7.6.10 in in OpenMP 6.0 spec. - An internal shared copy is initialized with an initializer value. - The shared copy is updated by combining its value with the values from the private copies created by the clause. - Once an encountering thread verifies that all updates are complete, its original list item is updated by merging its value with that of the shared copy and then broadcast to all threads. Sample Test Case from OpenMP 6.0 Example ``` #include <assert.h> #include <omp.h> #define N 10 void do_red(int n, int *v, int &sum_v) { sum_v = 0; // sum_v is private #pragma omp for reduction(original(private),+: sum_v) for (int i = 0; i < n; i++) { sum_v += v[i]; } } int main(void) { int v[N]; for (int i = 0; i < N; i++) v[i] = i; #pragma omp parallel num_threads(4) { int s_v; // s_v is private do_red(N, v, s_v); assert(s_v == 45); } return 0; } ``` Expected Codegen: ``` // A shared global/static variable is introduced for the reduction result. // This variable is initialized (e.g., using memset or a UDR initializer) // e.g., .omp.reduction.internal_private_var // Barrier before any thread performs combination call void @__kmpc_barrier(...) // Initialization block (executed by thread 0) // e.g., call void @llvm.memset.p0.i64(...) or call @udr_initializer(...) call void @__kmpc_critical(...) // Inside critical section: // Load the current value from the shared variable // Load the thread-local private variable's value // Perform the reduction operation // Store the result back to the shared variable call void @__kmpc_end_critical(...) // Barrier after all threads complete their combinations call void @__kmpc_barrier(...) // Broadcast phase: // Load the final result from the shared variable) // Store the final result to the original private variable in each thread // Final barrier after broadcast call void @__kmpc_barrier(...) ``` --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-06-11 14:01:31 +05:30
Nikita Popov	e2b536431d	[CodeGen] Move CodeGenPGO behind unique_ptr (NFC) (#142155 ) The InstrProf headers are very expensive. Avoid including them in all of CodeGen/ by moving the CodeGenPGO member behind a unqiue_ptr. This reduces clang build time by 0.8%.	2025-06-02 09:51:54 +02:00
Devon Loehr	63de20c0de	Reland "Add macro to suppress -Wunnecessary-virtual-specifier" (#141091 ) This fixes #139614 on non-clang compilers by moving `__has_warning` completely inside the `#if defined(__clang__)` block. This prevents a parse failure from compilers which don't recognize `__has_warning`. Original description: Followup to #138741. This adds the requested macro to silence `-Wunnecessary-virtual-specifier` when declaring virtual anchor functions in `final` classes, per [LLVM policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers). It also cleans up any remaining instances of the warning, allowing us to stop disabling it when we build LLVM.	2025-05-28 12:15:22 +02:00
Philip Reames	e4e7a7e64e	Revert "Add macro to suppress -Wunnecessary-virtual-specifier (#139614 )" This reverts commit 0954c9d487e7cb30673df9f0ac125f71320d2936. It breaks the build when built with gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04).	2025-05-21 11:31:26 -07:00
Devon Loehr	0954c9d487	Add macro to suppress -Wunnecessary-virtual-specifier (#139614 ) Followup to #138741. This adds the requested macro to silence `-Wunnecessary-virtual-specifier` when declaring virtual anchor functions in `final` classes, per [LLVM policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers). It also cleans up any remaining instances of the warning, allowing us to stop disabling it when we build LLVM.	2025-05-21 10:54:36 -07:00
Kazu Hirata	f002f300c5	[clang] Remove unused local variables (NFC) (#138453 )	2025-05-04 10:51:40 -07:00
Tom Eccles	7b70fc74d0	[mlir][OpenMP] Convert omp.cancel sections to LLVMIR (#137193 ) This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.	2025-04-29 17:19:40 +01:00
Nikita Popov	b384d6d6cc	[CodeGen] Don't include CGDebugInfo.h in CodeGenFunction.h (NFC) (#134100 ) This is an expensive header, only include it where needed. Move some functions out of line to achieve that. This reduces time to build clang by ~0.5% in terms of instructions retired.	2025-04-03 08:04:19 +02:00
Zahira Ammarguellat	cf69b4c668	[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#126927 ) This patch was reviewed and approved here: https://github.com/llvm/llvm-project/pull/119891 However it has been reverted here: `083df25dc2` due to a build issue here: https://lab.llvm.org/buildbot/#/builders/51/builds/10694 This patch is reintroducing the support.	2025-02-13 07:14:36 -05:00
Matt	a1826b4d26	[OpenMP][SIMD][FIX] Use conservative "omp simd ordered" lowering (#126172 ) A proposed fix for the issue #95611, [OpenMP][SIMD] ordered has no effect in a loop SIMD region as of LLVM 18.1.0 Changes: - Implement new lowering behavior: Conservatively serialize "omp simd" loops that have `omp simd ordered` directive to prevent incorrect vectorization (which results in incorrect execution behavior of the miscompiled program). Implementation outline: - We start with the optimistic default initial value of `LoopStack.setParallel(/Enable=/true);` in `CodeGenFunction::EmitOMPSimdInit(const OMPLoopDirective &D)`. - We only disable the loop parallel memory access assumption with `if (HasOrderedDirective) LoopStack.setParallel(/Enable=/false);` using the `HasOrderedDirective` (which tests for the presence of an `OMPOrderedDirective`). - This results in no longer incorrectly vectorizing the loop when the `omp simd ordered` directive is present. Motivation: We'd like to prevent incorrect vectorization of the loops marked with the `#pragma omp ordered simd` directive which has previously resulted in miscompiled code. At the same time, we'd like the usage outside of the `#pragma omp ordered simd` context to remain unaffected: Note that in the test "clang/test/OpenMP/ordered_codegen.cpp" we only "lose" the `!llvm.access.group` metadata in `foo_simd` alone. This is conservative, in that it's possible some of the loops would be possible to vectorize, but we prefer to avoid miscompilation of the loops that are currently illegal to vectorize. A concrete example follows: ```cpp // "test.c" #include <float.h> #include <math.h> #include <omp.h> #include <stdio.h> #include <stdlib.h> #include <time.h> int compare_float(float x1, float x2, float scalar) { const float diff = fabsf(x1 - x2); x1 = fabsf(x1); x2 = fabsf(x2); const float l = (x2 > x1) ? x2 : x1; if (diff <= l * scalar * FLT_EPSILON) return 1; else return 0; } #define ARRAY_SIZE 256 __attribute__((noinline)) void initialization_loop( float X[ARRAY_SIZE][ARRAY_SIZE], float Y[ARRAY_SIZE][ARRAY_SIZE]) { const float max = 1000.0; srand(time(NULL)); for (int r = 0; r < ARRAY_SIZE; r++) { for (int c = 0; c < ARRAY_SIZE; c++) { X[r][c] = ((float)rand() / (float)(RAND_MAX)) * max; Y[r][c] = X[r][c]; } } } __attribute__((noinline)) void omp_simd_loop(float X[ARRAY_SIZE][ARRAY_SIZE]) { for (int r = 1; r < ARRAY_SIZE; ++r) { for (int c = 1; c < ARRAY_SIZE; ++c) { #pragma omp simd for (int k = 2; k < ARRAY_SIZE; ++k) { #pragma omp ordered simd X[r][k] = X[r][k - 2] + sinf((float)(r / c)); } } } } __attribute__((noinline)) int comparison_loop(float X[ARRAY_SIZE][ARRAY_SIZE], float Y[ARRAY_SIZE][ARRAY_SIZE]) { int totalErrors_simd = 0; const float scalar = 1.0; for (int r = 1; r < ARRAY_SIZE; ++r) { for (int c = 1; c < ARRAY_SIZE; ++c) { for (int k = 2; k < ARRAY_SIZE; ++k) { Y[r][k] = Y[r][k - 2] + sinf((float)(r / c)); } } // check row for simd update for (int k = 0; k < ARRAY_SIZE; ++k) { if (!compare_float(X[r][k], Y[r][k], scalar)) { ++totalErrors_simd; } } } return totalErrors_simd; } int main(void) { float X[ARRAY_SIZE][ARRAY_SIZE]; float Y[ARRAY_SIZE][ARRAY_SIZE]; initialization_loop(X, Y); omp_simd_loop(X); const int totalErrors_simd = comparison_loop(X, Y); if (totalErrors_simd) { fprintf(stdout, "totalErrors_simd: %d \n", totalErrors_simd); fprintf(stdout, "%s : %d - FAIL: error in ordered simd computation.\n", __FILE__, __LINE__); } else { fprintf(stdout, "Success!\n"); } return totalErrors_simd; } ``` Before: ``` $ clang -fopenmp-simd -O3 -ffast-math -lm test.c -o test && ./test totalErrors_simd: 15408 test.c : 76 - FAIL: error in ordered simd computation. ``` clang 19.1.0: https://godbolt.org/z/6EvhxqEhe After: ``` $ clang -fopenmp-simd -O3 -ffast-math test.c -o test && ./test Success! ``` Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>	2025-02-12 08:53:47 -05:00
Kazu Hirata	67e1e98811	Revert "[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#119891 )" This reverts commit 070f84ebc89b11df616a83a56df9ac56efbab783. Buildbot failure: https://lab.llvm.org/buildbot/#/builders/51/builds/10694	2025-02-11 12:39:01 -08:00
Zahira Ammarguellat	070f84ebc8	[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#119891 ) Implement basic parsing and semantic support for `#pragma omp stripe` constuct introduced in https://www.openmp.org/wp-content/uploads/[OpenMP-API-Specification-6-0.pdf](https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-6-0.pdf), section 11.7.	2025-02-11 13:58:21 -05:00
Alexey Bataev	3041dd5c20	Revert "[OpenMP][SIMD][FIX] Use conservative "omp simd ordered" lowering" (#126079 ) Reverts llvm/llvm-project#123867 to fix the test failures https://lab.llvm.org/buildbot/#/builders/144/builds/17521	2025-02-06 10:04:11 -05:00
Matt	60d8e6f528	[OpenMP][SIMD][FIX] Use conservative "omp simd ordered" lowering (#123867 ) A proposed fix for #95611 [OpenMP][SIMD] ordered has no effect in a loop SIMD region as of LLVM 18.1.0 Changes: - Implement new lowering behavior: Conservatively serialize "omp simd" loops that have `omp simd ordered` directive to prevent incorrect vectorization (which results in incorrect execution behavior of the miscompiled program). Implementation outline: - We start with the optimistic default initial value of `LoopStack.setParallel(/Enable=/true);` in `CodeGenFunction::EmitOMPSimdInit(const OMPLoopDirective &D)`. - We only disable the loop parallel memory access assumption with `if (HasOrderedDirective) LoopStack.setParallel(/Enable=/false);` using the `HasOrderedDirective` (which tests for the presence of an `OMPOrderedDirective`). - This results in no longer incorrectly vectorizing the loop when the `omp simd ordered` directive is present. Motivation: We'd like to prevent incorrect vectorization of the loops marked with the `#pragma omp ordered simd` directive which has previously resulted in miscompiled code. At the same time, we'd like the usage outside of the `#pragma omp ordered simd` context to remain unaffected: Note that in the test "clang/test/OpenMP/ordered_codegen.cpp" we only "lose" the `!llvm.access.group` metadata in `foo_simd` alone. This is conservative, in that it's possible some of the loops would be possible to vectorize, but we prefer to avoid miscompilation of the loops that are currently illegal to vectorize. A concrete example follows: ```cpp // "test.c" #include <float.h> #include <math.h> #include <omp.h> #include <stdio.h> #include <stdlib.h> #include <time.h> int compare_float(float x1, float x2, float scalar) { const float diff = fabsf(x1 - x2); x1 = fabsf(x1); x2 = fabsf(x2); const float l = (x2 > x1) ? x2 : x1; if (diff <= l * scalar * FLT_EPSILON) return 1; else return 0; } #define ARRAY_SIZE 256 __attribute__((noinline)) void initialization_loop( float X[ARRAY_SIZE][ARRAY_SIZE], float Y[ARRAY_SIZE][ARRAY_SIZE]) { const float max = 1000.0; srand(time(NULL)); for (int r = 0; r < ARRAY_SIZE; r++) { for (int c = 0; c < ARRAY_SIZE; c++) { X[r][c] = ((float)rand() / (float)(RAND_MAX)) * max; Y[r][c] = X[r][c]; } } } __attribute__((noinline)) void omp_simd_loop(float X[ARRAY_SIZE][ARRAY_SIZE]) { for (int r = 1; r < ARRAY_SIZE; ++r) { for (int c = 1; c < ARRAY_SIZE; ++c) { #pragma omp simd for (int k = 2; k < ARRAY_SIZE; ++k) { #pragma omp ordered simd X[r][k] = X[r][k - 2] + sinf((float)(r / c)); } } } } __attribute__((noinline)) int comparison_loop(float X[ARRAY_SIZE][ARRAY_SIZE], float Y[ARRAY_SIZE][ARRAY_SIZE]) { int totalErrors_simd = 0; const float scalar = 1.0; for (int r = 1; r < ARRAY_SIZE; ++r) { for (int c = 1; c < ARRAY_SIZE; ++c) { for (int k = 2; k < ARRAY_SIZE; ++k) { Y[r][k] = Y[r][k - 2] + sinf((float)(r / c)); } } // check row for simd update for (int k = 0; k < ARRAY_SIZE; ++k) { if (!compare_float(X[r][k], Y[r][k], scalar)) { ++totalErrors_simd; } } } return totalErrors_simd; } int main(void) { float X[ARRAY_SIZE][ARRAY_SIZE]; float Y[ARRAY_SIZE][ARRAY_SIZE]; initialization_loop(X, Y); omp_simd_loop(X); const int totalErrors_simd = comparison_loop(X, Y); if (totalErrors_simd) { fprintf(stdout, "totalErrors_simd: %d \n", totalErrors_simd); fprintf(stdout, "%s : %d - FAIL: error in ordered simd computation.\n", __FILE__, __LINE__); } else { fprintf(stdout, "Success!\n"); } return totalErrors_simd; } ``` Before: ``` $ clang -fopenmp-simd -O3 -ffast-math -lm test.c -o test && ./test totalErrors_simd: 15408 test.c : 76 - FAIL: error in ordered simd computation. ``` clang 19.1.0: https://godbolt.org/z/6EvhxqEhe After: ``` $ clang -fopenmp-simd -O3 -ffast-math test.c -o test && ./test Success! ``` Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>	2025-02-06 09:44:11 -05:00
CHANDRA GHALE	30f9a4f754	[OpenMP] codegen support for masked combined construct parallel masked taskloop simd. (#121746 ) Added codegen support for combined masked constructs `Parallel masked taskloop simd`. Added implementation for `EmitOMPParallelMaskedTaskLoopSimdDirective`. Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-01-14 18:26:46 +05:30
CHANDRA GHALE	6f558e0e12	[OpenMP] codegen support for masked combined construct masked taskloop (#121914 ) Added codegen support for combined masked constructs `masked taskloop.` Added implementation for `EmitOMPMaskedTaskLoopDirective`. --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-01-13 11:42:13 +05:30
CHANDRA GHALE	1d2eea962a	[OpenMP] codegen support for masked combined construct masked taskloop simd (#121916 ) Added codegen support for combined masked constructs `masked taskloop simd`. Added implementation for `EmitOMPMaskedTaskLoopSimdDirective`. Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-01-12 23:38:00 +05:30
CHANDRA GHALE	aedb30fdc7	[OpenMP] codegen support for masked combined construct parallel masked taskloop (#121741 ) Added codegen support for combined masked constructs Parallel masked taskloop. Added implementation for EmitOMPParallelMaskedTaskLoopDirective. --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-01-09 16:38:36 +05:30
Sergio Afonso	b79ed8729b	[OpenMP][OMPIRBuilder] Handle non-failing calls properly (#115863 ) The preprocessor definition used to enable asserts and the one that `llvm::Error` and `llvm::Expected` use to ensure all created instances are checked are not the same. By making these checks inside of an `assert` in cases where errors are not expected, certain build configurations would trigger runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_UNREACHABLE_OPTIMIZE=ON`). The `llvm::cantFail()` function, which was intended for this use case, is used by this patch in place of `assert` to prevent these runtime failures. In tests, new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and `EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release builds.	2025-01-09 10:28:16 +00:00
CHANDRA GHALE	76e6c8d3fc	Codegen changes for strict modifier with grainsize/num_tasks of taskloop construct (#117196 ) Initial parsing/sema for 'strict' modifier with 'num_tasks' and ‘grainsize’ clause is present in these commits [grainsize_parsing](`ab9eac762c`) and [num_tasks_parsing](`56c1660170 (diff-4184486638e85284c3a2c961a81e7752231022daf97e411007c13a6732b50db9R6545)`) . However, this implementation appears incomplete as it lacks code generation support. A runtime patch was introduced in this runtime commit [runtime_patch](`540007b427 (diff-5e95f9319910d6965d09c301359dbe6b23f3eef5ce4d262ef2c2d2137875b5c4R374)`) , which adds a new API, _kmpc_taskloop_5, to accommodate the strict modifier. In this patch I have added codegen support. When the strict modifier is present alongside the grainsize or num_tasks clauses of taskloop construct, the code now emits a call to _kmpc_taskloop_5, which includes an additional parameter of type i32 with the value 1 to indicate the strict modifier. If the strict modifier is not present, it falls back to the existing _kmpc_taskloop API call. --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2024-11-28 14:18:59 +05:30
CHANDRA GHALE	44a6b3a4b6	Fix for codegen Crash in Clang when using locator omp_all_memory with depobj construct (#114221 ) A codegen crash is occurring when a depend object was initialized with omp_all_memory in the depobj directive. https://github.com/llvm/llvm-project/issues/114214(url) The root cause of issue looks to be the improper handling of the dependency list when omp_all_memory was specified. The change introduces the use of OMPTaskDataTy to manage dependencies. The buildDependences function is called to construct the dependency list, and the list is iterated over to emit and store the dependencies. Reduced Test Case : ``` #include <omp.h> int main() { omp_depend_t obj; #pragma omp depobj(obj) depend(inout: omp_all_memory) } ``` ``` #1 0x0000000003de6623 SignalHandler(int) Signals.cpp:0:0 #2 0x00007f8e4a6b990f (/lib64/libpthread.so.0+0x1690f) #3 0x00007f8e4a117d2a raise (/lib64/libc.so.6+0x4ad2a) #4 0x00007f8e4a1193e4 abort (/lib64/libc.so.6+0x4c3e4) #5 0x00007f8e4a10fc69 __assert_fail_base (/lib64/libc.so.6+0x42c69) #6 0x00007f8e4a10fcf1 __assert_fail (/lib64/libc.so.6+0x42cf1) #7 0x0000000004114367 clang::CodeGen::CodeGenFunction::EmitOMPDepobjDirective(clang::OMPDepobjDirective const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4114367) #8 0x00000000040f8fac clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const, llvm::ArrayRef<clang::Attr const>) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x40f8fac) #9 0x00000000040ff4fb clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x40ff4fb) #10 0x00000000041847b2 clang::CodeGen::CodeGenFunction::EmitFunctionBody(clang::Stmt const) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41847b2) #11 0x0000000004199e4a clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function, clang::CodeGen::CGFunctionInfo const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4199e4a) #12 0x00000000041f7b9d clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41f7b9d) #13 0x00000000041f16a3 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41f16a3) #14 0x00000000041fd954 clang::CodeGen::CodeGenModule::EmitDeferred() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41fd954) #15 0x0000000004200277 clang::CodeGen::CodeGenModule::Release() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4200277) #16 0x00000000046b6a49 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0 #17 0x00000000046b4cb6 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x46b4cb6) #18 0x0000000006204d5c clang::ParseAST(clang::Sema&, bool, bool) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x6204d5c) #19 0x000000000496b278 clang::FrontendAction::Execute() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x496b278) #20 0x00000000048dd074 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x48dd074) #21 0x0000000004a38092 clang::ExecuteCompilerInvocation(clang::CompilerInstance) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4a38092) #22 0x0000000000fd4e9c cc1_main(llvm::ArrayRef<char const>, char const, void) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xfd4e9c) #23 0x0000000000fcca73 ExecuteCC1Tool(llvm::SmallVectorImpl<char const>&, llvm::ToolContext const&) driver.cpp:0:0 #24 0x0000000000fd140c clang_main(int, char*, llvm::ToolContext const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xfd140c) #25 0x0000000000ee2ef3 main (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xee2ef3) #26 0x00007f8e4a10224c __libc_start_main (/lib64/libc.so.6+0x3524c) #27 0x0000000000fcaae9 _start /home/abuild/rpmbuild/BUILD/glibc-2.31/csu/../sysdeps/x86_64/start.S:120:0 clang: error: unable to execute command: Aborted ``` --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2024-11-11 14:34:16 +05:30
Sergio Afonso	d87964de78	[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533 ) This patch implements an approach to communicate errors between the OMPIRBuilder and its users. It introduces `llvm::Error` and `llvm::Expected` objects to replace the values returned by callbacks passed to `OMPIRBuilder` codegen functions. These functions then check the result for errors when callbacks are called and forward them back to the caller, which has the flexibility to recover, exit cleanly or dump a stack trace. This prevents a failed callback to leave the IR in an invalid state and still continue the codegen process, triggering unrelated assertions or segmentation faults. In the case of MLIR to LLVM IR translation of the 'omp' dialect, this change results in the compiler emitting errors and exiting early instead of triggering a crash for not-yet-implemented errors. The behavior in Clang and openmp-opt stays unchanged, since callbacks will continue always returning 'success'.	2024-10-25 11:30:16 +01:00
Jay Foad	4dd55c567a	[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399 ) Follow up to #109133.	2024-10-24 10:23:40 +01:00
Congcong Cai	eca5949031	[codegen][NFC] add static mark for internal usage variable and function (#109431 ) Detect by clang-tidy misc-use-internal-linkage	2024-09-24 07:25:07 +08:00
David Pagan	d7c69c20a7	[clang][OpenMP] Add codegen for scope directive (#109197 ) Added codegen for scope directive, enabled allocate and firstprivate clauses, and added scope directive LIT test. Testing - LIT tests (including new scope test). - OpenMP scope example test from 5.2 OpenMP API examples document. - Three executable scope tests from OpenMP_VV/sollve_vv suite.	2024-09-19 13:17:24 -07:00
Shilei Tian	1c269929d0	[Clang][Sema][OpenMP] Allow `thread_limit` to accept multiple expressions (#102715 )	2024-08-10 09:54:58 -04:00
Shilei Tian	cee594cf36	[Clang][Sema][OpenMP] Allow `num_teams` to accept multiple expressions (#99732 ) By the OpenMP standard, `num_teams` clause can only accept one expression (for now). In this patch, we extend it to allow to accept multiple expressions when it is used with `target teams ompx_bare` construct. This will allow to launch a multi-dim grid, same as CUDA/HIP.	2024-08-06 10:55:15 -04:00
Julian Brown	a42e515e3a	[OpenMP] OpenMP 5.1 "assume" directive parsing support (#92731 ) This is a minimal patch to support parsing for "omp assume" directives. These are meant to be hints to a compiler's optimisers: as such, it is legitimate (if not very useful) to ignore them. The patch builds on top of the existing support for "omp assumes" directives (note spelling!). Unlike the "omp [begin/end] assumes" directives, "omp assume" is associated with a compound statement, i.e. it can appear within a function. The "holds" assumption could (theoretically) be mapped onto the existing builtin "__builtin_assume", though the latter applies to a single point in the program, and the former to a range (i.e. the whole of the associated compound statement). This patch fixes sollve's OpenMP 5.1 "omp assume"-based tests.	2024-08-05 07:37:07 -04:00
Krzysztof Parzyszek	243b27f7e1	[clang][OpenMP] Rename `varlists` to `varlist`, NFC (#101058 ) It returns a range of variables (via Expr*), not a range of lists.	2024-07-30 08:11:09 -05:00
Matt Arsenault	e108853ac8	clang: Allow targets to set custom metadata on atomics (#96906 ) Use this to replace the emission of the amdgpu-unsafe-fp-atomics attribute in favor of per-instruction metadata. In the future new fine grained controls should be introduced that also cover the integer cases. Add a wrapper around CreateAtomicRMW that appends the metadata, and update a few use contexts to use it.	2024-07-26 09:57:28 +04:00
Johannes Doerfert	3c8efd7928	[OpenMP] Ensure the actual kernel is annotated with launch bounds (#99927 ) In debug mode there is a wrapper (the kernel) around the function in which we generate the kernel code. We worked around this before to get the correct kernel name, but now we really distinguish both to attach the launch bounds to the kernel, not the inner function.	2024-07-23 09:02:47 -07:00
Krzysztof Parzyszek	c74730070a	[clang][OpenMP] Move "loop" directive mapping from sema to codegen (#99905 ) Given "loop" construct, clang will try to treat it as "for", "distribute" or "simd", depending on either the implied binding, or the bind clause if present. This patch moves the code that performs this construct remapping from sema to codegen. For a "loop" construct without a bind clause, this patch will create an implicit bind clause based on implied binding to simplify further analysis. During codegen the function `EmitOMPGenericLoopDirective` (i.e. "loop") will invoke the "emit" functions for "for", "distribute" or "simd", depending on the bind clause. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>	2024-07-23 07:31:42 -05:00
Michael Kruse	5c93a94f5a	[Clang][OpenMP] Add interchange directive (#93022 ) Add the interchange directive which will be introduced in the upcoming OpenMP 6.0 specification. A preview has been published in [Technical Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).	2024-07-19 09:24:40 +02:00
Michael Kruse	80865c01e1	[Clang][OpenMP] Add reverse directive (#92916 ) Add the reverse directive which will be introduced in the upcoming OpenMP 6.0 specification. A preview has been published in [Technical Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf). --------- Co-authored-by: Alexey Bataev <a.bataev@outlook.com>	2024-07-18 10:35:32 +02:00
Joseph Huber	5ef4e6db96	[OpenMP] Correctly code-gen default atomic mem order (#97663 ) Summary: The parsing for this was implemented, but we never hooked up the default value to the result of this clause. This patch adds the support by making it default to the requires directive.	2024-07-08 10:47:22 -07:00
Gheorghe-Teodor Bercea	1a478a69bc	[OpenMP][offload] Fix dynamic schedule tracking (#97065 ) This patch fixes the dynamic schedule tracking.	2024-07-01 10:23:11 -04:00
Julian Brown	6ba764a54e	[OpenMP] [NFC] SemaOpenMP.cpp and StmtOpenMP.cpp spelling fixes (#96814 ) This patch just fixes a few spelling mistakes in the above two files. (I changed one British spelling to American -- analyse to analyze -- because the latter spelling is used elsewhere in file, and it's probably best to be consistent.)	2024-06-28 12:36:20 -05:00
Pavel Samolysov	69e9e779b7	[clang] Replace X && isa<Y>(X) with isa_and_nonnull<Y>(X). NFC (#94987 ) This addresses a clang-tidy suggestion.	2024-06-11 05:30:50 +03:00
Michael Kruse	9120562dfc	[Clang][OpenMP] Enable tile/unroll on iterator- and foreach-loops (#91459 ) OpenMP loop transformation did not work on a for-loop using an iterator or range-based for-loops. The first reason is that it combined the iterator's type for generated loops with the type of `NumIterations` as generated for any `OMPLoopBasedDirective` which is an integer. Fixed by basing all generated loop variables on `NumIterations`. Second, C++11 range-based for-loops include syntactic sugar that needs to be executed before the loop. This additional code is now added to the construct's Pre-Init lists. Third, C++20 added an initializer statement to range-based for-loops which is also added to the pre-init statement. PreInits used to be a `DeclStmt` which made it difficult to add arbitrary statements from `CXXRangeForStmt`'s syntactic sugar, especially the for-loops init statement which does not need to be a declaration. Change it to be a general `Stmt` that can be a `CompoundStmt` to hold arbitrary Stmts, including DeclStmts. This also avoids the `PointerUnion` workaround used by `checkTransformableLoopNest`. End-to-end tests are added to verify the expected number and order of loop execution and evaluations of expressions (such as iterator dereference). The order and number of evaluations of expressions in canonical loops is explicitly undefined by OpenMP but checked here for clarification and for changes to be noticed.	2024-05-22 14:30:31 +02:00
Ahmed Bougacha	3575d23ca8	[clang][CodeGen] Remove unused LValue::getAddress CGF arg. (#92465 ) This is in effect a revert of f139ae3d93797, as we have since gained a more sophisticated way of doing extra IRGen with the addition of RawAddress in #86923.	2024-05-20 10:23:04 -07:00
Erich Keane	39adc8f423	[NFC] Generalize ArraySections to work for OpenACC in the future (#89639 ) OpenACC is going to need an array sections implementation that is a simpler version/more restrictive version of the OpenMP version. This patch moves `OMPArraySectionExpr` to `Expr.h` and renames it `ArraySectionExpr`, then adds an enum to choose between the two. This also fixes a couple of 'drive-by' issues that I discovered on the way, but leaves the OpenACC Sema parts reasonably unimplemented (no semantic analysis implementation), as that will be a followup patch.	2024-04-25 10:22:03 -07:00
Jie Fu	6ef4450705	[clang] Fix -Wunused-function in CGStmtOpenMP.cpp (NFC) llvm-project/clang/lib/CodeGen/CGStmtOpenMP.cpp:7959:13: error: unused function 'emitTargetTeamsLoopCodegenStatus' [-Werror,-Wunused-function] static void emitTargetTeamsLoopCodegenStatus(CodeGenFunction &CGF, ^ 1 error generated.	2024-04-11 07:37:12 +08:00
David Pagan	a12836647e	[OpenMP][CodeGen] Improved codegen for combined loop directives (#87278 ) IR for 'target teams loop' is now dependent on suitability of associated loop-nest. If a loop-nest: - does not contain a function call, or - the -fopenmp-assume-no-nested-parallelism has been specified, - or the call is to an OpenMP API AND - does not contain nested loop bind(parallel) directives then it can be emitted as 'target teams distribute parallel for', which is the current default. Otherwise, it is emitted as 'target teams distribute'. Added debug output indicating how 'target teams loop' was emitted. Flag is -mllvm -debug-only=target-teams-loop-codegen Added LIT tests explicitly verifying 'target teams loop' emitted as a parallel loop and a distribute loop. Updated other 'loop' related tests as needed to reflect change in IR. - These updates account for most of the changed files and additions/deletions.	2024-04-10 13:09:17 -07:00
Akira Hatanaka	84780af4b0	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was reverted because it broke ubsan bots. There seems to be a bug in coroutine code-gen, which is causing EmitTypeCheck to use the wrong alignment. For now, pass alignment zero to EmitTypeCheck so that it can compute the correct alignment based on the passed type (see function EmitCXXMemberOrOperatorMemberCallExpr).	2024-03-28 06:54:36 -07:00
Akira Hatanaka	f75eebab88	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 )" (#86898 ) This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c. The commit broke ubsan bots.	2024-03-27 18:14:04 -07:00
Akira Hatanaka	d9a685a9dd	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit broke msan bots because LValue::IsKnownNonNull was uninitialized.	2024-03-27 12:24:49 -07:00
Akira Hatanaka	b311756450	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 )" (#86674 ) This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6. It appears that the commit broke msan bots.	2024-03-26 07:37:57 -07:00
Akira Hatanaka	8bd1f9116a	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects.	2024-03-25 18:05:42 -07:00
Stephen Tozer	ffd08c7759	[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216 ) This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, except for: - llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated. - llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums. Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown: ``` DPValue -> DbgVariableRecord DPVal -> DbgVarRec DPV -> DVR ``` Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.	2024-03-19 20:07:07 +00:00
Orlando Cazalet-Hyams	3e6db60291	[RemoveDIs] Update Clang front end to handle DbgRecords (#84756 ) This patch fixes problems that pop up when clang emits DbgRecords instead of debug intrinsics. Note: this doesn't mean clang is emitting DbgRecords yet, because the modules it creates are still always in the old debug mode. That will come in a future patch. Depends on #84739	2024-03-18 10:55:29 +00:00

1 2 3 4 5 ...

788 Commits