Shilei Tian e21202dac1 [Clang][OpenMP] Fix the issue that llvm.lifetime.end is emitted too early for variables captured in linear clause
Currently if an OpenMP program uses `linear` clause, and is compiled with
optimization, `llvm.lifetime.end` for variables listed in `linear` clause are
emitted too early such that there could still be uses after that. Let's take the
following code as example:
```
// loop.c
int j;
int *u;

void loop(int n) {
  int i;
  for (i = 0; i < n; ++i) {
    ++j;
    u = &j;
  }
}
```
We compile using the command:
```
clang -cc1 -fopenmp-simd -O3 -x c -triple x86_64-apple-darwin10 -emit-llvm loop.c -o loop.ll
```
The following IR (simplified) will be generated:
```
@j = local_unnamed_addr global i32 0, align 4
@u = local_unnamed_addr global ptr null, align 8

define void @loop(i32 noundef %n) local_unnamed_addr {
entry:
  %j = alloca i32, align 4
  %cmp = icmp sgt i32 %n, 0
  br i1 %cmp, label %simd.if.then, label %simd.if.end

simd.if.then:                                     ; preds = %entry
  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %j)
  store ptr %j, ptr @u, align 8
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
  %0 = load i32, ptr %j, align 4
  store i32 %0, ptr @j, align 4
  br label %simd.if.end

simd.if.end:                                      ; preds = %simd.if.then, %entry
  ret void
}
```
The most important part is:
```
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
  %0 = load i32, ptr %j, align 4
  store i32 %0, ptr @j, align 4
```
`%j` is still loaded after `@llvm.lifetime.end.p0(i64 4, ptr nonnull %j)`. This
could cause the backend incorrectly optimizes the code and further generates
incorrect code. The root cause is, when we emit a construct that could have
`linear` clause, it usually has the following pattern:
```
EmitOMPLinearClauseInit(S)
{
  OMPPrivateScope LoopScope(*this);
  ...
  EmitOMPLinearClause(S, LoopScope);
  ...
  (void)LoopScope.Privatize();
  ...
}
EmitOMPLinearClauseFinal(S, [](CodeGenFunction &) { return nullptr; });
```
Variables that need to be privatized are added into `LoopScope`, which also
serves as a RAII object. When `LoopScope` is destructed and if optimization is
enabled, a `@llvm.lifetime.end` is also emitted for each privatized variable.
However, the writing back to original variables in `linear` clause happens after
the scope in `EmitOMPLinearClauseFinal`, causing the issue we see above.

A quick "fix" seems to be, moving `EmitOMPLinearClauseFinal` inside the scope.
However, it doesn't work. That's because the local variable map has been updated
by `LoopScope` such that a variable declaration is mapped to the privatized
variable, instead of the actual one. In that way, the following code will be
generated:
```
  %0 = load i32, ptr %j, align 4
  store i32 %0, ptr %j, align 4
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
```
Well, now the life time is correct, but apparently the writing back is broken.

In this patch, a new function `OMPPrivateScope::restoreMap` is added and called
before calling `EmitOMPLinearClauseFinal`. This can make sure that
`EmitOMPLinearClauseFinal` can find the orignal varaibls to write back.

Fixes #56913.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D131272
2022-08-06 16:50:37 -04:00

33 lines
1.5 KiB
C

// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature --include-generated-funcs --prefix-filecheck-ir-name _
// RUN: %clang_cc1 -fopenmp-simd -O1 -x c -triple x86_64-apple-darwin10 -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK
int j;
int *u;
void loop(int n) {
int i;
#pragma omp parallel master taskloop simd linear(j)
for (i = 0; i < n; ++i) {
++j;
u = &j;
}
}
// CHECK-LABEL: define {{[^@]+}}@loop
// CHECK-SAME: (i32 noundef [[N:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: [[J:%.*]] = alloca i32, align 4
// CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[N]], 0
// CHECK-NEXT: br i1 [[CMP]], label [[SIMD_IF_THEN:%.*]], label [[SIMD_IF_END:%.*]]
// CHECK: simd.if.then:
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr @j, align 4, !tbaa [[TBAA2:![0-9]+]]
// CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 4, ptr nonnull [[J]]) #[[ATTR2:[0-9]+]]
// CHECK-NEXT: store ptr [[J]], ptr @u, align 8, !tbaa [[TBAA6:![0-9]+]], !llvm.access.group [[ACC_GRP8:![0-9]+]]
// CHECK-NEXT: [[INC_LE:%.*]] = add i32 [[TMP0]], [[N]]
// CHECK-NEXT: store i32 [[INC_LE]], ptr [[J]], align 4, !tbaa [[TBAA2]]
// CHECK-NEXT: store i32 [[INC_LE]], ptr @j, align 4, !tbaa [[TBAA2]]
// CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 4, ptr nonnull [[J]]) #[[ATTR2]]
// CHECK-NEXT: br label [[SIMD_IF_END]]
// CHECK: simd.if.end:
// CHECK-NEXT: ret void
//