326 Commits

Author SHA1 Message Date
Hans
9d1cb18d19
[Coroutines] Ignore instructions more aggressively in addMustTailToCoroResumes() (#85271)
The old code used isInstructionTriviallyDead() and removed instructions
when walking the path from a resume call to function return to check if
the call is in tail position.

However, since the code was walking forwards it was not able to get past
instructions such as:

  %gep = getelementptr inbounds i64, ptr %alloc.var, i32 0
  %foo = ptrtoint ptr %gep to i64

This patch instead ignores such instructions as long as their values are
not needed. This enables the code to emit tail calls in more situations.
2024-03-20 14:51:45 +01:00
Hans Wennborg
f2d02ce04f [Coroutines] Remove some stale FIXMEs (NFC)
The calls are already musttail.
2024-03-14 17:55:20 +01:00
fpasserby
f786881340
[coroutine] Implement llvm.coro.await.suspend intrinsic (#79712)
Implement `llvm.coro.await.suspend` intrinsics, to deal with performance
regression after prohibiting `.await_suspend` inlining, as suggested in
#64945.
Actually, there are three new intrinsics, which directly correspond to
each of three forms of `await_suspend`:
```
void llvm.coro.await.suspend.void(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
i1 llvm.coro.await.suspend.bool(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
ptr llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
```
There are three different versions instead of one, because in `bool`
case it's result is used for resuming via a branch, and in
`coroutine_handle` case exceptions from `await_suspend` are handled in
the coroutine, and exceptions from the subsequent `.resume()` are
propagated to the caller.

Await-suspend block is simplified down to intrinsic calls only, for
example for symmetric transfer:
```
%id = call token @llvm.coro.save(ptr null)
%handle = call ptr @llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
call void @llvm.coro.resume(%handle)
%result = call i8 @llvm.coro.suspend(token %id, i1 false)
switch i8 %result, ...
```
All await-suspend logic is moved out into a wrapper function, generated
for each suspension point.
The signature of the function is `<type> wrapperFunction(ptr %awaiter,
ptr %frame)` where `<type>` is one of `void` `i1` or `ptr`, depending on
the return type of `await_suspend`.
Intrinsic calls are lowered during `CoroSplit` pass, right after the
split.

Because I'm new to LLVM, I'm not sure if the helper function generation,
calls to them and lowering are implemented in the right way, especially
with regard to various metadata and attributes, i. e. for TBAA. All
things that seemed questionable are marked with `FIXME` comments.

There is another detail: in case of symmetric transfer raw pointer to
the frame of coroutine, that should be resumed, is returned from the
helper function and a direct call to `@llvm.coro.resume` is generated.
C++ standard demands, that `.resume()` method is evaluated. Not sure how
important is this, because code has been generated in the same way
before, sans helper function.
2024-03-11 10:00:00 +08:00
Stephen Tozer
85dc3dfb1f [DebugInfo][RemoveDIs] Fix incorrect test expect
Fixes: aadd7650447b
The above commit landed with an incorrect test expect, missing a `metadata`
prefix. This patch adds the expected prefix to the test.
2024-02-29 14:46:52 +00:00
Stephen Tozer
aadd765044
[DebugInfo][RemoveDIs] Prevent duplicate DPValues from being returned by findDbgIntrinsics (#82764)
Fixes the error described here:
a93a4ec7dd (commitcomment-138965199)

The function `findDbgIntrinsics` is used to return a list of debug
intrinsics and DPValues that use a given value, with the intent that no
duplicates are returned in either list. For DPValues, we've guarded
against DPValues that use a value multiple times as part of a DIArgList,
but we have not guarded against DPValues that use a value multiple times
as separate operands (currently only possible for `dbg_assign`s,
something I missed in my implementation of that type!). This patch adds
a guard, and also updates a test to cover this case.
2024-02-29 14:37:06 +00:00
Mogball
2e29c91b96 Revert "[Coro] [async] Disable inlining in async coroutine splitting (#80904)"
This reverts commit b1ac052ab07ea091c90c2b7c89445b2bfcfa42ab.

This commit breaks coroutine splitting for non-swift calling convention
functions. In this example:

```ll
; ModuleID = 'repro.ll'
source_filename = "stdlib/test/runtime/test_llcl.mojo"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@0 = internal constant { i32, i32 } { i32 trunc (i64 sub (i64 ptrtoint (ptr @crash to i64), i64 ptrtoint (ptr getelementptr inbounds ({ i32, i32 }, ptr @0, i32 0, i32 1) to i64)) to i32), i32 64 }

define dso_local void @af_suspend_fn(ptr %0, i64 %1, ptr %2) #0 {
  ret void
}

define dso_local void @crash(ptr %0) #0 {
  %2 = call token @llvm.coro.id.async(i32 64, i32 8, i32 0, ptr @0)
  %3 = call ptr @llvm.coro.begin(token %2, ptr null)
  %4 = getelementptr inbounds { ptr, { ptr, ptr }, i64, { ptr, i1 }, i64, i64 }, ptr poison, i32 0, i32 0
  %5 = call ptr @llvm.coro.async.resume()
  store ptr %5, ptr %4, align 8
  %6 = call { ptr, ptr, ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0p0p0s(i32 0, ptr %5, ptr @ctxt_proj_fn, ptr @af_suspend_fn, ptr poison, i64 -1, ptr poison)
  ret void
}

define dso_local ptr @ctxt_proj_fn(ptr %0) #0 {
  ret ptr %0
}

; Function Attrs: nomerge nounwind
declare { ptr, ptr, ptr } @llvm.coro.suspend.async.sl_p0p0p0s(i32, ptr, ptr, ...) #1

; Function Attrs: nounwind
declare token @llvm.coro.id.async(i32, i32, i32, ptr) #2

; Function Attrs: nounwind
declare ptr @llvm.coro.begin(token, ptr writeonly) #2

; Function Attrs: nomerge nounwind
declare ptr @llvm.coro.async.resume() #1

attributes #0 = { "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+invpcid,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+pku,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+vaes,+vpclmulqdq,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
attributes #1 = { nomerge nounwind }
attributes #2 = { nounwind }
```

This verifier crashes after the `coro-split` pass with

```
cannot guarantee tail call due to mismatched parameter counts
  musttail call void @af_suspend_fn(ptr poison, i64 -1, ptr poison)
LLVM ERROR: Broken function
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: opt ../../../reduced.ll -O0
 #0 0x00007f1d89645c0e __interceptor_backtrace.part.0 /build/gcc-11-XeT9lY/gcc-11-11.4.0/build/x86_64-linux-gnu/libsanitizer/asan/../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4193:28
 #1 0x0000556d94d254f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22
 #2 0x0000556d94d19a2f llvm::sys::RunSignalHandlers() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Signals.cpp:105:20
 #3 0x0000556d94d1aa42 SignalHandler(int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:371:36
 #4 0x00007f1d88e42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007f1d88e969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #6 0x00007f1d88e969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #7 0x00007f1d88e969fc pthread_kill ./nptl/pthread_kill.c:89:10
 #8 0x00007f1d88e42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #9 0x00007f1d88e287f3 abort ./stdlib/abort.c:81:7
 #10 0x0000556d8944be01 std::vector<llvm::json::Value, std::allocator<llvm::json::Value>>::size() const /usr/include/c++/11/bits/stl_vector.h:919:40
 #11 0x0000556d8944be01 bool std::operator==<llvm::json::Value, std::allocator<llvm::json::Value>>(std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&, std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&) /usr/include/c++/11/bits/stl_vector.h:1893:23
 #12 0x0000556d8944be01 llvm::json::operator==(llvm::json::Array const&, llvm::json::Array const&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Support/JSON.h:572:69
 #13 0x0000556d8944be01 llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/JSON.cpp:204:28
 #14 0x0000556d949ed2bd llvm::report_fatal_error(char const*, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/ErrorHandling.cpp:82:70
 #15 0x0000556d8e37e876 llvm::SmallVectorBase<unsigned int>::size() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:91:32
 #16 0x0000556d8e37e876 llvm::SmallVectorTemplateCommon<llvm::DiagnosticInfoOptimizationBase::Argument, void>::end() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:282:41
 #17 0x0000556d8e37e876 llvm::SmallVector<llvm::DiagnosticInfoOptimizationBase::Argument, 4u>::~SmallVector() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1215:24
 #18 0x0000556d8e37e876 llvm::DiagnosticInfoOptimizationBase::~DiagnosticInfoOptimizationBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:413:7
 #19 0x0000556d8e37e876 llvm::DiagnosticInfoIROptimization::~DiagnosticInfoIROptimization() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:622:7
 #20 0x0000556d8e37e876 llvm::OptimizationRemark::~OptimizationRemark() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:689:7
 #21 0x0000556d8e37e876 operator() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2213:14
 #22 0x0000556d8e37e876 emit<llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::CGSCCAnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&)::<lambda()> > /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h:83:12
 #23 0x0000556d8e37e876 llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2212:13
 #24 0x0000556d8c36ecb1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::CoroSplitPass, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #25 0x0000556d91c1a84f llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:90:12
 #26 0x0000556d8c3690d1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #27 0x0000556d91c2162d llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:278:18
 #28 0x0000556d8c369035 llvm::detail::PassModel<llvm::Module, llvm::ModuleToPostOrderCGSCCPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #29 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20
 #30 0x0000556d8e30979e llvm::CoroConditionalWrapper::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroConditionalWrapper.cpp:19:74
 #31 0x0000556d8c365755 llvm::detail::PassModel<llvm::Module, llvm::CoroConditionalWrapper, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #32 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20
 #33 0x0000556d89818556 llvm::SmallPtrSetImplBase::isSmall() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33
 #34 0x0000556d89818556 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:17
 #35 0x0000556d89818556 llvm::SmallPtrSetImpl<llvm::AnalysisKey*>::~SmallPtrSetImpl() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:321:7
 #36 0x0000556d89818556 llvm::SmallPtrSet<llvm::AnalysisKey*, 2u>::~SmallPtrSet() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:427:7
 #37 0x0000556d89818556 llvm::PreservedAnalyses::~PreservedAnalyses() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/Analysis.h:109:7
 #38 0x0000556d89818556 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/NewPMDriver.cpp:532:10
 #39 0x0000556d897e3939 optMain /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/optdriver.cpp:737:27
 #40 0x0000556d89455461 main /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/opt.cpp:25:33
 #41 0x00007f1d88e29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
 #42 0x00007f1d88e29e40 call_init ./csu/../csu/libc-start.c:128:20
 #43 0x00007f1d88e29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
 #44 0x0000556d897b6335 _start (/home/ubuntu/modular/.derived/third-party/llvm-project/build-relwithdebinfo-asan/bin/opt+0x150c335)
Aborted (core dumped)
2024-02-21 16:35:07 +00:00
Yuta Saito
8c5c4d9a63
[Coro][WebAssembly] Add tail-call check for async lowering (#81481)
This patch fixes a verifier error when async lowering is used for
WebAssembly target without tail-call feature. This missing check was
revealed by b1ac052ab07ea091c90c2b7c89445b2bfcfa42ab, which removed
inlining of the musttail'ed call and it started leaving the invalid call
at the verification stage. Additionally, `TTI::supportsTailCallFor` did
not respect the concrete TTI's `supportsTailCalls` implementation, so it
always returned true even though `supportsTailCalls` returned false, so
this patch also fixes the wrong CRTP base class implementation.
2024-02-20 11:58:44 +09:00
Arnold Schwaighofer
b1ac052ab0
[Coro] [async] Disable inlining in async coroutine splitting (#80904)
The call to the inlining utility does not update the call graph. Leading
to assertion failures when calling the call graph utility to update the
call graph.

Instead rely on an inline pass to run after coro splitting and use
alwaysinline annotations.

github.com/apple/swift/issues/68708
2024-02-07 13:44:22 -08:00
Nikita Popov
2d69827c5c [Transforms] Convert tests to opaque pointers (NFC) 2024-02-05 11:57:34 +01:00
Nikita Popov
90ba33099c
[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
This patch canonicalizes getelementptr instructions with constant
indices to use the `i8` source element type. This makes it easier for
optimizations to recognize that two GEPs are identical, because they
don't need to see past many different ways to express the same offset.

This is a first step towards
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699.
This is limited to constant GEPs only for now, as they have a clear
canonical form, while we're not yet sure how exactly to deal with
variable indices.

The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives
two representative examples of the kind of optimization improvement we
expect from this change. In the first test SimplifyCFG can now realize
that all switch branches are actually the same. In the second test it
can convert it into simple arithmetic. These are representative of
common optimization failures we see in Rust.

Fixes https://github.com/llvm/llvm-project/issues/69841.
2024-01-24 15:25:29 +01:00
Wei Wang
9c978c9418
[coroutines] Use DILocation from new storage for hoisted dbg.declare (#75402)
Make the hoisted dbg.declare inherent the DILocation scope from the new
storage.

After hoisting, the dbg.declare is moved into the block that defines the
new storage. This could create an inconsistency in the debug location
scope hierarchy where the scope of hoisted dbg.declare (i.e.
DILexicalBlock) is enclosed with the scope of the block (i.e.
DISubprogram). This confuses LiveDebugValues pass to think that the
hoisted dbg.declare is killed in that block and does not generate
DBG_VALUE in other blocks. Debugger won't be able to track its value
anymore.

We do this for unoptimized binary only.
2024-01-02 09:54:16 -08:00
Orlando Cazalet-Hyams
fd8fa31c55
[RemoveDIs] Update Coroutine passes to handle DPValues (#74480)
As part of the RemoveDIs project, transitioning to non-instruction debug
info, all debug intrinsic handling code needs to be duplicated to handle
DPValues.

--try-experimental-debuginfo-iterators enables the new debug mode in
tests if the CMake option has been enabled.

`getInsertPtAfterFramePtr` now returns an iterator so we don't lose
debug-info-communicating bits.

---

Depends on #73500, #74090, #74091.
2023-12-13 12:34:37 +00:00
Matheus Izvekov
2fb060efd8
Revert "[coroutines] Use DILocation from new storage for hoisted dbg.declare" (#75282)
Reverts llvm/llvm-project#75104

Original commit causes clang to generate invalid IR:
```
mismatched subprogram between llvm.dbg.declare variable and !dbg attachment
  call void @llvm.dbg.declare(metadata ptr %4, metadata !34468, metadata !DIExpression(DW_OP_plus_uconst, 176)), !dbg !34467
```
2023-12-13 06:33:13 +01:00
Wei Wang
31cf6df06f
[coroutines] Use DILocation from new storage for hoisted dbg.declare (#75104)
Make the hoisted dbg.declare inherent the DILocation scope from the new
storage.

After hoisting, the dbg.declare is moved into the block that defines the
new storage. This could create an inconsistency in the debug location
scope hierarchy where the scope of hoisted dbg.declare (i.e.
DILexicalBlock) is enclosed with the scope of the block (i.e.
DISubprogram). This confuses LiveDebugValues pass to think that the
hoisted dbg.declare is killed in that block and does not generate
DBG_VALUE in other blocks. Debugger won't be able to track its value
anymore.
2023-12-12 09:47:46 -08:00
Mircea Trofin
284da049f5
[coro][pgo] Don't promote pgo counters in the suspend basic block (#71263)
If a suspend happens in the resume part (this can happen in the case of chained coroutines), and that's part of a loop, the pre-split CFG has the suspend block as an exit of that loop. PGO Counter Promotion will then try to commit the temporary counter to the global in that "exit" block (it also does that in the other loop exit BBs, which also includes
the "destroy" case). This interferes with symmetric transfer.

We don't need to commit the counter in the suspend case - it's not a loop exit from the perspective of the behavior of the program. The regular loop exit, together with the "destroy" case, completely cover any updates that may need to happen to the global counter.
2023-11-30 11:58:26 -08:00
Youngsuk Kim
f42eb15c39
[llvm][Coroutines] Remove no-op ptr-to-ptr bitcasts (NFC) (#73427)
Opaque ptr cleanup effort
2023-11-26 09:22:12 -05:00
Mircea Trofin
ffd337b995
[coro][pgo] Do not insert counters in the suspend block (#71262)
If we did, we couldn't lower symmetric transfer resumes to tail calls.

We can instrument the other 2 edges instead, as long as they also don't
point to the same basic block.
2023-11-15 11:12:59 -08:00
Chuanqi Xu
b7b5907b56
[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014)
Close https://github.com/llvm/llvm-project/issues/56980.

This patch tries to introduce a light-weight optimization attribute for
coroutines which are guaranteed to only be destroyed after it reached
the final suspend.

The rationale behind the patch is simple. See the example:

```C++
A foo() {
  dtor d;
  co_await something();
  dtor d1;
  co_await something();
  dtor d2;
  co_return 43;
}
```

Generally the generated .destroy function may be:

```C++
void foo.destroy(foo.Frame *frame) {
  switch(frame->suspend_index()) {
    case 1:
      frame->d.~dtor();
      break;
    case 2:
      frame->d.~dtor();
      frame->d1.~dtor();
      break;
    case 3:
      frame->d.~dtor();
      frame->d1.~dtor();
      frame->d2.~dtor();
      break;
    default: // coroutine completed or haven't started
      break;
  }

  frame->promise.~promise_type();
  delete frame;
}
```

Since the compiler need to be ready for all the cases that the coroutine
may be destroyed in a valid state.

However, from the user's perspective, we can understand that certain
coroutine types may only be destroyed after it reached to the final
suspend point. And we need a method to teach the compiler about this.
Then this is the patch. After the compiler recognized that the
coroutines can only be destroyed after complete, it can optimize the
above example to:

```C++
void foo.destroy(foo.Frame *frame) {
  frame->promise.~promise_type();
  delete frame;
}
```

I spent a lot of time experimenting and experiencing this in the
downstream. The numbers are really good. In a real-world coroutine-heavy
workload, the size of the build dir (including .o files) reduces 14%.
And the size of final libraries (excluding the .o files) reduces 8% in
Debug mode and 1% in Release mode.
2023-11-09 14:42:07 +08:00
Felipe de Azevedo Piovezan
eb6dee613e
[Corosplit][DebugInfo] Don't add EntryValue ops in variadic DIExpressions (#67179)
These are not supported by the backend.

The comment that got deleted was out of place, and it exists in the call
sites of this function.
2023-09-22 16:43:36 -04:00
Ruiling, Song
ed9b354379
Coroutines: Handle non-zero stack address space (#67092)
The stack might be in a different address space, in which case, bitcast
does not work. We should use addrspacecast. As we do not support typed
pointer anymore, so we do not need a bitcast here anymore.
2023-09-22 20:29:44 +08:00
Bruno Cardoso Lopes
34415fd611
[Clang][LLVM][Coroutines] Prevent __coro_gro from outliving __promise (#66706)
When dealing with short-circuiting coroutines (e.g. expected), the
deferred calls that resolve the get_return_object are currently being
emitted after we delete the coroutine frame.

This was caught by ASAN when using optimizations -O1 and above:
optimizations after inlining would place the __coro_gro in the heap, and
subsequent delete of the coroframe followed by the conversion -> BOOM.

This patch forbids the GRO to be placed in the coroutine frame, by
adding a new metadata node that can be attached to `alloca`
instructions.

Fix #49843
2023-09-21 22:52:05 -07:00
Anton Korobeynikov
51d5d7bbae
Extend retcon.once coroutines lowering to optionally produce a normal result (#66333)
One of the main user of these kind of coroutines is swift. There yield-once (`retcon.once`) coroutines are used to temporary "expose" pointers to internal fields of various objects creating borrow scopes.

However, in some cases it might be useful also to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume kind of coroutines where C++ `co_return`
is transformed to a member / callback call on promise object).

The extension is simple: we allow continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that would essentially forward them as normal results.
2023-09-15 09:54:38 -07:00
DianQK
19b664d352
[Coroutine][DebugInfo] Remove the memory attributes on coro-async-declaration.ll (NFC) (#66088)
According to @drodriguez's reminder in
https://github.com/apple/llvm-project/pull/7168#issuecomment-1710607896,
`memory` breaks the backport to the apple branch.

And this is irrelevant to that test. Delete to get better a test case.
2023-09-14 09:02:58 +08:00
Anton Korobeynikov
1a0cbb9c32
[NFC] Update coroutine intrinsics documentation and few remaining tests to opaque pointers (#65698) 2023-09-08 12:32:06 -07:00
Felipe de Azevedo Piovezan
aefa9ff3ec [CoroSplit][DebugInfo] Don't use entry_value for async args in 32-bit targets
Only X86_64 and ARM64 have a reserved register for async arguments, and so the
debugger is only able to handle those targets. For other architectures, we use a
non-entry-value expression and let the debugger do its best with that.

Differential Revision: https://reviews.llvm.org/D158638
2023-08-24 08:50:12 -04:00
DianQK
3aed00239c
[Coroutine][DebugInfo] Reduced test case for coro-async-declaration.ll. (NFC)
Based on the description at https://reviews.llvm.org/D157177#inline-1529005, I try to improve the test case.

Reviewed By: fdeazeve

Differential Revision: https://reviews.llvm.org/D158178
2023-08-17 22:24:38 +08:00
Felipe de Azevedo Piovezan
8aa038ab17 [CoroSplit][DebugInfo] Don't use entry_value in coroutine entry point
The entry point function is called as a regular function. Among other things, it
can be inlined, which would violate the semantics of entry_value in the IR.

Differential Revision: https://reviews.llvm.org/D158108
2023-08-17 09:15:17 -04:00
DianQK
ca1a5b37c7
[Coroutine][DebugInfo] Update the linkage name of the declaration of coro-split functions in the debug info.
This patch adds the linkage name update to DISubprogram's declaration after 6ce76ff7eb7640e53b65f0473848ce7d08165c98.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D157184
2023-08-08 08:22:46 +08:00
DianQK
88a83c9038
[Coroutine][DebugInfo] Pre-commit test for a DISubprogram with declaration. (NFC)
Pre-commit test for D157184.

Differential Revision: https://reviews.llvm.org/D157177
2023-08-08 08:22:45 +08:00
Nuno Lopes
9e1b6817f1 [CoroSplit] Use poison instead of undef as placeholder [NFC]
Used to construct full structs/vectors
also, covert freeze undef -> freeze poison (same semantics)
2023-07-22 12:56:03 +01:00
Nikita Popov
edb2fc6dab [llvm] Remove explicit -opaque-pointers flag from tests (NFC)
Opaque pointers mode is enabled by default, no need to explicitly
enable it.
2023-07-12 14:35:55 +02:00
Chuanqi Xu
9bdf04c8f9 [Coroutines] Prevent infinite loop in simplifyTerminatorLeadingToRet
Close https://github.com/llvm/llvm-project/issues/63639

This comes from the oversight the refactoring that we missed a `return
false;` in the loop.
2023-07-03 10:40:38 +08:00
Chuanqi Xu
0a7ff0960e [Coroutines] Don't transform cmpinst prematurely in simplifyTerminatorLeadingToRet
Previously, we would try to transform cmpinst in
simplifyTerminatorLeadingToRet if we found it was a constant. However,
this is incorrect.

Since the resolved constants in simplifyTerminatorLeadingToRet are not
truely constants. They are basically constants along cerntain code
paths.

In this way, it is clearly incorrect to transform the compare
instruction to a constant.

It will cause confusing miscompilations. This patch tries to fix this.
2023-06-30 14:27:19 +08:00
Chuanqi Xu
b6f30623af [Coroutines] Store the index for final suspend point in the exception path
Try to address part of
https://github.com/llvm/llvm-project/issues/61900.

It is not completely addressed since the original reproducer is not
fixed due to the final suspend point is optimized out in its special
case. But that is a relatively independent issue.
2023-06-20 18:38:05 +08:00
Felipe de Azevedo Piovezan
617c9d59df [Corosplit] Prepend entry_value in swift async dbg values
When the coroutine splitter splits swift coroutines, variables in the new
funclets are now described in terms of the frame pointer, which is always placed
at a ABI-specified register whose contents are valid upon function entry. As
such, debug intrinsics must be prepended by the `entry_value` operation.

Depends on D149778

Differential Revision: https://reviews.llvm.org/D149779
2023-05-10 14:38:19 -04:00
Felipe de Azevedo Piovezan
290494955c [coroutine] Salvage dbg.values in the original function as well
D97673 implemented salvaging o dbg.value inside coroutine funclets, but
left the original function untouched. Before, only dbg.addr and dbg.decl
would get salvaged.

D121324 implemented salvaging of dbg.addr and dbg.decl in the original
function as well, but not of dbg.values.

This patch unifies salvaging in the original function and related
funclets, so that all intrinsics are salvaged in all functions. This is
particularly useful for ABIs where the original function is also
rewritten to receive the frame pointer as an argument.

Differential Revision: https://reviews.llvm.org/D148745
2023-04-21 09:31:39 -04:00
Nikita Popov
a0d2fc126e [Coroutines] Convert tests to opaque pointers (NFC) 2023-04-20 17:24:01 +02:00
Nikita Popov
243e62b9d8 [Coroutines] Directly remove unnecessary lifetime intrinsics
The insertSpills() code will currently skip lifetime intrinsic users
when replacing the alloca with a frame reference. Rather than
leaving behind the dead lifetime intrinsics working on the old
alloca, directly remove them. This makes sure the alloca can be
dropped as well.

I noticed this as a regression when converting tests to opaque
pointers. Without opaque pointers, this code didn't really do
anything, because there would usually be a bitcast in between.
The lifetimes would get rewritten to the frame pointer. With
opaque pointers, this code now triggers and leaves behind users
of the old allocas.

Differential Revision: https://reviews.llvm.org/D148240
2023-04-14 10:22:30 +02:00
Nikita Popov
9f22401a59 [Coroutines] Convert test to opaque pointers (NFC) 2023-04-05 16:59:40 +02:00
Nikita Popov
f3a7783bf3 [Coroutines] Convert some tests to opaque pointers (NFC) 2023-04-05 15:55:06 +02:00
Wei Wang
b6eadb6c1b [Coroutines] Look for dbg.declare for temp spills
A temp spill may not have direct dbg.declare attached. This can cause problem for debugger when it wants to print the value in resume/destroy/cleanup functions.

In particular, we found this happening to "this" pointer that a temp is used to store its value in entry block and spilled later.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D146543
2023-03-30 10:12:23 -07:00
Wei Wang
013f6d23e6 [Coroutines] Add remarks in CoroSplit and CoroElide passes
Add remarks to show frame size and alignment.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D146175
2023-03-16 09:21:35 -07:00
J. Ryan Stinnett
29a6b7dfae [DebugInfo] Remove dbg.addr from Coroutines
This removes `dbg.addr` support from the Coroutines transform. This effectively
reverts the `dbg.addr`-only portions of 19279ffc77b8d224c447d4eb0ee0c727ab64babf
and 0b647fc5299156bf83c46aa539d6c9c39647bb36.

Part of `dbg.addr` removal
Discussed in https://discourse.llvm.org/t/what-is-the-status-of-dbg-addr/62898

Differential Revision: https://reviews.llvm.org/D144795
2023-03-02 09:29:42 +00:00
Ting Wang
65f68812d3 [PowerPC] update PPCTTIImpl::supportsTailCallFor() check conditions
This patch reuse `PPCTargetLowering::isEligibleForTCO()` to check
`PPCTTIImpl::supportsTailCallFor()`.

Fixes #59315

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D140369
2023-02-28 22:29:16 -05:00
Nick Desaulniers
45a291b5f6 [Dominators] check indirect branches of callbr
This will be necessary to support outputs from asm goto along indirect
edges.

Test via:
  $ pushd llvm/build; ninja IRTests; popd
  $ ./llvm/build/unittests/IR/IRTests \
    --gtest_filter=DominatorTree.CallBrDomination

Also, return nullptr in Instruction::getInsertionPointAfterDef for
CallBrInst as was recommened in
https://reviews.llvm.org/D135997#3991427.  The following phab review was
folded into this commit: https://reviews.llvm.org/D140166

Link: Link: https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8

Reviewed By: void, efriedma, ChuanqiXu, MaskRay

Differential Revision: https://reviews.llvm.org/D135997
2023-02-16 17:58:33 -08:00
Chuanqi Xu
af838c1b1c [Coroutines] Don't run optimizations for optnone functions
Currently we will run two optimization (rematerialization and sink
lifetime markers) unconditionally even if the coroutine is marked as
optnone (O0). This looks not good. This patch disables these 2
optimizations for optnone functions. An internal change shows the change
improve the compilation time for 3% in the debug build.
2023-02-14 15:21:48 +08:00
David Stuttard
3e51af9b5b [Coroutines] Improve rematerialization stage
As originally implemented, the rematerialization of valid instructions across
the suspend point would iterate 4 times, meaning that up to 4 instructions could
be rematerialized.

This implementation changes that approach to instead build a graph of
rematerializable instructions, then move all of them. This is faster than the
original approach and is not limited to an arbitrary limit.

Differential Revision: https://reviews.llvm.org/D142620
2023-02-13 11:02:20 +00:00
David Stuttard
35106ad100 [Coroutines] Presubmit test for more coro remats
Added more tests that check for >4 instructions.
Also added a retcon-remat test that checks rematerialization into a suspend
block predecessor (such as when remat for a retcon suspend happens).

Differential Revision: https://reviews.llvm.org/D142619
2023-02-13 11:02:08 +00:00
Nikita Popov
9ed2f14c87 [AsmParser] Remove typed pointer auto-detection
IR is now always parsed in opaque pointer mode, unless
-opaque-pointers=0 is explicitly given. There is no automatic
detection of typed pointers anymore.

The -opaque-pointers=0 option is added to any remaining IR tests
that haven't been migrated yet.

Differential Revision: https://reviews.llvm.org/D141912
2023-01-18 09:58:32 +01:00
Nikita Popov
5867241eac [Transforms] Convert some tests to opaque pointers (NFC) 2023-01-06 12:14:45 +01:00