462 Commits

Author SHA1 Message Date
Hans
9d1cb18d19
[Coroutines] Ignore instructions more aggressively in addMustTailToCoroResumes() (#85271)
The old code used isInstructionTriviallyDead() and removed instructions
when walking the path from a resume call to function return to check if
the call is in tail position.

However, since the code was walking forwards it was not able to get past
instructions such as:

  %gep = getelementptr inbounds i64, ptr %alloc.var, i32 0
  %foo = ptrtoint ptr %gep to i64

This patch instead ignores such instructions as long as their values are
not needed. This enables the code to emit tail calls in more situations.
2024-03-20 14:51:45 +01:00
Stephen Tozer
ffd08c7759
[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:

- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.

Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:

```
  DPValue -> DbgVariableRecord
  DPVal -> DbgVarRec
  DPV -> DVR
```

Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
2024-03-19 20:07:07 +00:00
Stephen Tozer
2e865353ed
[RemoveDIs][NFC] Move DPValue::filter -> filterDbgVars (#85208)
This patch changes DPValue::filter to be a non-member method
filterDbgVars. There are two reasons for this: firstly, the name of
DPValue is about to change to DbgVariableRecord, which will result in
every `for` loop that uses DPValue::filter to require a line break. This
is a small thing, but it makes the rename patch more difficult to
review, and is just generally more awkward for what is a fairly common
loop. Secondly, the intent is to later break up the DPValue class into
subclasses, at which point it would be better to have a non-member
function that allows template arguments for the cases we want to filter
with greater specificity.
2024-03-14 12:19:15 +00:00
Stephen Tozer
15f3f446c5
[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793)
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate on DbgRecords but refer
to DbgValues or DPValues in their names to refer to DbgRecords instead;
all such functions are defined in one of `BasicBlock.h`,
`Instruction.h`, and `DebugProgramInstruction.h`.

This patch explicitly does not change the names of any comments or
variables, except for where they use the exact name of one of the
renamed functions. The reason for this is reviewability; this patch can
be trivially examined to determine that the only changes are direct
string substitutions and any results from clang-format responding to the
changed line lengths. Future patches will cover renaming variables and
comments, and then renaming the classes themselves.
2024-03-12 14:53:13 +00:00
fpasserby
f786881340
[coroutine] Implement llvm.coro.await.suspend intrinsic (#79712)
Implement `llvm.coro.await.suspend` intrinsics, to deal with performance
regression after prohibiting `.await_suspend` inlining, as suggested in
#64945.
Actually, there are three new intrinsics, which directly correspond to
each of three forms of `await_suspend`:
```
void llvm.coro.await.suspend.void(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
i1 llvm.coro.await.suspend.bool(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
ptr llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
```
There are three different versions instead of one, because in `bool`
case it's result is used for resuming via a branch, and in
`coroutine_handle` case exceptions from `await_suspend` are handled in
the coroutine, and exceptions from the subsequent `.resume()` are
propagated to the caller.

Await-suspend block is simplified down to intrinsic calls only, for
example for symmetric transfer:
```
%id = call token @llvm.coro.save(ptr null)
%handle = call ptr @llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
call void @llvm.coro.resume(%handle)
%result = call i8 @llvm.coro.suspend(token %id, i1 false)
switch i8 %result, ...
```
All await-suspend logic is moved out into a wrapper function, generated
for each suspension point.
The signature of the function is `<type> wrapperFunction(ptr %awaiter,
ptr %frame)` where `<type>` is one of `void` `i1` or `ptr`, depending on
the return type of `await_suspend`.
Intrinsic calls are lowered during `CoroSplit` pass, right after the
split.

Because I'm new to LLVM, I'm not sure if the helper function generation,
calls to them and lowering are implemented in the right way, especially
with regard to various metadata and attributes, i. e. for TBAA. All
things that seemed questionable are marked with `FIXME` comments.

There is another detail: in case of symmetric transfer raw pointer to
the frame of coroutine, that should be resumed, is returned from the
helper function and a direct call to `@llvm.coro.resume` is generated.
C++ standard demands, that `.resume()` method is evaluated. Not sure how
important is this, because code has been generated in the same way
before, sans helper function.
2024-03-11 10:00:00 +08:00
Jeremy Morse
2fe81edef6 [NFC][RemoveDIs] Insert instruction using iterators in Transforms/
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
 * Almost all call-sites just call getIterator on an instruction
 * Several make use of an existing iterator (scenarios where the code is
   actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

Noteworthy changes:
 * FindInsertedValue now takes an optional iterator rather than an
   instruction pointer, as we need to always insert with iterators,
 * I've added a few iterator-taking versions of some value-tracking and
   DomTree methods -- they just unwrap the iterator. These are purely
   convenience methods to avoid extra syntax in some passes.
 * A few calls to getNextNode become std::next instead (to keep in the
   theme of using iterators for positions),
 * SeparateConstOffsetFromGEP has it's insertion-position field changed.
   Noteworthy because it's not a purely localised spelling change.

All this should be NFC.
2024-03-05 15:12:22 +00:00
Mogball
2c30180f72 [coro] [async] Don't fail on targets that don't support tail calls 2024-02-21 17:48:58 +00:00
Mogball
2e29c91b96 Revert "[Coro] [async] Disable inlining in async coroutine splitting (#80904)"
This reverts commit b1ac052ab07ea091c90c2b7c89445b2bfcfa42ab.

This commit breaks coroutine splitting for non-swift calling convention
functions. In this example:

```ll
; ModuleID = 'repro.ll'
source_filename = "stdlib/test/runtime/test_llcl.mojo"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@0 = internal constant { i32, i32 } { i32 trunc (i64 sub (i64 ptrtoint (ptr @crash to i64), i64 ptrtoint (ptr getelementptr inbounds ({ i32, i32 }, ptr @0, i32 0, i32 1) to i64)) to i32), i32 64 }

define dso_local void @af_suspend_fn(ptr %0, i64 %1, ptr %2) #0 {
  ret void
}

define dso_local void @crash(ptr %0) #0 {
  %2 = call token @llvm.coro.id.async(i32 64, i32 8, i32 0, ptr @0)
  %3 = call ptr @llvm.coro.begin(token %2, ptr null)
  %4 = getelementptr inbounds { ptr, { ptr, ptr }, i64, { ptr, i1 }, i64, i64 }, ptr poison, i32 0, i32 0
  %5 = call ptr @llvm.coro.async.resume()
  store ptr %5, ptr %4, align 8
  %6 = call { ptr, ptr, ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0p0p0s(i32 0, ptr %5, ptr @ctxt_proj_fn, ptr @af_suspend_fn, ptr poison, i64 -1, ptr poison)
  ret void
}

define dso_local ptr @ctxt_proj_fn(ptr %0) #0 {
  ret ptr %0
}

; Function Attrs: nomerge nounwind
declare { ptr, ptr, ptr } @llvm.coro.suspend.async.sl_p0p0p0s(i32, ptr, ptr, ...) #1

; Function Attrs: nounwind
declare token @llvm.coro.id.async(i32, i32, i32, ptr) #2

; Function Attrs: nounwind
declare ptr @llvm.coro.begin(token, ptr writeonly) #2

; Function Attrs: nomerge nounwind
declare ptr @llvm.coro.async.resume() #1

attributes #0 = { "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+invpcid,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+pku,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+vaes,+vpclmulqdq,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
attributes #1 = { nomerge nounwind }
attributes #2 = { nounwind }
```

This verifier crashes after the `coro-split` pass with

```
cannot guarantee tail call due to mismatched parameter counts
  musttail call void @af_suspend_fn(ptr poison, i64 -1, ptr poison)
LLVM ERROR: Broken function
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: opt ../../../reduced.ll -O0
 #0 0x00007f1d89645c0e __interceptor_backtrace.part.0 /build/gcc-11-XeT9lY/gcc-11-11.4.0/build/x86_64-linux-gnu/libsanitizer/asan/../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4193:28
 #1 0x0000556d94d254f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22
 #2 0x0000556d94d19a2f llvm::sys::RunSignalHandlers() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Signals.cpp:105:20
 #3 0x0000556d94d1aa42 SignalHandler(int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:371:36
 #4 0x00007f1d88e42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007f1d88e969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #6 0x00007f1d88e969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #7 0x00007f1d88e969fc pthread_kill ./nptl/pthread_kill.c:89:10
 #8 0x00007f1d88e42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #9 0x00007f1d88e287f3 abort ./stdlib/abort.c:81:7
 #10 0x0000556d8944be01 std::vector<llvm::json::Value, std::allocator<llvm::json::Value>>::size() const /usr/include/c++/11/bits/stl_vector.h:919:40
 #11 0x0000556d8944be01 bool std::operator==<llvm::json::Value, std::allocator<llvm::json::Value>>(std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&, std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&) /usr/include/c++/11/bits/stl_vector.h:1893:23
 #12 0x0000556d8944be01 llvm::json::operator==(llvm::json::Array const&, llvm::json::Array const&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Support/JSON.h:572:69
 #13 0x0000556d8944be01 llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/JSON.cpp:204:28
 #14 0x0000556d949ed2bd llvm::report_fatal_error(char const*, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/ErrorHandling.cpp:82:70
 #15 0x0000556d8e37e876 llvm::SmallVectorBase<unsigned int>::size() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:91:32
 #16 0x0000556d8e37e876 llvm::SmallVectorTemplateCommon<llvm::DiagnosticInfoOptimizationBase::Argument, void>::end() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:282:41
 #17 0x0000556d8e37e876 llvm::SmallVector<llvm::DiagnosticInfoOptimizationBase::Argument, 4u>::~SmallVector() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1215:24
 #18 0x0000556d8e37e876 llvm::DiagnosticInfoOptimizationBase::~DiagnosticInfoOptimizationBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:413:7
 #19 0x0000556d8e37e876 llvm::DiagnosticInfoIROptimization::~DiagnosticInfoIROptimization() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:622:7
 #20 0x0000556d8e37e876 llvm::OptimizationRemark::~OptimizationRemark() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:689:7
 #21 0x0000556d8e37e876 operator() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2213:14
 #22 0x0000556d8e37e876 emit<llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::CGSCCAnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&)::<lambda()> > /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h:83:12
 #23 0x0000556d8e37e876 llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2212:13
 #24 0x0000556d8c36ecb1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::CoroSplitPass, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #25 0x0000556d91c1a84f llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:90:12
 #26 0x0000556d8c3690d1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #27 0x0000556d91c2162d llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:278:18
 #28 0x0000556d8c369035 llvm::detail::PassModel<llvm::Module, llvm::ModuleToPostOrderCGSCCPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #29 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20
 #30 0x0000556d8e30979e llvm::CoroConditionalWrapper::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroConditionalWrapper.cpp:19:74
 #31 0x0000556d8c365755 llvm::detail::PassModel<llvm::Module, llvm::CoroConditionalWrapper, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #32 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20
 #33 0x0000556d89818556 llvm::SmallPtrSetImplBase::isSmall() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33
 #34 0x0000556d89818556 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:17
 #35 0x0000556d89818556 llvm::SmallPtrSetImpl<llvm::AnalysisKey*>::~SmallPtrSetImpl() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:321:7
 #36 0x0000556d89818556 llvm::SmallPtrSet<llvm::AnalysisKey*, 2u>::~SmallPtrSet() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:427:7
 #37 0x0000556d89818556 llvm::PreservedAnalyses::~PreservedAnalyses() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/Analysis.h:109:7
 #38 0x0000556d89818556 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/NewPMDriver.cpp:532:10
 #39 0x0000556d897e3939 optMain /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/optdriver.cpp:737:27
 #40 0x0000556d89455461 main /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/opt.cpp:25:33
 #41 0x00007f1d88e29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
 #42 0x00007f1d88e29e40 call_init ./csu/../csu/libc-start.c:128:20
 #43 0x00007f1d88e29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
 #44 0x0000556d897b6335 _start (/home/ubuntu/modular/.derived/third-party/llvm-project/build-relwithdebinfo-asan/bin/opt+0x150c335)
Aborted (core dumped)
2024-02-21 16:35:07 +00:00
Orlando Cazalet-Hyams
ababa96475
[RemoveDIs][NFC] Introduce DbgRecord base class [1/3] (#78252)
Patch 1 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds a new base class

    -> 1. Add DbgRecord base class for DPValue and the not-yet-added
          DPLabel class.
       2. Add the DPLabel class.
       3. Enable dbg.label conversion and add support to passes.

Patches 1 and 2 are NFC.

In the near future we also will rename DPValue to DbgVariableRecord and
DPLabel to DbgLabelRecord, at which point we'll overhaul the function
names too. The name DPLabel keeps things consistent for now.
2024-02-20 16:00:55 +00:00
Yuta Saito
8c5c4d9a63
[Coro][WebAssembly] Add tail-call check for async lowering (#81481)
This patch fixes a verifier error when async lowering is used for
WebAssembly target without tail-call feature. This missing check was
revealed by b1ac052ab07ea091c90c2b7c89445b2bfcfa42ab, which removed
inlining of the musttail'ed call and it started leaving the invalid call
at the verification stage. Additionally, `TTI::supportsTailCallFor` did
not respect the concrete TTI's `supportsTailCalls` implementation, so it
always returned true even though `supportsTailCalls` returned false, so
this patch also fixes the wrong CRTP base class implementation.
2024-02-20 11:58:44 +09:00
Arnold Schwaighofer
b1ac052ab0
[Coro] [async] Disable inlining in async coroutine splitting (#80904)
The call to the inlining utility does not update the call graph. Leading
to assertion failures when calling the call graph utility to update the
call graph.

Instead rely on an inline pass to run after coro splitting and use
alwaysinline annotations.

github.com/apple/swift/issues/68708
2024-02-07 13:44:22 -08:00
Yuxuan Chen
fa70b5d130
[Coroutines][NFC] Refactor CoroSplit for Switch Resume ABI (#80758) 2024-02-05 21:43:49 -08:00
Stephen Tozer
632f44e5ed
[RemoveDIs][DebugInfo] Handle DPVAssign in most transforms (#78986)
This patch trivially updates various opt passes to handle DPVAssigns. In
all cases, this means some combination of generifying existing code to
handle DPValues and DbgAssignIntrinsics, iterating over DPValues where
previously we did not, or duplicating code for DbgAssignIntrinsics to
the equivalent DPValue function (in inlining and salvageDebugInfo).
2024-01-23 16:16:59 +00:00
yonillasky
9299ca797a
[Coroutines] Fix inline comment about frame layout (#78626)
`ResumeIndex` isn't part of the frame struct header, so it necessarily
appears after the promise.

Co-authored-by: Yoni Lavi <yoni.lavi@nextsilicon.com>
2024-01-19 09:46:15 +08:00
yonillasky
be690ea3db
[Coroutines] Fix incorrect attribute name coroutine.presplit (NFC) (#78296)
Those are probably leftovers from an old name of the same attribute.
Fixed for the sake of consistency.

Co-authored-by: Yoni Lavi <yoni.lavi@nextsilicon.com>
2024-01-16 17:35:07 +01:00
Stephen Tozer
304119860a
[DebugInfo][RemoveDIs][NFC] Split findDbgDeclares into two functions (#77478)
This patch follows on from comments on
https://github.com/llvm/llvm-project/pull/73498, implementing the
proposed split of findDbgDeclares into two separate functions for
DbgDeclareInsts and DPVDeclares, which return containers rather than
taking containers by reference.
2024-01-15 17:46:56 +00:00
Nikita Popov
6c2fbc3a68
[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582)
This abstracts over the common pattern of creating a gep with i8 element
type.
2024-01-12 14:21:21 +01:00
Wei Wang
0faf46befa
[coroutines][DPValue] Update DILocation in DPValue for hoisted dbg.declare (#76765)
Follow up #75402 to cover DPValue
2024-01-03 08:55:38 -08:00
Wei Wang
9c978c9418
[coroutines] Use DILocation from new storage for hoisted dbg.declare (#75402)
Make the hoisted dbg.declare inherent the DILocation scope from the new
storage.

After hoisting, the dbg.declare is moved into the block that defines the
new storage. This could create an inconsistency in the debug location
scope hierarchy where the scope of hoisted dbg.declare (i.e.
DILexicalBlock) is enclosed with the scope of the block (i.e.
DISubprogram). This confuses LiveDebugValues pass to think that the
hoisted dbg.declare is killed in that block and does not generate
DBG_VALUE in other blocks. Debugger won't be able to track its value
anymore.

We do this for unoptimized binary only.
2024-01-02 09:54:16 -08:00
Orlando Cazalet-Hyams
fd8fa31c55
[RemoveDIs] Update Coroutine passes to handle DPValues (#74480)
As part of the RemoveDIs project, transitioning to non-instruction debug
info, all debug intrinsic handling code needs to be duplicated to handle
DPValues.

--try-experimental-debuginfo-iterators enables the new debug mode in
tests if the CMake option has been enabled.

`getInsertPtAfterFramePtr` now returns an iterator so we don't lose
debug-info-communicating bits.

---

Depends on #73500, #74090, #74091.
2023-12-13 12:34:37 +00:00
Matheus Izvekov
2fb060efd8
Revert "[coroutines] Use DILocation from new storage for hoisted dbg.declare" (#75282)
Reverts llvm/llvm-project#75104

Original commit causes clang to generate invalid IR:
```
mismatched subprogram between llvm.dbg.declare variable and !dbg attachment
  call void @llvm.dbg.declare(metadata ptr %4, metadata !34468, metadata !DIExpression(DW_OP_plus_uconst, 176)), !dbg !34467
```
2023-12-13 06:33:13 +01:00
Wei Wang
31cf6df06f
[coroutines] Use DILocation from new storage for hoisted dbg.declare (#75104)
Make the hoisted dbg.declare inherent the DILocation scope from the new
storage.

After hoisting, the dbg.declare is moved into the block that defines the
new storage. This could create an inconsistency in the debug location
scope hierarchy where the scope of hoisted dbg.declare (i.e.
DILexicalBlock) is enclosed with the scope of the block (i.e.
DISubprogram). This confuses LiveDebugValues pass to think that the
hoisted dbg.declare is killed in that block and does not generate
DBG_VALUE in other blocks. Debugger won't be able to track its value
anymore.
2023-12-12 09:47:46 -08:00
Orlando Cazalet-Hyams
2d9d9a1a55
[NFC] Change FindDbgDeclareUsers interface to match findDbgUsers/values (#73498)
This simplifies an upcoming patch to support the RemoveDIs project (tracking
variable locations without using intrinsics).

Next in this series is #73500.
2023-12-12 09:43:58 +00:00
Jeremy Morse
2425e2940e
[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149)
Part of the "RemoveDIs" project to remove debug intrinsics requires
passing block-positions around in iterators rather than as instruction
pointers, allowing some debug-info to reside in BasicBlock::iterator.
This means getInsertionPointAfterDef has to return an iterator, and as
it can return no-instruction that means returning an optional iterator.

This patch changes the signature for getInsertionPtAfterDef and then
patches up the various places that use it to handle the different type.
This would overall be an NFC patch, however in
InstCombinerImpl::freezeOtherUses I've started skipping any debug
intrinsics at the returned insert-position. This should not have any
_meaningful_ effect on the compiler output: at worst it means variable
assignments that are skipped will now cover the freeze instruction and
anything inserted before it, which should be inconsequential.

Sadly: this makes the function signature ugly. This is probably the
ugliest piece of fallout for the "RemoveDIs" work, but it serves the
overall purpose of improving compile times and not allowing `-g` to
affect compiler output, so should be worthwhile in the end.
2023-11-30 12:19:57 +00:00
Youngsuk Kim
859338a695 [llvm] Replace uses of Type::getPointerTo (NFC)
Work towards removing method Type::getPointerTo.
Opaque ptr cleanup effort.
2023-11-29 10:22:31 -06:00
Youngsuk Kim
f42eb15c39
[llvm][Coroutines] Remove no-op ptr-to-ptr bitcasts (NFC) (#73427)
Opaque ptr cleanup effort
2023-11-26 09:22:12 -05:00
Craig Topper
aba040182a
[IR] Replace uses of IRBuilder::getInt8PtrTy with getPtrTy. NFC (#73154) 2023-11-22 12:24:18 -08:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize,
and LHS and RHS are both objects of class TypeSize. So this is
evaluating
if the pointer to the function Scalable == the pointer to the function
Scalable,
which is always true because LHS and RHS have the same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`,
so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
2023-11-22 08:52:53 +00:00
Youngsuk Kim
eed067e9fb [llvm] Remove no-op ptr-to-ptr bitcasts (NFC)
Opaque ptr cleanup effort (NFC).
2023-11-13 14:33:41 -06:00
Chuanqi Xu
b7b5907b56
[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014)
Close https://github.com/llvm/llvm-project/issues/56980.

This patch tries to introduce a light-weight optimization attribute for
coroutines which are guaranteed to only be destroyed after it reached
the final suspend.

The rationale behind the patch is simple. See the example:

```C++
A foo() {
  dtor d;
  co_await something();
  dtor d1;
  co_await something();
  dtor d2;
  co_return 43;
}
```

Generally the generated .destroy function may be:

```C++
void foo.destroy(foo.Frame *frame) {
  switch(frame->suspend_index()) {
    case 1:
      frame->d.~dtor();
      break;
    case 2:
      frame->d.~dtor();
      frame->d1.~dtor();
      break;
    case 3:
      frame->d.~dtor();
      frame->d1.~dtor();
      frame->d2.~dtor();
      break;
    default: // coroutine completed or haven't started
      break;
  }

  frame->promise.~promise_type();
  delete frame;
}
```

Since the compiler need to be ready for all the cases that the coroutine
may be destroyed in a valid state.

However, from the user's perspective, we can understand that certain
coroutine types may only be destroyed after it reached to the final
suspend point. And we need a method to teach the compiler about this.
Then this is the patch. After the compiler recognized that the
coroutines can only be destroyed after complete, it can optimize the
above example to:

```C++
void foo.destroy(foo.Frame *frame) {
  frame->promise.~promise_type();
  delete frame;
}
```

I spent a lot of time experimenting and experiencing this in the
downstream. The numbers are really good. In a real-world coroutine-heavy
workload, the size of the build dir (including .o files) reduces 14%.
And the size of final libraries (excluding the .o files) reduces 8% in
Debug mode and 1% in Release mode.
2023-11-09 14:42:07 +08:00
Paulo Matos
7b9d73c2f9
[NFC] Remove Type::getInt8PtrTy (#71029)
Replace this with PointerType::getUnqual().
Followup to the opaque pointer transition. Fixes an in-code TODO item.
2023-11-07 17:26:26 +01:00
JOE1994
81ee059073 [llvm] Replace uses of Type::getPointerTo (NFC)
opaque pointer clean-up effort (NFC)
2023-10-05 10:08:38 -04:00
Youngsuk Kim
4346aaf05b [llvm] Remove uses of Type::getPointerTo() (NFC)
* Remove if its sole use is to support an unnecessary ptr-to-ptr bitcast
  (remove the bitcast as well)
* Replace with use of other APIs.

NFC opaque pointer cleanup effort.
2023-09-30 06:55:41 -04:00
Mikael Holmen
9284b03e87 [Coroutines] Remove unused variable [NFC]
gcc warned about it:
 ../lib/Transforms/Coroutines/CoroFrame.cpp:2785:15: warning: unused variable 'MD' [-Wunused-variable]
  2785 |   if (MDNode *MD = AI->getMetadata(LLVMContext::MD_coro_outside_frame))
       |               ^~

Fix the warning by removing the unused variable and change the call
from getMetadata to hasMetadata.
2023-09-29 11:59:23 +02:00
Ruiling Song
4c59f809c9 Coroutines: Cleanup typed pointer code in CoroFrame.cpp. NFC 2023-09-25 11:30:14 +08:00
Felipe de Azevedo Piovezan
eb6dee613e
[Corosplit][DebugInfo] Don't add EntryValue ops in variadic DIExpressions (#67179)
These are not supported by the backend.

The comment that got deleted was out of place, and it exists in the call
sites of this function.
2023-09-22 16:43:36 -04:00
Ruiling, Song
ed9b354379
Coroutines: Handle non-zero stack address space (#67092)
The stack might be in a different address space, in which case, bitcast
does not work. We should use addrspacecast. As we do not support typed
pointer anymore, so we do not need a bitcast here anymore.
2023-09-22 20:29:44 +08:00
Bruno Cardoso Lopes
34415fd611
[Clang][LLVM][Coroutines] Prevent __coro_gro from outliving __promise (#66706)
When dealing with short-circuiting coroutines (e.g. expected), the
deferred calls that resolve the get_return_object are currently being
emitted after we delete the coroutine frame.

This was caught by ASAN when using optimizations -O1 and above:
optimizations after inlining would place the __coro_gro in the heap, and
subsequent delete of the coroframe followed by the conversion -> BOOM.

This patch forbids the GRO to be placed in the coroutine frame, by
adding a new metadata node that can be attached to `alloca`
instructions.

Fix #49843
2023-09-21 22:52:05 -07:00
Anton Korobeynikov
51d5d7bbae
Extend retcon.once coroutines lowering to optionally produce a normal result (#66333)
One of the main user of these kind of coroutines is swift. There yield-once (`retcon.once`) coroutines are used to temporary "expose" pointers to internal fields of various objects creating borrow scopes.

However, in some cases it might be useful also to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume kind of coroutines where C++ `co_return`
is transformed to a member / callback call on promise object).

The extension is simple: we allow continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that would essentially forward them as normal results.
2023-09-15 09:54:38 -07:00
Jeremy Morse
e54277fa10 [NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder
This patch adds a two-argument SetInsertPoint method to IRBuilder that
takes a block/iterator instead of an instruction, and updates many call
sites to use it. The motivating reason for doing this is given here [0],
we'd like to pass around more information about the position of debug-info
in the iterator object. That necessitates passing iterators around most of
the time.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152468
2023-09-11 20:01:19 +01:00
Jeremy Morse
6942c64e81 [NFC][RemoveDIs] Prefer iterator-insertion over instructions
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.

At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152537
2023-09-11 11:48:45 +01:00
Nikita Popov
4eafc9b6ff [IR] Treat callbr as special terminator (PR64215)
isLegalToHoistInto() currently return true for callbr instructions.
That means that a callbr with one successor will be considered a
proper loop preheader, which may result in instructions that use
the callbr return value being hoisted past it.

Fix this by adding callbr to isExceptionTerminator (with a rename
to isSpecialTerminator), which also fixes similar assumptions in
other places.

Fixes https://github.com/llvm/llvm-project/issues/64215.

Differential Revision: https://reviews.llvm.org/D158609
2023-08-25 09:20:18 +02:00
Felipe de Azevedo Piovezan
fdb734a211 [Coroutines] Add missing TargetParser lib dependency
This dependency was introduced by D158638. It seems harmless to add this, as the
Analysis library also does it.

Differential Revision: https://reviews.llvm.org/D158729
2023-08-24 09:19:54 -04:00
Felipe de Azevedo Piovezan
aefa9ff3ec [CoroSplit][DebugInfo] Don't use entry_value for async args in 32-bit targets
Only X86_64 and ARM64 have a reserved register for async arguments, and so the
debugger is only able to handle those targets. For other architectures, we use a
non-entry-value expression and let the debugger do its best with that.

Differential Revision: https://reviews.llvm.org/D158638
2023-08-24 08:50:12 -04:00
Chuanqi Xu
7037331a2f [Coroutines] [CoroElide] Don't think exceptional terminator don't leak coro handle unconditionally any more
Close https://github.com/llvm/llvm-project/issues/59723.

The fundamental cause of the above issue is that we assumed the memory
of coroutine frame can be released by stack unwinding automatically
if the allocation of the coroutine frame is elided. But we missed one
point: the stack unwinding has different semantics with the explicit
coroutine_handle<>::destroy(). Since the latter is explicit so it shows
the intention of the user. So we can blame the user to destroy the
coroutine frame incorrectly in case of use-after-free happens. But we
can't do so with stack unwinding.

So after this patch, we won't think the exceptional terminator don't
leak the coroutine handle unconditionally. Instead, we think the
exceptional terminator will leak the coroutine handle too if the
coroutine is leaked somewhere along the search path.

Concretely for C++, we can think the exceptional terminator is not
special any more. Maybe this may cause some performance regressions.
But I've tested the motivating example (std::generator). And on the
other side, the coroutine elision is a middle end opitmization and not
a language feature. So we don't think we should blame such regressions
especially we are correcting the miscompilations.
2023-08-23 16:51:53 +08:00
Felipe de Azevedo Piovezan
8aa038ab17 [CoroSplit][DebugInfo] Don't use entry_value in coroutine entry point
The entry point function is called as a regular function. Among other things, it
can be inlined, which would violate the semantics of entry_value in the IR.

Differential Revision: https://reviews.llvm.org/D158108
2023-08-17 09:15:17 -04:00
Bjorn Pettersson
e53b28c833 [llvm] Drop some bitcasts and references related to typed pointers
Differential Revision: https://reviews.llvm.org/D157551
2023-08-10 15:07:07 +02:00
Matt Arsenault
25bc999d1f Intrinsics: Add type overload to stacksave and stackstore
This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but this is enough to fix existing tests.

https://reviews.llvm.org/D156666
2023-08-09 18:33:11 -04:00
Wei Wang
f55e57ca0c [Coroutines][NFC] Only look for Alloca or Load when finding dbg.declare for temp spills
We only look for Alloca or Load in this case, so making it explicit.

Reviewed By: ChuanqiXu, MatzeB

Differential Revision: https://reviews.llvm.org/D157423
2023-08-09 00:19:43 -07:00
witstorm95
132bb5cc5f [NFC][Coroutines] Use a reverse post-order to guide the computation about cross suspend infomation to reach a fixed point faster.
Fixed https://github.com/llvm/llvm-project/issues/62348

Propagate cross suspend point information along reverse post-order.
It does not modify the original function, just selects a better
traversal order.

Before the patch:

```
n: 20000
4.31user 0.11system 0:04.44elapsed 99%CPU (0avgtext+0avgdata
552352maxresident)k
0inputs+8848outputs (0major+126254minor)pagefaults 0swaps

n: 40000
11.24user 0.40system 0:11.66elapsed 99%CPU (0avgtext+0avgdata
1788404maxresident)k
0inputs+17600outputs (0major+431105minor)pagefaults 0swaps

n: 60000
21.65user 0.96system 0:22.62elapsed 99%CPU (0avgtext+0avgdata
3809836maxresident)k
0inputs+26352outputs (0major+934749minor)pagefaults 0swaps

n: 80000
37.05user 1.53system 0:38.58elapsed 99%CPU (0avgtext+0avgdata
6602396maxresident)k
0inputs+35096outputs (0major+1622584minor)pagefaults 0swaps

n: 100000
51.87user 2.67system 0:54.54elapsed 99%CPU (0avgtext+0avgdata
10210736maxresident)k
0inputs+43848outputs (0major+2518945minor)pagefaults 0swaps

```
After the patch:

```
n: 20000
3.08user 0.12system 0:03.21elapsed 99%CPU (0avgtext+0avgdata
551012maxresident)k
0inputs+8848outputs (0major+129349minor)pagefaults 0swaps

n: 40000
5.88user 0.33system 0:06.22elapsed 99%CPU (0avgtext+0avgdata
1789248maxresident)k
0inputs+17600outputs (0major+435096minor)pagefaults 0swaps

n: 60000
8.84user 0.77system 0:09.63elapsed 99%CPU (0avgtext+0avgdata
3807800maxresident)k
0inputs+26352outputs (0major+939119minor)pagefaults 0swaps

n: 80000
11.64user 1.58system 0:13.23elapsed 99%CPU (0avgtext+0avgdata
6604708maxresident)k
0inputs+35096outputs (0major+1629566minor)pagefaults 0swaps

n: 100000
15.21user 2.56system 0:17.79elapsed 99%CPU (0avgtext+0avgdata
10208828maxresident)k
8inputs+43848outputs (0major+2526611minor)pagefaults 0swaps

```

Reviewed By: MatzeB, ChuanqiXu

Differential Revision: https://reviews.llvm.org/D156850
2023-08-09 11:21:57 +08:00