513 Commits

Author SHA1 Message Date
Thurston Dang
2672947633 Revert "[Coroutines] ABI Objects to improve code separation between different ABIs, users and utilities. (#109338)"
This reverts commit 2e414799d0ad511cd7999895014a2cae2ea5e3e3.

Reason: buildbot breakage
(https://lab.llvm.org/buildbot/#/builders/51/builds/4105)
(This was the only new CL.)

/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Transforms/Coroutines/ABI.h:71:31: error: 'llvm::coro::AsyncABI' has virtual functions but non-virtual destructor [-Werror,-Wnon-virtual-dtor]
   71 | class LLVM_LIBRARY_VISIBILITY AsyncABI : public BaseABI {

   etc.
2024-09-20 21:23:04 +00:00
Tyler Nowicki
2e414799d0
[Coroutines] ABI Objects to improve code separation between different ABIs, users and utilities. (#109338)
* Adds an ABI object class hierarchy to implement the coroutine ABIs
(Switch, Asyc, and Retcon{Once})
* The ABI object improves the separation of the code related to users,
ABIs and utilities.
* No code changes are required by any existing users.
* Each ABI overrides delegate methods for initialization, building the
coroutine frame and splitting the coroutine, other methods may be added
later.
* CoroSplit invokes a generator lambda to instantiate the ABI object and
calls the ABI object to carry out its primary operations.

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
2024-09-20 16:47:17 -04:00
Yuxuan Chen
b5cdb03971
[Coroutines] Better optimization remarks for CoroAnnotationElide pass (#109352) 2024-09-19 19:16:10 -07:00
Kazu Hirata
3191587666 [Coroutines] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Coroutines/Coroutines.cpp:475:3: error: default
  label in switch which covers all enumeration values
  [-Werror,-Wcovered-switch-default]
2024-09-19 13:08:56 -07:00
Tyler Nowicki
d57525294a
[Coroutines] Refactor CoroShape::buildFrom for future use by ABI objects (#108623)
* Refactor buildFrom to separate the analysis, abi related operations,
  cleaning and invalidating.
* In a follow-up PR the code in initABI will be moved to an ABI object
  init method.
* In a follow-up PR the OptimizeFrame flag will also be removed from the
  Shape and instead will be passed directly to buildCoroutineFrame
  (although it would be nice to find another way to trigger this
  optimization). This is the only thing that Shape cannot determine from
  the Function/Coroutine, but it is only needed within
  buildCoroutineFrame.

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
2024-09-19 14:06:43 -04:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
Felipe de Azevedo Piovezan
ddcc601353
[CoroSplit][DebugInfo] Adjust heuristic for moving DIScope of funclets (#108611)
CoroSplit has a heuristic where the scope line for funclets is adjusted
to match the line of the suspend intrinsic that caused the split. This
is useful as it avoids a jump on the line table from the original
function declaration to the line where the split happens.

However, very often using the line of the split is not ideal: if we can
avoid it, we should not have a line entry for the split location, as
this would cause breakpoints by line to match against two functions: the
funclet before and the funclet after the split.

This patch adjusts the heuristics to look for the first instruction with
a non-zero line number after the split. In other words, this patch makes
breakpoints on `await foo()` lines behave much more like a regular
function call.
2024-09-13 15:25:11 -07:00
Tyler Nowicki
4c040c0275
[Coroutines] Move Shape to its own header (#108242)
* To create custom ABIs plugin libraries need access to CoroShape.
* As a step in enabling plugin libraries, move Shape into its own header
* The header will eventually be moved into include/llvm/Transforms/Coroutines

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
2024-09-13 14:11:30 -04:00
Tyler Nowicki
2670565afc
[Coroutines] Move materialization code into its own utils (#108240)
* Move materialization out of CoroFrame to MaterializationUtils.h
* Move spill related utilities that were used by materialization to
SpillUtils
* Move isSuspendBlock (needed by materialization) to CoroInternal

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
2024-09-12 14:01:23 -04:00
Tyler Nowicki
0989a775ae
[Coroutines] Verify normalization was not missed (#108096)
* Add asserts to verify normalization of the coroutine happened.
* This will be important when normalization becomes part of an ABI
object.

--- From a previous discussion here
https://github.com/llvm/llvm-project/pull/108076

Normalization performs these important steps:

split around each suspend, this adds BBs before/after each suspend so
each suspend now exists in its own block.
split around coro.end (similar to above)
break critical edges and add single-edge phis in the new blocks (also
removing other single-edge phis).
Each of these things can individually be tested
A) Check that each suspend is the only inst in its BB
B) Check that coro.end is the only inst in its BB
C) Check that each edge of a multi-edge phis is preceded by single-edge
phi in an immediate pred

For 1) and 2) I believe the purpose of the transform is in part for
suspend crossing info's analysis so it can specifically 'mark' the
suspend blocks and identify the end of the coroutine. There are some
existing places within suspend crossing info that visit the CoroSuspends
and CoroEnds so we could check A) and B) there.

For 3) I believe the purpose of this transform is for insertSpills to
work properly. Infact there is already a check for the result of this
transform!

assert(PN->getNumIncomingValues() == 1 &&
           "unexpected number of incoming "
           "values in the PHINode");

I think to verify the result of normalization we just need to add checks
A) and B) to suspend crossing info.
2024-09-12 14:00:49 -04:00
Yuxuan Chen
853bff2122
[Coroutines] properly update CallGraph in CoroSplit (#107935)
Fixes https://github.com/llvm/llvm-project/issues/107139.

We weren't updating the call graph properly in CoroSplit. This crash is
due to the await_suspend() function calling the coroutine, forming a
multi-node SCC. The issue bisected to
https://github.com/llvm/llvm-project/pull/79712 but I think this is red
herring. We haven't been properly updating the call graph.

Added an example of such code as a test case.
2024-09-12 10:45:20 -07:00
Tyler Nowicki
9a9f155df1
[Coroutines] Split buildCoroutineFrame into normalization and frame building (#108076)
* Split buildCoroutineFrame into code related to normalization and code
related to actually building the coroutine frame.
* This will enable future specialization of buildCoroutineFrame for
different ABIs while the normalization can be done by splitCoroutine
prior to calling buildCoroutineFrame.

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
2024-09-11 10:29:06 -04:00
Tyler Nowicki
f4e2d7bfc1
[Coroutines] Move spill related methods to a Spill utils (#107884)
* Move code related to spilling into SpillUtils to help cleanup
CoroFrame

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
2024-09-10 15:43:57 -04:00
Shubham Sandeep Rastogi
7a91af4f87
Add DIExpression::foldConstantMath to CoroSplit (#107933)
The CoroSplit pass has it's own salvageDebugInfo implementation and it's
DIExpressions do not get folded. Add a call to
DIExpression::foldConstantMath in the CoroSplit pass to reduce the size
of those DIExpressions.

[The compile time tracker shows no significant increase in compile time
either.](https://llvm-compile-time-tracker.com/compare.php?from=bdf02249e7f8f95177ff58c881caf219699acb98&to=e1c1c1759c06bc4c42f79eebdb0e3cd45219cef4&stat=instructions:u)

rdar://134675402
2024-09-10 11:27:01 -07:00
Yuxuan Chen
761bf333e3
[LLVM][Coroutines] Switch CoroAnnotationElidePass to a FunctionPass (#107897)
After landing https://github.com/llvm/llvm-project/pull/99285 we found
that the call graph update was causing the following crash when
expensive checks are turned on
```
llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:982: LazyCallGraph::SCC &updateCGAndAnalysisManagerForPass(LazyCallGraph &, LazyCallGraph::SCC &, LazyCallGraph::Node &, CGSCCAnalysisManager &, CGSCCUpdateResult &, FunctionAnalysisManager &, bool): Assertion `(RC == &TargetRC || RC->isAncestorOf(Targe
tRC)) && "New call edge is not trivial!"' failed.                                                                                                                                                                                                                                                                               
```
I have to admit I believe that the call graph update process I did for
that patch could be wrong.

After reading the code in `CGSCCToFunctionPassAdaptor`, I am convinced
that `CoroAnnotationElidePass` can be a FunctionPass and rely on the
adaptor to update the call graph for us, so long as we properly
invalidate the caller's analyses.

After this patch,
`llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll` no longer
fails under expensive checks.
2024-09-09 18:57:39 -07:00
Tyler Nowicki
ea2da571c7
[Coroutines] Move the SuspendCrossingInfo analysis helper into its own header/source (#106306)
* Move the SuspendCrossingInfo analysis helper into its own
header/source

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

Co-authored-by: tnowicki <tnowicki.nowicki@amd.com>
2024-09-09 11:50:27 -04:00
Yuxuan Chen
a416267a5f
[LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the noalloc variant (#99285)
This patch is episode three of the middle end implementation for the
coroutine HALO improvement project published on discourse:
https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044

After we attribute the calls to some coroutines as "coro_elide_safe" in
the C++ FE and creating a `noalloc` ramp function, we use a new middle
end pass to move the call to coroutines to the noalloc variant.

This pass should be run after CoroSplit. For each node we process in
CoroSplit, we look for its callers and replace the attributed ones in
presplit coroutines to the noalloc one. The transformed `noalloc` ramp
function will also require a frame pointer to a block of memory it can
use as an activation frame. We allocate this on the caller's frame with
an alloca.

Please note that we cannot safely transform such attributed calls in
post-split coroutines due to memory lifetime reasons. The CoroSplit pass
is responsible for creating the coroutine frame spills for all the
allocas in the coroutine. Therefore it will be unsafe to create new
allocas like this one in post-split coroutines. This happens relatively
rarely because CGSCC performs the passes on the callees before the
caller. However, if multiple coroutines coexist in one SCC, this
situation does happen (and prevents us from having potentially unbound
frame size due to recursion.)

You can find episode 1: Clang FE of this patch series at
https://github.com/llvm/llvm-project/pull/99282
Episode 2: CoroSplit at https://github.com/llvm/llvm-project/pull/99283
2024-09-08 23:09:40 -07:00
Yuxuan Chen
234cc81625
[LLVM][Coroutines] Create .noalloc variant of switch ABI coroutine ramp functions during CoroSplit (#99283)
This patch is episode two of the coroutine HALO improvement project
published on discourse:
https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044

Previously CoroElide depends on inlining, and its analysis does not work
very well with code generated by the C++ frontend due the existence of
many customization points. There has been issue reported to upstream how
ineffective the original CoroElide was in real world applications.

For C++ users, this set of patches aim to fix this problem by providing
library authors and users deterministic HALO behaviour for some
well-behaved coroutine `Task` types. The stack begins with a library
side attribute on the `Task` class that guarantees no unstructured
concurrency when coroutines are awaited directly with `co_await`ed as a
prvalue. This attribute on Task types gives us lifetime guarantees and
makes C++ FE capable to telling the ME which coroutine calls are
elidable. We convey such information from FE through the attribute
`coro_elide_safe`.

This patch modifies CoroSplit to create a variant of the coroutine ramp
function that 1) does not use heap allocated frame, instead take an
additional parameter as the pointer to the frame. Such parameter is
attributed with `dereferenceble` and `align` to convey size and align
requirements for the frame. 2) always stores cleanup instead of destroy
address for `coro.destroy()` actions.

In a later patch, we will have a new pass that runs right after
CoroSplit to find usages of the callee coroutine attributed
`coro_elide_safe` in presplit coroutine callers, allocates the frame on
its "stack", transform those usages to call the `noalloc` ramp function
variant.

(note I put quotes on the word "stack" here, because for presplit
coroutine, any alloca will be spilled into the frame when it's being
split)

The C++ Frontend attribute implementation that works with this change
can be found at https://github.com/llvm/llvm-project/pull/99282
The pass that makes use of the new `noalloc` split can be found at
https://github.com/llvm/llvm-project/pull/99285
2024-09-08 23:09:20 -07:00
Chuanqi Xu
07514fa9b6 [Coroutines] Salvage the debug information for coroutine frames within optimizations
This patch tries to salvage the debug information for the coroutine
frames within optimizations by creating the help alloca varaibles with
optimizations too. We didn't do this when I implement it initially. I
roughtly remember the reason was, we feel the additional help alloca
variable may pessimize the performance, which is almost the most
important thing under optimizations. But now, it looks like the new
inserted help alloca variables can be optimized out by the following
optimizations. So it looks like the time to make it available within
optimizations.

And also, it looks like the following optimizations will convert the
generated dbg.declare instrinsic into dbg.value intrinsic within
optimizations.

In LLVM's test, there is a slightly regression
that a dbg.declare for the promise object failed to be remained after
this change. But it looks like we won't have a chance to see dbg.declare
for the promise object when we split the coroutine as that dbg.declare
will be converted into a dbg.value in early stage.

So everything looks fine.
2024-08-28 17:02:12 +08:00
Tyler Nowicki
51aceb5b30
[llvm/llvm-project][Coroutines] Improve debugging and minor refactoring (#104642)
No Functional Changes

* Fix comments in several places
* Instead of using BB-getName() (in dump methods) use
getBasicBlockLabel. This fixes the poor output of the dumped info that
resulted in missing BB labels.
* Use RPO when dumping SuspendCrossingInfo. Without this the dump order
is determined by the ptr addresses and so is not consistent from run to
run making IR diffs difficult to read.
* Inference -> Interference
* Pull the logic that determines insertion location out of insertSpills
and into getSpillInsertionPt, to differentiate between these two
operations.
* Use Shape getters for CoroId instead of getting it manually.

---------

Co-authored-by: tnowicki <tnowicki.nowicki@amd.com>
2024-08-27 18:55:33 -04:00
Chuanqi Xu
b10303730d
[Coroutines] [NFCI] Don't search the DILocalVariable for __promise when constructing the debug varaible for __coro_frame (#105626)
As the title mentioned, do not search for the DILocalVariable for
__promise when constructing the debug variable for __coro_frame.

This should make sense because the debug variable of `__coro_frame`
shouldn't dependent on the debug variable of `__promise`. And actually,
it is not. Currently, we search the debug variable for `__promise` only
because we want to get the debug location and the debug scope for the
`__promise`. However, we can construct the debug location directly from
the debug scope of the being compiled function. Then it is not necessary
any more to search the `__promise` variable.

And this patch makes the codes to construct the debug variable for
`__coro_frame` to be more stable. Now we will always be able to
construct the debug variable for the coroutine frame no matter if we
found the debug variable for the __promise or not.

This patch is not strictly NFC but it is intended to not affect any end
users.
2024-08-26 09:43:23 +08:00
Dmitri Gribenko
5c7ae42c52 Revert "[Coroutines] [NFCI] Don't search the DILocalVariable for __promise when constructing the debug varaible for __coro_frame"
This reverts commit 08a0dece2b2431db8abe650bb43cba01e781e1ce.

This series of commits causes Clang crashes. The reproducer is posted on
08a0dece2b.
2024-08-21 23:50:46 +02:00
Dmitri Gribenko
dc12ccd13f Revert "[Coroutines] Fix -Wunused-variable in CoroFrame.cpp (NFC)"
This reverts commit d48b807aa8abd1cbfe8ac5d1ba27b8b3617fc5e6.

This series of commits causes Clang crashes. The reproducer is posted on
08a0dece2b
2024-08-21 23:50:19 +02:00
Dmitri Gribenko
f709cd5add Revert "[Coroutines] Salvage the debug information for coroutine frames within optimizations"
This reverts commit 522c253f47ea27d8eeb759e06f8749092b1de71e.

This series of commits causes Clang crashes. The reproducer is posted on
08a0dece2b.
2024-08-21 23:49:45 +02:00
Chuanqi Xu
522c253f47 [Coroutines] Salvage the debug information for coroutine frames within optimizations
This patch tries to salvage the debug information for the coroutine
frames within optimizations by creating the help alloca varaibles with
optimizations too. We didn't do this when I implement it initially. I
roughtly remember the reason was, we feel the additional help alloca
variable may pessimize the performance, which is almost the most
important thing under optimizations. But now, it looks like the new
inserted help alloca variables can be optimized out by the following
optimizations. So it looks like the time to make it available within
optimizations.

And also, it looks like the following optimizations will convert the
generated dbg.declare instrinsic into dbg.value intrinsic within
optimizations.

In LLVM's test, there is a slightly regression
that a dbg.declare for the promise object failed to be remained after
this change. But it looks like we won't have a chance to see dbg.declare
for the promise object when we split the coroutine as that dbg.declare
will be converted into a dbg.value in early stage.

So everything looks fine.
2024-08-20 17:21:43 +08:00
Jie Fu
d48b807aa8 [Coroutines] Fix -Wunused-variable in CoroFrame.cpp (NFC)
/llvm-project/llvm/lib/Transforms/Coroutines/CoroFrame.cpp:1124:15:
error: unused variable 'PromiseAlloca' [-Werror,-Wunused-variable]
  AllocaInst *PromiseAlloca = Shape.getPromiseAlloca();
              ^
1 error generated.
2024-08-20 15:38:56 +08:00
Chuanqi Xu
08a0dece2b [Coroutines] [NFCI] Don't search the DILocalVariable for __promise when constructing the debug varaible for __coro_frame
As the title mentioned, do not search for the DILocalVariable for
__promise when constructing the debug variable for __coro_frame.

This should make sense because the debug variable of `__coro_frame`
shouldn't dependent on the debug variable of `__promise`. And actually,
it is not. Currently, we search the debug variable for `__promise` only
because we want to get the debug location and the debug scope for the
`__promise`. However, we can construct the debug location directly from
the debug scope of the being compiled function. Then it is not necessary
any more to search the `__promise` variable.

And this patch makes the codes to construct the debug variable for
`__coro_frame` to be more stable. Now we will always be able to
construct the debug variable for the coroutine frame no matter if we
found the debug variable for the __promise or not.

This patch is not strictly NFC but it is intended to not affect any end
users.
2024-08-20 15:24:35 +08:00
Kazu Hirata
b7146aed5b
[Transforms] Construct SmallVector with ArrayRef (NFC) (#101851) 2024-08-03 15:33:08 -07:00
Wei Wang
bee2654300
[Asan] Skip pre-split coroutine and noop coroutine frame (#99415)
CoroSplit expects the second parameter of `llvm.coro.id` to be the
promise alloca. Applying Asan on a pre-split coroutine breaks this
assumption and causes split to fail. This should be NFC because asan
pass happens late in the pipeline where all coroutines are split. This
is to prevent crash in case the order of passes are switched.

Also `NoopCoro.Frame.Const` is a special coroutine frame that does
nothing when resumed or destroyed. There is no point to do
instrumentation on it.
2024-07-22 10:53:40 -07:00
Yuxuan Chen
4f1de83d53
[Clang][NFC][Coroutine] Do not leave dangling pointers after removing CoroBegin (#98546) 2024-07-11 19:46:35 -07:00
Yuxuan Chen
0483f14b00
[NFC][Coroutines] Remove redundant checks for replacing PrepareFns (#98392)
If `Coroutines.empty()` the following loop is going to be skipped
entirely and same goes for `PrepareFns.empty()`. These two conditions
here aren't useful and adds to complexity.
2024-07-10 22:22:45 -07:00
Nikita Popov
74deadf196
[IRBuilder] Don't include Module.h (NFC) (#97159)
This used to be necessary to fetch the DataLayout, but isn't anymore.
2024-06-29 15:05:04 +02:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Nikita Popov
49ae2dcf36
[PassManager] Remove some unnecessary includes (NFC) (#96175)
SmallPtrSet.h and TimeProfiler.h are unused. CommandLine.h is only
needed for the UseNewDbgInfoFormat declare, which can be moved to the
places that need it.
2024-06-20 17:41:35 +02:00
Ruiling, Song
f0b57b60e3
[Coroutines] Remove one construction of DominatorTree (#93507)
The DominatorTree can be reused if no CFG changes.
2024-05-29 08:58:19 +08:00
Billy Zhu
910292c3ac
[LLVM][Coroutines] Check variable decl scope instead of optimization level for hoisted DbgDeclare Loc (#92978)
Minor patch following up on
https://github.com/llvm/llvm-project/pull/75402.

The more generalized version of [this
error](https://github.com/llvm/llvm-project/pull/75104#issuecomment-1853497609)
is whenever we have a debug variable created in one subprogram scope
inlined into another subprogram scope. So instead of checking
optimization level, it is safer to just check whether the subprogram
scope of the variable matches the subprogram scope of the hoisted
position.
2024-05-23 10:01:37 -07:00
Chuanqi Xu
44430de0f1
[Coroutines] Always set the calling convention of generated resuming call from 'llvm.coro.await.suspend.handle' as fast (#93167)
See the post commit message in
https://github.com/llvm/llvm-project/pull/89751

We met a regression due to a change of calling convention of this patch.
Previously, the calling convention of indirect resume calls is always
fast.

And in this patch, although we tried to take care of it in the cloner,
we forget the case that we have to update the resuming calls in the ramp
functions. So this is the root cause of the downstream failure.

This patch tries to mark the generated resuming calls as fast
immediately after they got created to make sure the calling convention
is correct.
2024-05-23 22:32:54 +08:00
Chuanqi Xu
8d4d85de1e Revert "[Coroutines] Always set the calling convention of generated resuming call from 'llvm.coro.await.suspend.handle' as fast"
This reverts commit 31f1590e4fb324c43dc36199587c453e27b6f6fa.

It looks like some bots are not happy about the FileChecks
2024-05-23 18:31:55 +08:00
Chuanqi Xu
31f1590e4f [Coroutines] Always set the calling convention of generated resuming call from 'llvm.coro.await.suspend.handle' as fast
See the post commit message in
https://github.com/llvm/llvm-project/pull/89751

We met a regression due to a change of calling convention of this patch.
Previously, the calling convention of indirect resume calls is always
fast.

And in this patch, although we tried to take care of it in the cloner,
we forget the case that we have to update the resuming calls in the
ramp functions. So this is the root cause of the downstream failure.

This patch tries to mark the generated resuming calls as fast
immediately after they got created to make sure the calling convention
is correct.
2024-05-23 18:12:57 +08:00
Yuxuan Chen
c2e0afe95e
[Coroutines][NFC] Remove @llvm.coro.id.async intrinsics from CoroElide (#92956) 2024-05-21 18:57:45 -07:00
Alan Zhao
0c8bc08868
Reapply "[coro][CoroSplit] Use llvm.lifetime.end to compute putting objects on the frame vs the stack (#90265) (#91372)
This reverts commit 924384161ffceda08099536dd07a953299a69b53.

This reland addresses the performance regressions seen in #90265 by
retaining the original definition of
`isPotentiallyReachableFromMany(...)` instead of reimplementing it with
`isManyPotentiallyReachableFromMany(...)`.

Fixes #86580
2024-05-21 09:48:03 -07:00
Jie Fu
7c8176ebd3 [Coroutines] Remove unused function (NFC)
llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:1223:1:
error: unused function 'scanPHIsAndUpdateValueMap' [-Werror,-Wunused-function]
scanPHIsAndUpdateValueMap(Instruction *Prev, BasicBlock *NewBlock,
^
1 error generated.
2024-05-15 22:08:17 +08:00
Hans
3bb39690d7
[coro] Lower llvm.coro.await.suspend.handle to resume with tail call (#89751)
The C++ standard requires that symmetric transfer from one coroutine to
another is performed via a tail call. Failure to do so is a miscompile
and often breaks programs by quickly overflowing the stack.

Until now, the coro split pass tried to ensure this in the
`addMustTailToCoroResumes()` function by searching for
`llvm.coro.resume` calls to lower as tail calls if the conditions were
right: the right function arguments, attributes, calling convention
etc., and if a `ret void` was sure to be reached after traversal with
some ad-hoc constant folding following the call.

This was brittle, as the kind of implicit variants required for a tail
call to happen could easily be broken by other passes (e.g. if some
instruction got in between the `resume` and `ret`), see for example
9d1cb18d19862fc0627e4a56e1e491a498e84c71 and
284da049f5feb62b40f5abc41dda7895e3d81d72.

Also the logic seemed backwards: instead of searching for possible tail
call candidates and doing them if the circumstances are right, it seems
better to start with the intention of making the tail calls we need, and
forcing the circumstances to be right.

Now that we have the `llvm.coro.await.suspend.handle` intrinsic (since
f78688134026686288a8d310b493d9327753a022) which corresponds exactly to
symmetric transfer, change the lowering of that to also include the
`resume` part, always lowered as a tail call.
2024-05-15 15:29:08 +02:00
Yuxuan Chen
87235fa9af
[NFC] CoroElide: Refactor Lowerer into CoroIdElider (#91539)
This patch contains no functional changes. 

The main goal of this patch is to get better clarity out of the code, to
make intentions and assumptions clear.

One major design problem I had in the past were `Lowerer`. It previously
inherited from `coro::LowererBase` but it doesn't use any of the fields
or methods from `LowererBase`. It might be an artifact leftover from
previous designs of this code.

Furthermore, we should clarify that although one such instance is bound
to the function, `Lowerer` was dedicated to one `CoroId` instruction at
a time. We rely on a sequence of fragile constructs like
`CoroBegins.clear(); DestroyAddr.clear()`. This doesn't help understand
the code.

What's worse is that we have confusing calls like
`elideHeapAllocations(CoroId->getFunction(), ...` and it might get
confused with `CoroId->getCoroutine()`.

The new structure intends to make it clear that we always operate on one
`CoroId` at a time, which may have multiple `CoroBegin`s. Such structure
doesn't rely on frequent `.clear()` that's prone to miss.
2024-05-09 08:21:40 -07:00
Nikita Popov
924384161f Revert "[coro][CoroSplit] Use llvm.lifetime.end to compute putting objects on the frame vs the stack (#90265)"
This reverts commit fcf341d3ddfe2289ac88aa3c122b25df8732cc8e.

Causes major compile-time regressions when not using coroutines.
2024-05-07 09:41:25 +09:00
Alan Zhao
fcf341d3dd
[coro][CoroSplit] Use llvm.lifetime.end to compute putting objects on the frame vs the stack (#90265)
The current logic for using lifetime intrinsics to determine whether a
coroutine alloca should live on the coroutine frame or stack doesn't
consider `llvm.lifetime.end`. As a result, some allocas are incorrectly
placed on the stack even though their lifetimes may outlive the stack.
For example, SimplifyCFG may generate code that drops the corresponding
`llvm.lifetime.end` of an `llvm.lifetime.start`, and that code is
incorrectly handled by the existing logic.

To fix this, new logic is introduced where if an alloca's address is
escaped, and there is a path from an `llvm.lifetime.start` to a
coroutine suspend point (e.g. `llvm.coro.suspend`) without an
`llvm.lifetime.end`, then we know the object lives beyond the suspension
point and therefore must go on the coroutine frame.

Fixes https://github.com/llvm/llvm-project/issues/86580
2024-05-06 13:43:52 -07:00
Kazu Hirata
677dddebae
[Transforms] Use StringRef::operator== instead of StringRef::equals (NFC) (#91072)
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.

- StringRef::operator==/!= outnumber StringRef::equals by a factor of
  31 under llvm/ in terms of their usage.

- The elimination of StringRef::equals brings StringRef closer to
  std::string_view, which has operator== but not equals.

- S == "foo" is more readable than S.equals("foo"), especially for
  !Long.Expression.equals("str") vs Long.Expression != "str".
2024-05-04 12:33:12 -07:00