llvm-project

Author	SHA1	Message	Date
Yuxuan Chen	e1b40dc063	[Clang] Propagate elide safe context through [[clang::coro_await_elidable_argument]] (#108474 )	2024-09-17 22:58:21 -07:00
Yuxuan Chen	e17a39bc31	[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282) This patch is the frontend implementation of the coroutine elide improvement project detailed in this discourse post: https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044 This patch proposes a C++ struct/class attribute `[[clang::coro_await_elidable]]`. This notion of await elidable task gives developers and library authors certainty that coroutine heap elision happens in a predictable way. Originally, after we lower a coroutine to LLVM IR, CoroElide is responsible for analysis of whether an elision can happen. Take this as an example: ``` Task foo(); Task bar() { co_await foo(); } ``` For CoroElide to happen, the ramp function of `foo` must be inlined into `bar`. This inlining happens after `foo` has been split but `bar` is usually still a presplit coroutine. If `foo` is indeed a coroutine, the inlined `coro.id` intrinsic of `foo` is visible within `bar`. CoroElide then runs an analysis to figure out whether the SSA value of `coro.begin()` of `foo` gets destroyed before `bar` terminates. `Task` types are rarely simple enough for the destroy logic of the task to reference the SSA value from `coro.begin()` directly. Hence, the pass is very ineffective for even the most trivial C++ Task types. Improving CoroElide by implementing more powerful analyses is possible, but it does not give us predictability about when elision will happen. The approach we want to take with this language extension generally originates from the philosophy that library implementations of `Task` types have control over the structured concurrency guarantees we demand for elision to happen. That is, the lifetime of the callee's frame is shorter than that of the caller. ``[[clang::coro_await_elidable]]`` is a class attribute which can be applied to a coroutine return type.
When a coroutine function that returns such a type calls another coroutine function, the compiler performs heap allocation elision when the following conditions are all met: - The callee coroutine function returns a type that is annotated with ``[[clang::coro_await_elidable]]``. - In the caller coroutine, the return value of the callee is a prvalue that is immediately `co_await`ed. From the C++ perspective, it makes sense because we can ensure the lifetime of the elided callee cannot exceed that of the caller if we can guarantee that the caller coroutine is never destroyed earlier than the callee coroutine. This is not true of C++ programs in general. However, the library that implements `Task` types and executors may provide this guarantee to the compiler, providing the user with certainty that HALO will work on their programs. After this patch, when compiling coroutines that return a type with such an attribute, the frontend checks the type of the operand of each `co_await` expression (not of `operator co_await`). If that type is also attributed with `[[clang::coro_await_elidable]]`, the FE emits metadata on the call or invoke instruction as a hint for a later middle end pass to perform the elision. The original patch version is https://github.com/llvm/llvm-project/pull/94693 and as suggested, the patch is split into frontend and middle end solutions in stacked PRs. The middle end CoroSplit patch can be found at https://github.com/llvm/llvm-project/pull/99283 The middle end transformation that performs the elide can be found at https://github.com/llvm/llvm-project/pull/99285	2024-09-08 23:08:58 -07:00
Chuanqi Xu	07514fa9b6	[Coroutines] Salvage the debug information for coroutine frames within optimizations This patch tries to salvage the debug information for coroutine frames in optimized builds by creating the helper alloca variables when optimizations are enabled too. We didn't do this when I implemented it initially. As I roughly remember, the reason was that we felt the additional helper alloca variables might pessimize performance, which is almost the most important thing in optimized builds. But it now looks like the newly inserted helper alloca variables can be optimized out by later optimizations, so it seems time to enable this in optimized builds. It also looks like later optimizations will convert the generated dbg.declare intrinsic into a dbg.value intrinsic. In LLVM's tests, there is a slight regression: a dbg.declare for the promise object is no longer preserved after this change. But it looks like we never get a chance to see a dbg.declare for the promise object when we split the coroutine, as that dbg.declare will already have been converted into a dbg.value at an early stage. So everything looks fine.	2024-08-28 17:02:12 +08:00
Dmitri Gribenko	f709cd5add	Revert "[Coroutines] Salvage the debug information for coroutine frames within optimizations" This reverts commit 522c253f47ea27d8eeb759e06f8749092b1de71e. This series of commits causes Clang crashes. The reproducer is posted on `08a0dece2b`.	2024-08-21 23:49:45 +02:00
Chuanqi Xu	522c253f47	[Coroutines] Salvage the debug information for coroutine frames within optimizations This patch tries to salvage the debug information for coroutine frames in optimized builds by creating the helper alloca variables when optimizations are enabled too. We didn't do this when I implemented it initially. As I roughly remember, the reason was that we felt the additional helper alloca variables might pessimize performance, which is almost the most important thing in optimized builds. But it now looks like the newly inserted helper alloca variables can be optimized out by later optimizations, so it seems time to enable this in optimized builds. It also looks like later optimizations will convert the generated dbg.declare intrinsic into a dbg.value intrinsic. In LLVM's tests, there is a slight regression: a dbg.declare for the promise object is no longer preserved after this change. But it looks like we never get a chance to see a dbg.declare for the promise object when we split the coroutine, as that dbg.declare will already have been converted into a dbg.value at an early stage. So everything looks fine.	2024-08-20 17:21:43 +08:00
Hari Limaye	94473f4db6	[IRBuilder] Generate nuw GEPs for struct member accesses (#99538 ) Generate nuw GEPs for struct member accesses, as inbounds + non-negative implies nuw. Regression tests are updated using update scripts where possible, and by find + replace where not.	2024-08-09 13:25:04 +01:00
Dmitri Gribenko	533a22941e	[clang][test] Write temporary files to %t The issue was introduced in `3a9ef4e69a`.	2024-07-30 10:26:31 +02:00
Wei Wang	3a9ef4e69a	[Pipelines] Do not run CoroSplit and CoroCleanup in LTO pre-link pipeline (#100205 ) This is re-land of #90310 after making asan skip pre-split coroutines in #99415. Skip CoroSplit and CoroCleanup in LTO pre-link pipeline so that CoroElide can happen after callee coroutine is imported into caller's module in ThinLTO.	2024-07-29 17:42:01 -07:00
Nikita Popov	12d24e0c56	[CodeGen] Simplify codegen for array initialization (#93956 ) This makes codegen for array initialization simpler in two ways: 1. Drop the zero-index GEP at the start, which is no longer needed with opaque pointers. 2. Emit GEPs directly to the correct element, instead of having a long chain of +1 GEPs. This is more canonical, and also avoids regressions in unoptimized builds from #93823.	2024-06-10 09:19:55 +02:00
Pengcheng Wang	130e93cc26	Reland "[clang] Enable sized deallocation by default in C++14 onwards" (#90373 ) Since C++14 has been released for about nine years and most standard libraries have implemented sized deallocation functions, it's time to make this feature default again. This is another try of https://reviews.llvm.org/D112921. The original commit cf5a8b4 was reverted by 2e5035a due to some failures (see #83774). Fixes #60061	2024-05-22 12:37:27 +08:00
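As an illustration of what the sized-deallocation change above affects, the following sketch replaces both global `operator delete` overloads and records which one a plain `delete` expression reaches. The names `Widget`, `DeleteCounts`, `g_sink`, and `delete_one_widget` are hypothetical and not part of the patch. Which overload is selected depends on the compiler, its version, and flags such as `-fsized-deallocation`, so the sketch only reports the result and does not assume it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Counters recording which global deallocation overload was reached.
static int g_sized_calls = 0;
static int g_unsized_calls = 0;

// Replacement global allocation function backed by malloc.
void* operator new(std::size_t size) {
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

// Unsized form: the only form a compiler uses when sized deallocation is off.
void operator delete(void* p) noexcept {
    ++g_unsized_calls;
    std::free(p);
}

// Sized form (C++14): used for complete types when sized deallocation is on.
void operator delete(void* p, std::size_t) noexcept {
    ++g_sized_calls;
    std::free(p);
}

// Hypothetical payload type; any complete, non-over-aligned type would do.
struct Widget { int payload[4]; };

// Volatile sink so the allocation escapes and cannot be optimized away.
static Widget* volatile g_sink = nullptr;

struct DeleteCounts { int sized; int unsized; };

// Allocates and deletes one Widget and reports which overload(s) ran.
DeleteCounts delete_one_widget() {
    g_sink = new Widget();
    Widget* p = g_sink;
    assert(p != nullptr);
    const int sized_before = g_sized_calls;
    const int unsized_before = g_unsized_calls;
    delete p;
    return {g_sized_calls - sized_before, g_unsized_calls - unsized_before};
}
```

Calling `delete_one_widget()` should report exactly one deallocation call in total. Comparing the `sized` and `unsized` fields across compiler versions or flags is one way to observe the default this commit changes.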
Hans	3bb39690d7	[coro] Lower `llvm.coro.await.suspend.handle` to resume with tail call (#89751 ) The C++ standard requires that symmetric transfer from one coroutine to another is performed via a tail call. Failure to do so is a miscompile and often breaks programs by quickly overflowing the stack. Until now, the coro split pass tried to ensure this in the `addMustTailToCoroResumes()` function by searching for `llvm.coro.resume` calls to lower as tail calls if the conditions were right: the right function arguments, attributes, calling convention etc., and if a `ret void` was sure to be reached after traversal with some ad-hoc constant folding following the call. This was brittle, as the kind of implicit variants required for a tail call to happen could easily be broken by other passes (e.g. if some instruction got in between the `resume` and `ret`), see for example 9d1cb18d19862fc0627e4a56e1e491a498e84c71 and 284da049f5feb62b40f5abc41dda7895e3d81d72. Also the logic seemed backwards: instead of searching for possible tail call candidates and doing them if the circumstances are right, it seems better to start with the intention of making the tail calls we need, and forcing the circumstances to be right. Now that we have the `llvm.coro.await.suspend.handle` intrinsic (since f78688134026686288a8d310b493d9327753a022) which corresponds exactly to symmetric transfer, change the lowering of that to also include the `resume` part, always lowered as a tail call.	2024-05-15 15:29:08 +02:00
Reid Kleckner	aa0776de46	Revert "[Pipelines] Do not run CoroSplit and CoroCleanup in LTO pre-link pipeline (#90310 )" and related patches This change is incorrect when thinlto and asan are enabled, and this can be observed by adding `-fsanitize=address` to the provided coro-elide-thinlto.cpp test. It results in the error "Coroutines cannot handle non static allocas yet", and ASan introduces a dynamic alloca. In other words, we must preserve the invariant that CoroSplit runs before ASan. If we move CoroSplit to the post post-link compile stage, ASan has to be moved to the post-link compile stage first. It would also be correct to make CoroSplit handle dynamic allocas so the pass ordering doesn't matter, but sanitizer instrumentation really ought to be last, after coroutine splitting. This reverts commit bafc5f42c0132171287d7cba7f5c14459be1f7b7. This reverts commit b1b1bfa7bea0ce489b5ea9134e17a43c695df5ec. This reverts commit 0232b77e145577ab78e3ed1fdbb7eacc5a7381ab. This reverts commit fb2d3056618e3d03ba9a695627c7b002458e59f0. This reverts commit 1cb33713910501c6352d0eb2a15b7a15e6e18695. This reverts commit cd68d7b3c0ebf6da5e235cfabd5e6381737eb7fe.	2024-05-10 21:28:13 +00:00
Paul T Robinson	3ceacd8b95	[Coro] Relax a debug-info test (#91401 ) Debug-info metadata does not have a strictly defined order. Check that elements are linked to each other correctly, not that metadata appears in a particular order.	2024-05-08 06:37:24 -07:00
Fangrui Song	7c1d9b15ee	[test] %clang_cc1: remove redundant actions	2024-05-04 23:08:11 -07:00
Fangrui Song	0d501f38f3	[test] %clang_cc1 -emit-llvm: remove redundant -S Also replace aarch64-none-linux-gnu (none can indicate an OS as well) with aarch64	2024-05-04 17:15:51 -07:00
Fangrui Song	c5de4dd1ea	[test] %clang_cc1 -emit-llvm: remove redundant -S And replace -emit-llvm -o - with -emit-llvm-only	2024-05-04 17:00:29 -07:00
Wei Wang	b1b1bfa7be	[Coroutines][Test] Only run coro-elide-thinlto under x86_64-linux (#90672 ) Previous fix #90549 didn't completely address the Buildbot failures. Some target may not recognize the target triple. This time, only run the test under x86_64-linux.	2024-04-30 18:08:40 -07:00
Wei Wang	0232b77e14	[Coroutines][Test] Specify target triple in coro-elide-thinlto (#90549 ) Resolve test failure on non-x86 linux host	2024-04-30 14:31:31 -07:00
Danial Klimkin	fb2d305661	Fix output in coro-elide-thinlto.cpp (#90579 ) Current dir can be read-only. Use a temp path instead.	2024-04-30 11:42:13 +02:00
David Blaikie	1cb3371391	Ensure test writes objects to test temp dir	2024-04-29 23:50:18 +00:00
Wei Wang	cd68d7b3c0	[Pipelines] Do not run CoroSplit and CoroCleanup in LTO pre-link pipeline (#90310 ) Skip CoroSplit and CoroCleanup in LTO pre-link pipeline so that CoroElide can happen after callee coroutine is imported into caller's module in ThinLTO.	2024-04-29 10:24:53 -07:00
Utkarsh Saxena	d72146f471	Re-apply "Emit missing cleanups for stmt-expr" and other commits (#89154 ) Latest diff: `f1ab4c2677..adf9bc902b` We address two additional bugs here: ### Problem 1: Deactivated normal cleanup still runs, leading to double-free Consider the following: ```cpp struct A { }; struct B { B(const A&); }; struct S { A a; B b; }; int AcceptS(S s); void Accept2(int x, int y); void Test() { Accept2(AcceptS({.a = A{}, .b = A{}}), ({ return; 0; })); } ``` We add cleanups as follows: 1. push dtor for field `S::a` 2. push dtor for temp `A{}` (used by ` B(const A&)` in `.b = A{}`) 3. push dtor for field `S::b` 4. Deactivate 3 `S::b`-> This pops the cleanup. 5. Deactivate 1 `S::a` -> Does not pop the cleanup as 2 is top. Should create _active flag_!! 6. push dtor for `~S()`. 7. ... It is important to deactivate 5 using active flags. Without the active flags, the `return` will fallthrough it and would run both `~S()` and dtor `S::a` leading to double free of `~A()`. In this patch, we unconditionally emit active flags while deactivating normal cleanups. These flags are deleted later by the `AllocaTracker` if the cleanup is not emitted. ### Problem 2: Missing cleanup for conditional lifetime extension We push 2 cleanups for lifetime-extended cleanup. The first cleanup is useful if we exit from the middle of the expression (stmt-expr/coro suspensions). This is deactivated after full-expr, and a new cleanup is pushed, extending the lifetime of the temporaries (to the scope of the reference being initialized). If this lifetime extension happens to be conditional, then we use active flags to remember whether the branch was taken and if the object was initialized. Previously, we used a single active flag, which was used by both cleanups. This is wrong because the first cleanup will be forced to deactivate after the full-expr and therefore this active flag will always be inactive. 
The dtor for the lifetime extended entity would not run as it always sees an inactive flag. In this patch, we solve this using two separate active flags for both cleanups. Both of them are activated if the conditional branch is taken, but only one of them is deactivated after the full-expr. --- Fixes https://github.com/llvm/llvm-project/issues/63818 Fixes https://github.com/llvm/llvm-project/issues/88478 --- Previous PR logs: 1. https://github.com/llvm/llvm-project/pull/85398 2. https://github.com/llvm/llvm-project/pull/88670 3. https://github.com/llvm/llvm-project/pull/88751 4. https://github.com/llvm/llvm-project/pull/88884	2024-04-29 12:33:46 +02:00
Vitaly Buka	2e5035aeed	Revert "[clang] Enable sized deallocation by default in C++14 onwards (#83774 )" (#90299 ) https://lab.llvm.org/buildbot/#/builders/168/builds/20063 (should be fixed with #90292) More details in #83774 This reverts commit cf5a8b489464d09dfdd7a48ce7c8b41d3c9bf819.	2024-04-26 17:14:43 -07:00
Pengcheng Wang	cf5a8b4894	[clang] Enable sized deallocation by default in C++14 onwards (#83774 ) Since C++14 has been released for about nine years and most standard libraries have implemented sized deallocation functions, it's time to make this feature default again. This is another try of https://reviews.llvm.org/D112921. Fixes #60061	2024-04-26 16:59:12 +08:00
Haojian Wu	dc8f6a8cda	[clang] coroutine: generate valid mangled name in CodeGenFunction::generateAwaitSuspendWrapper (#89731 ) Fixes https://github.com/llvm/llvm-project/issues/89723	2024-04-23 21:09:36 +02:00
Utkarsh Saxena	9d8be24087	Revert "[codegen] Emit missing cleanups for stmt-expr and coro suspensions" and related commits (#88884 ) The original change caused widespread breakages in msan/ubsan tests and causes `use-after-free`. Most likely we are adding more cleanups than necessary.	2024-04-16 15:30:32 +02:00
Utkarsh Saxena	89ba7e183e	[codegen] Emit missing cleanups for stmt-expr and coro suspensions [take-2] (#85398 ) Fixes https://github.com/llvm/llvm-project/issues/63818 for control flow out of an expressions. #### Background A control flow could happen in the middle of an expression due to stmt-expr and coroutine suspensions. Due to branch-in-expr, we missed running cleanups for the temporaries constructed in the expression before the branch. Previously, these cleanups were only added as `EHCleanup` during the expression and as normal expression after the full expression. Examples of such deferred cleanups include: `ParenList/InitList`: Cleanups for fields are performed by the destructor of the object being constructed. `Array init`: Cleanup for elements of an array is included in the array cleanup. `Lifetime-extended temporaries`: reference-binding temporaries in braced-init are lifetime extended to the parent scope. `Lambda capture init`: init in the lambda capture list is destroyed by the lambda object. --- #### In this PR In this PR, we change some of the `EHCleanups` cleanups to `NormalAndEHCleanups` to make sure these are emitted when we see a branch inside an expression (through statement expressions or coroutine suspensions). These are supposed to be deactivated after full expression and destroyed later as part of the destructor of the aggregate or array being constructed. To simplify deactivating cleanups, we add two utilities as well: * `DeferredDeactivationCleanupStack`: A stack to remember cleanups with deferred deactivation. * `CleanupDeactivationScope`: RAII for deactivating cleanups added to the above stack. --- #### Deactivating normal cleanups These were previously `EHCleanups` and not `Normal` and deactivation of required `Normal` cleanups had some bugs. 
These specifically include deactivating `Normal` cleanups which are not the top of `EHStack` [source1](`92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L1319)`), [2](`92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L722-L746)`). This has not been part of our test suite (maybe it was never required before statement expressions). In this PR, we also fix the emission of required-deactivated-normal cleanups.	2024-04-10 12:59:24 +02:00
Hans	9d1cb18d19	[Coroutines] Ignore instructions more aggressively in addMustTailToCoroResumes() (#85271 ) The old code used isInstructionTriviallyDead() and removed instructions when walking the path from a resume call to function return to check if the call is in tail position. However, since the code was walking forwards it was not able to get past instructions such as: %gep = getelementptr inbounds i64, ptr %alloc.var, i32 0 %foo = ptrtoint ptr %gep to i64 This patch instead ignores such instructions as long as their values are not needed. This enables the code to emit tail calls in more situations.	2024-03-20 14:51:45 +01:00
fpasserby	f786881340	[coroutine] Implement llvm.coro.await.suspend intrinsic (#79712) Implement `llvm.coro.await.suspend` intrinsics, to deal with the performance regression after prohibiting `.await_suspend` inlining, as suggested in #64945. Actually, there are three new intrinsics, which directly correspond to each of the three forms of `await_suspend`: ``` void llvm.coro.await.suspend.void(ptr %awaiter, ptr %frame, ptr @wrapperFunction) i1 llvm.coro.await.suspend.bool(ptr %awaiter, ptr %frame, ptr @wrapperFunction) ptr llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction) ``` There are three different versions instead of one, because in the `bool` case its result is used for resuming via a branch, and in the `coroutine_handle` case exceptions from `await_suspend` are handled in the coroutine, and exceptions from the subsequent `.resume()` are propagated to the caller. The await-suspend block is simplified down to intrinsic calls only, for example for symmetric transfer: ``` %id = call token @llvm.coro.save(ptr null) %handle = call ptr @llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction) call void @llvm.coro.resume(ptr %handle) %result = call i8 @llvm.coro.suspend(token %id, i1 false) switch i8 %result, ... ``` All await-suspend logic is moved out into a wrapper function, generated for each suspension point. The signature of the function is `<type> wrapperFunction(ptr %awaiter, ptr %frame)` where `<type>` is one of `void`, `i1` or `ptr`, depending on the return type of `await_suspend`. Intrinsic calls are lowered during the `CoroSplit` pass, right after the split. Because I'm new to LLVM, I'm not sure if the helper function generation, the calls to them and the lowering are implemented in the right way, especially with regard to various metadata and attributes, i.e. for TBAA. All things that seemed questionable are marked with `FIXME` comments.
There is another detail: in the case of symmetric transfer, a raw pointer to the frame of the coroutine that should be resumed is returned from the helper function, and a direct call to `@llvm.coro.resume` is generated. The C++ standard demands that the `.resume()` method be evaluated. I am not sure how important this is, because code was generated in the same way before, sans helper function.	2024-03-11 10:00:00 +08:00
Nikita Popov	158d72d728	[Clang] Set writable and dead_on_unwind attributes on sret arguments (#77116 ) Set the writable and dead_on_unwind attributes for sret arguments. These indicate that the argument points to writable memory (and it's legal to introduce spurious writes to it on entry to the function) and that the argument memory will not be used if the call unwinds. This enables additional MemCpyOpt/DSE/LICM optimizations.	2024-01-11 09:46:54 +01:00
Yuxuan Chen	4a294b5806	[Clang] CGCoroutines skip emitting try block for value returning `noexcept` init `await_resume` calls (#73160 ) Previously we were not properly skipping the generation of the `try { }` block around the `init_suspend.await_resume()` if the `await_resume` is not returning void. The reason being that the resume expression was wrapped in a `CXXBindTemporaryExpr` and the first dyn_cast failed, silently ignoring the noexcept. This only mattered for `init_suspend` because it had its own try block. This patch changes to first extract the sub expression when we see a `CXXBindTemporaryExpr`. Then perform the same logic to check for `noexcept`. Another version of this patch also wanted to assert the second step by `cast<CXXMemberCallExpr>` and as far as I understand it should be a valid assumption. I can change to that if upstream prefers.	2023-11-28 19:04:29 -08:00
Yuxuan Chen	1fad78b123	[Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (#73073 ) This change aims to fix an ICE in issue https://github.com/llvm/llvm-project/issues/63803 The crash happens in `ExitCXXTryStmt` because `EmitAnyExpr()` adds additional cleanup to the `EHScopeStack`. This messes up the assumption in `ExitCXXTryStmt` that the top of the stack should be a `EHCatchScope`. However, since we never read a value returned from `await_resume()` of an init suspend, we can skip the part that builds this `RValue`. The code here may not be in the best shape. There's another bug that `memberCallExpressionCanThrow` doesn't work on the current Expr due to type mismatch. I am preparing a separate PR to address it plus some refactoring might be beneficial.	2023-11-21 21:21:27 -08:00
Chuanqi Xu	b7b5907b56	[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014) Close https://github.com/llvm/llvm-project/issues/56980. This patch introduces a light-weight optimization attribute for coroutines which are guaranteed to be destroyed only after they have reached the final suspend point. The rationale behind the patch is simple. See the example: ```C++ A foo() { dtor d; co_await something(); dtor d1; co_await something(); dtor d2; co_return 43; } ``` Generally the generated .destroy function may be: ```C++ void foo.destroy(foo.Frame frame) { switch(frame->suspend_index()) { case 1: frame->d.~dtor(); break; case 2: frame->d.~dtor(); frame->d1.~dtor(); break; case 3: frame->d.~dtor(); frame->d1.~dtor(); frame->d2.~dtor(); break; default: // coroutine completed or hasn't started break; } frame->promise.~promise_type(); delete frame; } ``` This is because the compiler needs to be ready for every state in which the coroutine may validly be destroyed. However, from the user's perspective, we may know that certain coroutine types are only ever destroyed after they have reached the final suspend point. We need a way to tell the compiler about this, which is what this patch provides. Once the compiler knows that the coroutine can only be destroyed after completion, it can optimize the above example to: ```C++ void foo.destroy(foo.Frame frame) { frame->promise.~promise_type(); delete frame; } ``` I spent a lot of time experimenting with this downstream. The numbers are really good. In a real-world coroutine-heavy workload, the size of the build dir (including .o files) shrinks by 14%. And the size of the final libraries (excluding the .o files) shrinks by 8% in Debug mode and 1% in Release mode.	2023-11-09 14:42:07 +08:00
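The reasoning in the commit above can be checked with a small hand-written model. This is only a sketch under stated assumptions: `SuspendPoint`, `destroy_general`, and `destroy_when_complete_only` are hypothetical names, the strings stand in for destructor calls, and none of it is the code Clang generates. It models a coroutine like `foo` and shows that the simplified destroy path agrees with the general one whenever destruction happens at the final suspend point.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the points at which a coroutine like `foo` can be
// suspended when someone calls destroy() on it.
enum class SuspendPoint { Initial, FirstAwait, SecondAwait, Final };

// General destroy path: it must handle every suspend point, because the
// locals that are still alive differ at each one. Returns the destructors
// that would run, in order (locals die in reverse order of construction).
std::string destroy_general(SuspendPoint at) {
    std::string ran;
    switch (at) {
    case SuspendPoint::FirstAwait:
        ran += "d ";      // only `d` has been constructed
        break;
    case SuspendPoint::SecondAwait:
        ran += "d1 d ";   // `d` and `d1` are alive
        break;
    case SuspendPoint::Initial:  // body has not started: no locals yet
    case SuspendPoint::Final:    // body has finished: locals already destroyed
        break;
    }
    ran += "promise";     // the promise is always destroyed last
    return ran;
}

// Destroy path that is valid only under the attribute's guarantee that
// destruction happens after the final suspend point: no dispatch is needed.
std::string destroy_when_complete_only() {
    return "promise";
}
```

Under the attribute's guarantee only the `Final` case is reachable, so the whole `switch` becomes dead code. That is a plausible reading of where the reported size savings come from.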
Fangrui Song	c0a73918bf	[ItaniumCXXABI] Add -fassume-nothrow-exception-dtor to assume that all exception objects' destructors are non-throwing Link: https://lists.llvm.org/pipermail/cfe-dev/2021-August/068740.html ("[Exception Handling] Could we mark __cxa_end_catch as nounwind conditionally?") Link: https://github.com/llvm/llvm-project/issues/57375 A catch handler calls `__cxa_begin_catch` and `__cxa_end_catch`. For a catch-all clause or a catch clause matching a record type, we: * assume that the exception object may have a throwing destructor * emit `invoke void @__cxa_end_catch` (as the call is not marked with the `nounwind` attribute). * emit a landing pad to destroy local variables and call `_Unwind_Resume` ``` struct A { ~A(); }; struct B { int x; }; void opaque(); void foo() { A a; try { opaque(); } catch (...) { } // the exception object has an unknown type and may throw try { opaque(); } catch (B b) { } // B::~B is nothrow, but we do not utilize this } ``` Per C++ [dcl.fct.def.coroutine], a coroutine's function body implies a `catch (...)`. Our code generation pessimizes even simple code, like: ``` UserFacing foo() { A a; opaque(); co_return; // For `invoke void @__cxa_end_catch()`, the landing pad destroys the // promise_type and deletes the coro frame. } ``` Throwing destructors are typically discouraged. In many environments, the destructors of exception objects are guaranteed to never throw, making our conservative code generation approach seem wasteful. Furthermore, throwing destructors tend not to work well in practice: * GCC does not emit call site records for the region containing `__cxa_end_catch`. This has been the case for a long time, since 2000. * If a catch-all clause catches an exception object that throws, both GCC and Clang using libstdc++ leak the allocated exception object. To avoid code generation pessimization, add an opt-in driver option -fassume-nothrow-exception-dtor to assume that `__cxa_end_catch` calls have the `nounwind` attribute.
This implies that thrown exception objects' destructors will never throw. To detect misuses, diagnose throw expressions with a potentially-throwing destructor. Technically, it is possible that a potentially-throwing destructor never throws when called transitively by `__cxa_end_catch`, but these cases seem rare enough to justify a relaxed mode. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D108905	2023-11-05 00:39:38 -07:00
Bruno Cardoso Lopes	34415fd611	[Clang][LLVM][Coroutines] Prevent __coro_gro from outliving __promise (#66706 ) When dealing with short-circuiting coroutines (e.g. expected), the deferred calls that resolve the get_return_object are currently being emitted after we delete the coroutine frame. This was caught by ASAN when using optimizations -O1 and above: optimizations after inlining would place the __coro_gro in the heap, and subsequent delete of the coroframe followed by the conversion -> BOOM. This patch forbids the GRO to be placed in the coroutine frame, by adding a new metadata node that can be attached to `alloca` instructions. Fix #49843	2023-09-21 22:52:05 -07:00
Anton Korobeynikov	51d5d7bbae	Extend `retcon.once` coroutines lowering to optionally produce a normal result (#66333) One of the main users of this kind of coroutine is Swift. There, yield-once (`retcon.once`) coroutines are used to temporarily "expose" pointers to internal fields of various objects, creating borrow scopes. However, in some cases it might also be useful to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume coroutines, where C++ `co_return` is transformed to a member / callback call on the promise object). The extension is simple: we allow the continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that essentially forwards them as normal results.	2023-09-15 09:54:38 -07:00
Alexander Kornienko	b7f4915644	Revert "Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas" This reverts commit e698695fbbf62e6676f8907665187f2d2c4d814b. The commit caused invalid AddressSanitizer: stack-use-after-scope errors. See https://reviews.llvm.org/D74094#4633785 for details. Differential Revision: https://reviews.llvm.org/D159346	2023-09-01 12:53:24 +02:00
Aaron Ballman	a02f9a7756	Revert "[clang] Enable sized deallocation by default in C++14 onwards" This reverts commit 2916b125f686115deab2ba573dcaff3847566ab9. Reverting due to failures on: https://lab.llvm.org/buildbot/#/builders/216/builds/26407 https://lab.llvm.org/staging/#/builders/247/builds/5659 http://45.33.8.238/win/83485/step_7.txt	2023-08-29 09:36:59 -04:00
wangpc	2916b125f6	[clang] Enable sized deallocation by default in C++14 onwards Since C++14 has been released for about nine years and most standard libraries have implemented sized deallocation functions, it's time to make this feature default again. Reviewed By: rnk, aaron.ballman, #libc, ldionne, Mordante, MaskRay Differential Revision: https://reviews.llvm.org/D112921	2023-08-29 15:42:50 +08:00
Chuanqi Xu	20e6515d5c	[Coroutines] Mark 'coroutine_handle<>::address' as always-inline Close https://github.com/llvm/llvm-project/issues/65054 The direct issue is still the call to coroutine_handle<>::address() after await_suspend(). Without optimizations, the current logic puts the temporary result of await_suspend() in the coroutine frame, since the middle end considers the temporary to have escaped through coroutine_handle<>::address. To fix this fundamentally, we should wrap the whole await-suspend logic into a standalone function. See https://github.com/llvm/llvm-project/issues/64945 As a short-term workaround, we can probably mark coroutine_handle<>::address() as always-inline, so that the temporary result is not considered to have escaped and is not put in the coroutine frame. Although it looks dirty, it is probably acceptable since the compiler is allowed to do special tricks for standard library components.	2023-08-29 14:35:27 +08:00
Chuanqi Xu	6f1b2e4e97	[NFC] Correct the test code in pr65018 The test code in pr65018 is actually incorrect since the optimizer is free to optimize the whole coroutine body away. This patch corrects that.	2023-08-29 11:30:47 +08:00
Chuanqi Xu	b32aa72afc	Recommit [C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty The original patch was incorrect since it marked too many calls as noinline. It shows again that it is bad to do analysis in the frontend. This patch marks only the await_suspend function as noinline. --- Close https://github.com/llvm/llvm-project/issues/56301 Close https://github.com/llvm/llvm-project/issues/64151 Close https://github.com/llvm/llvm-project/issues/65018 See the summary and the discussion of https://reviews.llvm.org/D157070 to get the full context. As @rjmccall pointed out, the root cause is that we currently do not implement the semantics of '@llvm.coro.save' well ("after the await-ready returns false, the coroutine is considered to be suspended"). These semantics imply that we (the compiler) shouldn't write spills into the coroutine frame in await_suspend. But that is currently possible due to some combinations of optimizations, so the semantics are broken, and inlining is the optimization that enables the others. So in this patch, we add the `noinline` attribute to the await_suspend function. This looks slightly problematic since users are able to call the await_suspend function standalone. This is a limitation of the implementation. On the one hand, we don't want the workaround (see the proposed long-term solution later) to be too complex. On the other hand, it is rare to call await_suspend standalone. It is also not semantically incorrect to do so, since inlining is not part of the C++ standard. Also, as an optimization, we don't add the `noinline` attribute to the await_suspend function if the awaiter is an empty class. This should be correct since programmers can't access local variables in await_suspend if the awaiter is empty. I think this is necessary for performance since empty awaiters are pretty common.
The long term solution is: call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle, ptr @awaitSuspendFn) Then it is much easier to perform the safety analysis in the middle end. If it is safe to inline the call to awaitSuspend, we can replace it in the CoroEarly pass. Otherwise we could replace it in the CoroSplit pass. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D157833	2023-08-28 17:07:30 +08:00
Chuanqi Xu	572cc8d38f	Revert "[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty" This reverts commit 9d9c25f81456aace2bec4b58498a420e650007d9. This reverts commit 19ab2664ad3182ffa8fe3a95bb19765e4ae84653. This reverts commit c4672454743e942f148a1aff1e809dae73e464f6. As the issue https://github.com/llvm/llvm-project/issues/65018 shows, the previous fix actually introduced a regression. So this commit reverts the fix per our policies.	2023-08-28 13:21:17 +08:00
Chuanqi Xu	9d9c25f814	[C++20] [Coroutines] Don't mark await_suspend as noinline if it is specified as always_inline already Address https://github.com/llvm/llvm-project/issues/64933 and partially https://github.com/llvm/llvm-project/issues/64945. After c467245, we add a noinline attribute to the await_suspend member function of an awaiter if the awaiter is not empty, that is, if it has any non-static data members. Obviously, this decision brings some performance regressions, and people may complain about this while the long-term solution may not be available soon. In such cases, it is better to provide a way out for users who hit the regression unexpectedly. It is also natural not to prevent inlining if the function is already marked as always_inline by the user.	2023-08-28 11:43:33 +08:00
Chuanqi Xu	7037331a2f	[Coroutines] [CoroElide] Don't think exceptional terminator don't leak coro handle unconditionally any more Close https://github.com/llvm/llvm-project/issues/59723. The fundamental cause of the above issue is that we assumed the memory of the coroutine frame can be released automatically by stack unwinding if the allocation of the coroutine frame is elided. But we missed one point: stack unwinding has different semantics from an explicit coroutine_handle<>::destroy(). The latter is explicit, so it shows the intention of the user, and we can blame the user for destroying the coroutine frame incorrectly if a use-after-free happens. But we can't do so with stack unwinding. So after this patch, we no longer assume unconditionally that the exceptional terminator doesn't leak the coroutine handle. Instead, we consider the exceptional terminator to leak the coroutine handle too if the coroutine is leaked somewhere along the search path. Concretely for C++, the exceptional terminator is not special any more. This may cause some performance regressions. But I've tested the motivating example (std::generator). On the other hand, coroutine elision is a middle-end optimization and not a language feature, so we don't think such regressions are to blame, especially since we are correcting miscompilations.	2023-08-23 16:51:53 +08:00
Chuanqi Xu	c467245474	[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty Close https://github.com/llvm/llvm-project/issues/56301 Close https://github.com/llvm/llvm-project/issues/64151 See the summary and the discussion of https://reviews.llvm.org/D157070 to get the full context. As @rjmccall pointed out, the root cause is that we currently don't implement the semantics of '@llvm.coro.save' well ("after the await-ready returns false, the coroutine is considered to be suspended"). These semantics imply that we (the compiler) shouldn't write the spills into the coroutine frame in await_suspend. But some combinations of optimizations now make this possible, so the semantics are broken. Inlining is the root of such optimizations. So in this patch, we add the `noinline` attribute to the await_suspend call. Also, as an optimization, we don't add the `noinline` attribute to the await_suspend call if the awaiter is an empty class. This should be correct since programmers can't access the local variables in await_suspend if the awaiter is empty. I think this is necessary for performance since it is pretty common. Another potential optimization is: call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle, ptr @awaitSuspendFn) Then it is much easier to perform the safety analysis in the middle end. If it is safe to inline the call to awaitSuspend, we can replace it in the CoroEarly pass. Otherwise we could replace it in the CoroSplit pass. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D157833	2023-08-22 09:56:44 +08:00
Erik Pilkington	e698695fbb	Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas This reverts commit e26c24b849211f35a988d001753e0cd15e4a9d7b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: https://github.com/llvm/llvm-project/issues/38157 Link: https://github.com/llvm/llvm-project/issues/41896 Link: https://github.com/llvm/llvm-project/issues/43598 Link: https://github.com/ClangBuiltLinux/linux/issues/39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094	2023-08-16 15:21:46 -07:00
Chuanqi Xu	21765af763	[C++] [Coroutines] Assume the allocation doesn't return nullptr When 'get_return_object_on_allocation_failure' is declared, the compiler is required to call 'operator new(size_t, nothrow_t)' and then handle the failure case by calling 'get_return_object_on_allocation_failure()'. But the failure case should be rare, so we can assume the allocation is successful and pass that information to the optimizer.	2023-06-26 14:37:25 +08:00
Sergei Barannikov	f46b0e6d75	[clang] Convert a few tests to opaque pointers Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D150520	2023-05-14 21:00:15 +03:00
Nikita Popov	243e62b9d8	[Coroutines] Directly remove unnecessary lifetime intrinsics The insertSpills() code will currently skip lifetime intrinsic users when replacing the alloca with a frame reference. Rather than leaving behind the dead lifetime intrinsics working on the old alloca, directly remove them. This makes sure the alloca can be dropped as well. I noticed this as a regression when converting tests to opaque pointers. Without opaque pointers, this code didn't really do anything, because there would usually be a bitcast in between. The lifetimes would get rewritten to the frame pointer. With opaque pointers, this code now triggers and leaves behind users of the old allocas. Differential Revision: https://reviews.llvm.org/D148240	2023-04-14 10:22:30 +02:00

195 Commits