llvm-project

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

This patch is the frontend implementation of the coroutine elide
improvement project detailed in this discourse post:
https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044

This patch proposes a C++ struct/class attribute
`[[clang::coro_await_elidable]]`. This notion of await elidable task
gives developers and library authors a certainty that coroutine heap
elision happens in a predictable way.

Originally, after we lower a coroutine to LLVM IR, CoroElide is
responsible for analysis of whether an elision can happen. Take this as
an example:
```
Task foo();
Task bar() {
  co_await foo();
}
```
For CoroElide to happen, the ramp function of `foo` must be inlined into
`bar`. This inlining happens after `foo` has been split but `bar` is
usually still a presplit coroutine. If `foo` is indeed a coroutine, the
inlined `coro.id` intrinsic of `foo` is visible within `bar`. CoroElide
then runs an analysis to figure out whether the SSA value of
`coro.begin()` of `foo` gets destroyed before `bar` terminates.

`Task` types are rarely simple enough for the destroy logic of the task
to reference the SSA value from `coro.begin()` directly. Hence, the pass
is very ineffective for even the most trivial C++ Task types. Improving
CoroElide by implementing more powerful analyses is possible; however, it
doesn't give us predictability about when elision will happen.

The approach we want to take with this language extension generally
originates from the philosophy that library implementations of `Task`
types have control over the structured concurrency guarantees we
demand for elision to happen. That is, the lifetime of the callee's
frame is shorter than that of the caller.

The ``[[clang::coro_await_elidable]]`` is a class attribute which can be
applied to a coroutine return type.

When a coroutine function that returns such a type calls another
coroutine function, the compiler performs heap allocation elision when
the following conditions are all met:
- The callee coroutine function returns a type that is annotated with
``[[clang::coro_await_elidable]]``.
- In the caller coroutine, the return value of the callee is a prvalue that
is immediately `co_await`ed.

From the C++ perspective, it makes sense because we can ensure the
lifetime of elided callee cannot exceed that of the caller if we can
guarantee that the caller coroutine is never destroyed earlier than the
callee coroutine. This is not generally true for arbitrary C++ programs.
However, the library that implements `Task` types and executors may
provide this guarantee to the compiler, providing the user with
certainty that HALO will work on their programs.

After this patch, when compiling coroutines that return a type with such
an attribute, the frontend checks the type of the operand of each
`co_await` expression (not of `operator co_await`). If that type is also
annotated with `[[clang::coro_await_elidable]]`, the frontend emits metadata
on the call or invoke instruction as a hint for a later middle end pass
to perform the elision.

The original patch version is
https://github.com/llvm/llvm-project/pull/94693 and, as suggested, the
patch has been split into frontend and middle end parts as stacked PRs.

The middle end CoroSplit patch can be found at
https://github.com/llvm/llvm-project/pull/99283
The middle end transformation that performs the elide can be found at
https://github.com/llvm/llvm-project/pull/99285

2024-09-08 23:08:58 -07:00

Targets

[clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (#102776 )

2024-08-21 13:16:59 +01:00

ABIInfo.cpp

[clang][CodeGen] Return RValue from EmitVAArg (#94635 )

2024-06-17 13:29:20 +02:00

ABIInfo.h

[clang][CodeGen] Return RValue from EmitVAArg (#94635 )

2024-06-17 13:29:20 +02:00

ABIInfoImpl.cpp

[PowerPC] Fix codegen for transparent_union function params (#101738 )

2024-08-19 12:17:44 -04:00

ABIInfoImpl.h

[clang][CGRecordLayout] Remove dependency on isZeroSize (#96422 )

2024-07-16 04:59:51 +01:00

Address.h

[PAC] Authenticate function pointers in UBSan type checks (#99590 )

2024-07-19 08:27:16 -07:00

BackendConsumer.h

[BPF] Fix linking issues in static map initializers (#91310 )

2024-07-05 07:32:09 -07:00

BackendUtil.cpp

[LTO] Reduce memory usage for import lists (#106772 )

2024-09-01 08:36:06 -07:00

CGAtomic.cpp

clang: Allow targets to set custom metadata on atomics (#96906 )

2024-07-26 09:57:28 +04:00

CGBlocks.cpp

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

CGBlocks.h

[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 )

2024-03-28 06:54:36 -07:00

CGBuilder.h

Reland "[clang] Add nuw attribute to GEPs (#105496 )" (#107257 )

2024-09-05 16:13:11 +01:00

CGBuiltin.cpp

[AMDGPU] Add target intrinsic for s_prefetch_data (#107133 )

2024-09-05 15:14:31 -07:00

CGCall.cpp

[HLSL] Implement output parameter (#101083 )

2024-08-31 10:59:08 -05:00

CGCall.h

[clang] Fix FnInfoOpts::operator&= and FnInfoOpts::operator|= not updating assigned operands (#107050 )

2024-09-05 12:35:12 -07:00

CGClass.cpp

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

CGCleanup.cpp

[NFC][Clang] clang-format a function declaration

2024-08-12 09:49:55 +01:00

CGCleanup.h

Re-apply "Emit missing cleanups for stmt-expr" and other commits (#89154 )

2024-04-29 12:33:46 +02:00

CGCoroutine.cpp

[DebugInfo][RemoveDIs] Use iterator-inserters in clang (#102006 )

2024-08-09 10:17:48 +01:00

CGCUDANV.cpp

[Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (#94549 )

2024-08-12 17:44:58 -07:00

CGCUDARuntime.cpp

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

CGCUDARuntime.h

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

CGCXX.cpp

[clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (#94056 )

2024-06-26 18:35:10 -07:00

CGCXXABI.cpp

[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 )

2024-03-28 06:54:36 -07:00

CGCXXABI.h

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

CGDebugInfo.cpp

[HLSL] Add HLSLAttributedResourceType (#106181 )

2024-08-29 21:42:20 -07:00

CGDebugInfo.h

[HLSL] Implement intangible AST type (#97362 )

2024-08-05 10:50:34 -07:00

CGDecl.cpp

[IR] Add method to GlobalVariable to change type of initializer. (#102553 )

2024-08-09 09:22:40 -07:00

CGDeclCXX.cpp

[clang] Fix FIXME in dynamic initializer emission, NFCI

2024-09-04 17:34:26 +00:00

CGException.cpp

[clang][CodeGen] Remove unused LValue::getAddress CGF arg. (#92465 )

2024-05-20 10:23:04 -07:00

CGExpr.cpp

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

CGExprAgg.cpp

[HLSL] Implement output parameter (#101083 )

2024-08-31 10:59:08 -05:00

CGExprComplex.cpp

[NFC][Clang] Clean up VisitUnaryPlus by removing unused FP feature check (#101412 )

2024-07-31 18:25:52 -05:00

CGExprConstant.cpp

[PAC] Incorrect codegen for constant global init with polymorphic MI (#99741 )

2024-07-21 18:59:33 -07:00

CGExprCXX.cpp

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

CGExprScalar.cpp

Reland "[clang] Add nuw attribute to GEPs (#105496 )" (#107257 )

2024-09-05 16:13:11 +01:00

CGGPUBuiltin.cpp

Reapply "[OpenMP][libc] Remove special handling for OpenMP printf (#98940 )"

2024-07-26 17:21:56 -05:00

CGHLSLRuntime.cpp

[HLSL] Apply resource attributes to the resource type rather than the handle member (#107160 )

2024-09-05 21:50:00 -07:00

CGHLSLRuntime.h

[clang][HLSL] Add WaveIsFirstLane() intrinsic (#103299 )

2024-09-04 11:27:03 +02:00

CGLoopInfo.cpp

[HLSL] add loop unroll (#93879 )

2024-07-11 17:08:13 -04:00

CGLoopInfo.h

[clang][HLSL][SPRI-V] Add convergence intrinsics (#80680 )

2024-03-28 17:18:05 +01:00

CGNonTrivialStruct.cpp

[clang][CodeGen] Remove unused LValue::getAddress CGF arg. (#92465 )

2024-05-20 10:23:04 -07:00

CGObjC.cpp

[DebugInfo][RemoveDIs] Use iterator-inserters in clang (#102006 )

2024-08-09 10:17:48 +01:00

CGObjCGNU.cpp

[DataLayout] Remove constructor accepting a pointer to Module (#102841 )

2024-08-13 04:00:19 +03:00

CGObjCMac.cpp

Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )"

2024-06-24 18:00:22 +01:00

CGObjCRuntime.cpp

[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 )

2024-03-28 06:54:36 -07:00

CGObjCRuntime.h

…

CGOpenCLRuntime.cpp

[CodeGenOpenCL] Remove pointer type caching

2023-11-15 00:37:44 +01:00

CGOpenCLRuntime.h

[CodeGenOpenCL] Remove pointer type caching

2023-11-15 00:37:44 +01:00

CGOpenMPRuntime.cpp

[CGOpenMPRuntime] Avoid repeated hash lookups (NFC) (#107358 )

2024-09-05 08:36:09 -07:00

CGOpenMPRuntime.h

[OpenMP][offload] Fix dynamic schedule tracking (#97065 )

2024-07-01 10:23:11 -04:00

CGOpenMPRuntimeGPU.cpp

[OpenMP] Map omp_default_mem_alloc to global memory (#104790 )

2024-08-20 12:00:41 -05:00

CGOpenMPRuntimeGPU.h

[OpenMP] Migrate GPU Reductions CodeGen from Clang to OMPIRBuilder (#80343 )

2024-06-26 20:18:38 +01:00

CGPointerAuth.cpp

[PAC] Implement authentication for C++ member function pointers (#99576 )

2024-07-22 18:29:06 -07:00

CGPointerAuthInfo.h

[clang] Implement function pointer signing and authenticated function calls (#93906 )

2024-06-21 10:20:15 -07:00

CGRecordLayout.h

[Clang] Ignore empty FieldDecls when asking for the field number (#100040 )

2024-07-23 00:49:46 -07:00

CGRecordLayoutBuilder.cpp

[clang][CGRecordLayout] Remove dependency on isZeroSize (#96422 )

2024-07-16 04:59:51 +01:00

CGStmt.cpp

[Clang][CodeGen] Don't emit assumptions if current block is unreachable. (#106936 )

2024-09-04 13:36:32 +08:00

CGStmtOpenMP.cpp

[Clang][Sema][OpenMP] Allow thread_limit to accept multiple expressions (#102715 )

2024-08-10 09:54:58 -04:00

CGValue.h

[PAC] Implement function pointer re-signing (#98847 )

2024-07-18 07:51:17 -07:00

CGVTables.cpp

Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287 )

2024-08-08 13:14:09 +08:00

CGVTables.h

Fix typo "indicies" (#92232 )

2024-05-15 13:10:16 +01:00

CGVTT.cpp

[clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (#94056 )

2024-06-26 18:35:10 -07:00

CMakeLists.txt

[clang] Split ObjectFilePCHContainerReader from ObjectFilePCHContainerWriter (#99599 )

2024-07-23 23:55:31 +08:00

CodeGenABITypes.cpp

Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )"

2024-06-24 18:00:22 +01:00

CodeGenAction.cpp

Revert "[C++20] [Modules] Embed all source files for C++20 Modules (#102444 )"

2024-09-03 10:54:20 +08:00

CodeGenFunction.cpp

[HLSL] Add HLSLAttributedResourceType (#106181 )

2024-08-29 21:42:20 -07:00

CodeGenFunction.h

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

CodeGenModule.cpp

[CodeGen] Create IFUNCs in the program address space, not hard-coded 0 (#105726 )

2024-08-28 17:11:15 +01:00

CodeGenModule.h

[clang-repl] [codegen] Reduce the state in TBAA. NFC for static compilation. (#98138 )

2024-08-21 07:22:31 +02:00

CodeGenPGO.cpp

[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587 ) (#102691 )

2024-08-22 01:10:54 -05:00

CodeGenPGO.h

[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 )

2024-03-28 06:54:36 -07:00

CodeGenTBAA.cpp

[clang-repl] [codegen] Reduce the state in TBAA. NFC for static compilation. (#98138 )

2024-08-21 07:22:31 +02:00

CodeGenTBAA.h

[clang-repl] [codegen] Reduce the state in TBAA. NFC for static compilation. (#98138 )

2024-08-21 07:22:31 +02:00

CodeGenTypeCache.h

[Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (#88182 )

2024-05-19 14:59:03 +01:00

CodeGenTypes.cpp

[llvm][RISCV] Support RISCV vector tuple type in llvm IR (#97992 )

2024-08-31 18:59:47 +08:00

CodeGenTypes.h

[clang-repl] [codegen] Reduce the state in TBAA. NFC for static compilation. (#98138 )

2024-08-21 07:22:31 +02:00

ConstantEmitter.h

[PAC] Implement function pointer type discrimination (#96992 )

2024-07-11 09:09:20 -07:00

ConstantInitBuilder.cpp

[clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (#94056 )

2024-06-26 18:35:10 -07:00

CoverageMappingGen.cpp

[CodeGen] Avoid repeated hash lookups (NFC) (#107759 )

2024-09-08 09:04:20 -07:00

CoverageMappingGen.h

Move SystemHeadersCoverage into llvm::coverage in CoverageMappingGen.h

2024-07-09 22:21:20 +09:00

EHScopeStack.h

[CodeGen] Modernize EHScopeStack::Cleanup::Flags (NFC)

2023-09-02 09:32:36 -07:00

ItaniumCXXABI.cpp

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

LinkInModulesPass.cpp

[clang][CodeGen] Cleanup missed ShouldLinkFiles definitions (#97115 )

2024-06-28 15:22:42 -07:00

LinkInModulesPass.h

[clang][CodeGen] Cleanup missed ShouldLinkFiles definitions (#97115 )

2024-06-28 15:22:42 -07:00

MacroPPCallbacks.cpp

[clang][lex] Always pass suggested module to InclusionDirective() callback (#81061 )

2024-02-08 10:19:18 -08:00

MacroPPCallbacks.h

[clang][lex] Always pass suggested module to InclusionDirective() callback (#81061 )

2024-02-08 10:19:18 -08:00

MCDCState.h

Reapply: [MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448 )

2024-06-14 19:31:56 +09:00

MicrosoftCXXABI.cpp

[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282 )

2024-09-08 23:08:58 -07:00

ModuleBuilder.cpp

[BPF] Fix linking issues in static map initializers (#91310 )

2024-07-05 07:32:09 -07:00

ObjectFilePCHContainerWriter.cpp

[clang] Split ObjectFilePCHContainerReader from ObjectFilePCHContainerWriter (#99599 )

2024-07-23 23:55:31 +08:00

PatternInit.cpp

…

PatternInit.h

…

README.txt

…

SanitizerMetadata.cpp

…

SanitizerMetadata.h

…

SwiftCallingConv.cpp

[NFC] Refactor ConstantArrayType size storage (#85716 )

2024-03-26 14:15:56 -05:00

TargetInfo.cpp

[Arm][AArch64][Clang] Respect function's branch protection attributes. (#101978 )

2024-08-09 17:51:38 +02:00

TargetInfo.h

[Arm][AArch64][Clang] Respect function's branch protection attributes. (#101978 )

2024-08-09 17:51:38 +02:00

VarBypassDetector.cpp

…

VarBypassDetector.h

…

README.txt

IRgen optimization opportunities.

//===---------------------------------------------------------------------===//

The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x which can easily be avoided.
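
A small sketch of where the extension comes from, with hypothetical function names. The language promotes the `short` operand to `int` before comparing, so a direct lowering extends `x` first. Comparing at the narrow width gives the same result only when the constant is representable in the narrow type, which the second helper checks.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Integer promotion: arithmetic on a short is carried out in int.
static_assert(std::is_same<decltype(short{} + 0), int>::value,
              "short operands are promoted to int");

// Semantically this is static_cast<int>(x) == 10, hence the sext in naive IR.
bool equals_ten(short x) {
  return x == 10;
}

// Hypothetical helper: a narrow (16-bit) compare is only equivalent when the
// constant fits in short; otherwise the comparison is always false.
bool constant_fits_in_short(long c) {
  return c >= std::numeric_limits<short>::min() &&
         c <= std::numeric_limits<short>::max();
}
```

Since 10 fits in `short`, the comparison could be emitted at the width of `x` without changing the result.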

//===---------------------------------------------------------------------===//

Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned then it is a lot shorter to just load the char
directly.
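
The difference between the two access paths can be sketched as follows. The layout is an assumption made for illustration (a 32-bit unit stored as four bytes, with a signed 8-bit field in bits 8..15), and the names are hypothetical; real bitfield layout is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical storage unit: the field of interest is exactly byte 1.
struct Storage {
  std::uint8_t bytes[4];
};

// Generic path: load the whole unit, shift, mask, then sign-extend.
std::int32_t read_field_generic(const Storage &s) {
  std::uint32_t word = std::uint32_t(s.bytes[0]) |
                       (std::uint32_t(s.bytes[1]) << 8) |
                       (std::uint32_t(s.bytes[2]) << 16) |
                       (std::uint32_t(s.bytes[3]) << 24);
  std::uint32_t field = (word >> 8) & 0xFFu;
  // Sign-extend from 8 bits using arithmetic only.
  return static_cast<std::int32_t>(field ^ 0x80u) - 0x80;
}

// Direct path: the field is byte-sized and byte-aligned, so a single
// narrow load plus a sign extension is enough.
std::int32_t read_field_direct(const Storage &s) {
  std::uint8_t b = s.bytes[1];
  return b < 0x80 ? std::int32_t(b) : std::int32_t(b) - 256;
}
```

Both functions return the same value for every byte pattern; the direct form simply skips the wide load, the shift, and the mask.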

//===---------------------------------------------------------------------===//

It may be worth avoiding creation of alloca's for formal arguments
for the common situation where the argument is never written to or has
its address taken. The idea would be to begin generating code by using
the argument directly and if its address is taken or it is stored to
then generate the alloca and patch up the existing code.

In theory, the same optimization could be a win for block local
variables as long as the declaration dominates all statements in the
block.

NOTE: The main case we care about here is -O0 -g compile time
performance, and in that scenario we currently need to emit the alloca
anyway in order to emit proper debug info. So this is blocked by
being able to emit debug information which refers to an LLVM
temporary, not an alloca.
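
To make the three cases concrete, here is a hedged sketch with hypothetical function names. The IR in the first comment is only an approximation of typical unoptimized output, not something quoted from this file.

```cpp
#include <cassert>

// Never written to and address never taken: the incoming value could be
// used directly. Unoptimized output today looks roughly like:
//   %x.addr = alloca i32
//   store i32 %x, ptr %x.addr
//   %0 = load i32, ptr %x.addr
//   %add = add nsw i32 %0, 1
int add_one(int x) {
  return x + 1;
}

// Stored to: under the proposed scheme, the alloca would be created when
// the assignment is seen, and earlier uses patched to go through it.
int clamp_to_zero(int x) {
  if (x < 0)
    x = 0;
  return x;
}

void bump(int *p) { *p += 1; }

// Address taken: the argument needs a memory location.
int add_one_via_pointer(int x) {
  bump(&x);
  return x;
}
```

Only the first function would benefit; the other two fall back to the alloca as soon as the store or the address-of is encountered.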

//===---------------------------------------------------------------------===//

We should try to avoid generating basic blocks which only contain
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead), all the way down through code generation and
assembly time.

On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!
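
A measurement like the one above could be sketched with a toy model. `ToyBlock` and both functions are hypothetical and deliberately simplified; they are not Clang's or LLVM's data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of a basic block.
struct ToyBlock {
  int num_non_branch_insts;   // instructions other than the terminator
  bool unconditional_branch;  // true if the terminator is a plain jump
};

// A block is "jump-only" if it holds nothing but an unconditional branch.
bool is_jump_only(const ToyBlock &b) {
  return b.num_non_branch_insts == 0 && b.unconditional_branch;
}

// Percentage of blocks that are jump-only (0 for an empty list).
double jump_only_percent(const std::vector<ToyBlock> &blocks) {
  if (blocks.empty())
    return 0.0;
  std::size_t count = 0;
  for (const ToyBlock &b : blocks)
    if (is_jump_only(b))
      ++count;
  return 100.0 * static_cast<double>(count) /
         static_cast<double>(blocks.size());
}
```

A block that ends in a conditional branch, or that contains any other instruction, is not counted, since it cannot be folded into its predecessor as easily.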

//===---------------------------------------------------------------------===//