llvm-project

Author	SHA1	Message	Date
Tom Honermann	23e4fe040b	[SYCL] SYCL host kernel launch support for the sycl_kernel_entry_point attribute. (#152403 ) The `sycl_kernel_entry_point` attribute facilitates the generation of an offload kernel entry point function based on the parameters and body of the attributed function. This change extends the behavior of that attribute to support integration with a SYCL runtime library through an interface that communicates symbol names and kernel arguments for the generated offload kernel entry point functions. Consider the following function declared with the `sycl_kernel_entry_point` attribute with a call to this function occurring in the implementation of a SYCL kernel invocation function such as `sycl::handler::single_task()`. ```c++ template<typename KernelName, typename KernelType> [[clang::sycl_kernel_entry_point(KernelName)]] void kernel_entry_point(KernelType kernel) { kernel(); } ``` The body of the above function specifies the parameters and body of the generated offload kernel entry point. Clearly, a call to the above function by a SYCL kernel invocation function is not intended to execute the body as written. Previously, code generation emitted an empty function body so that calls to the function had no effect other than to trigger the generation of the offload kernel entry point. The function body is therefore available to hook for SYCL library support and is now substituted with a call to a (SYCL library provided) function template or variable template named `sycl_kernel_launch()` with the kernel name type passed as the first template argument, the symbol name of the offload kernel entry point passed as a string literal for the first function argument, and the function parameters passed as the remaining explicit function arguments. 
Given a call like this: ```c++ kernel_entry_point<struct KN>([]{}) ``` the body of the instantiated `kernel_entry_point()` specialization would be substituted as follows with "kernel-symbol-name" substituted for the generated symbol name and `kernel` forwarded. ```c++ sycl_kernel_launch<KN>("kernel-symbol-name", kernel) ``` Name lookup and overload resolution for the `sycl_kernel_launch()` function is performed at the point of definition of the `sycl_kernel_entry_point` attributed function (or the point of instantiation for an instantiated function template specialization). If overload resolution fails, the program is ill-formed. Implementation of the `sycl_kernel_launch()` function might require additional information provided by the SYCL library. This is facilitated by removing the previous prohibition against use of the `sycl_kernel_entry_point` attribute with a non-static member function. If the `sycl_kernel_entry_point` attributed function is a non-static member function, then overload resolution for the `sycl_kernel_launch()` function template may select a non-static member function in which case, `this` will be implicitly passed as the implicit object argument. If a `sycl_kernel_entry_point` attributed function is a non-static member function, use of `this` in a potentially evaluated expression is prohibited in the definition since `this` is not a kernel argument and will not be available within the generated offload kernel entry point function. The attribute cannot be applied to a function with an explicit object parameter. --------- Co-authored-by: Mariya Podchishchaeva <mariya.podchishchaeva@intel.com>	2026-03-05 19:16:03 -05:00
Nikita Popov	c4721872af	Revert "[Clang][inlineasm] Add special support for "rm" output constraints (#92040 )" This change landed without approval. This reverts commit 45e666a8531c1148bdb170b9a120f99e1500c427. This reverts commit a636dd4c37f12594275de2fe180ca35bc04d76ea.	2026-02-14 15:59:04 +01:00
Bill Wendling	45e666a853	[Clang][inlineasm] Add special support for "rm" output constraints (#92040 ) Clang isn't able to support multiple constraints on inputs and outputs, like "rm". Instead, it picks the "safest" one to use, i.e. the memory constraint for "rm". This leads to obviously horrible code: asm __volatile__ ("pushf\n\t" "popq %0" : "=rm" (x)); is compiled to: pushf popq -8(%rsp) movq -8(%rsp), %rax It gets worse when inlined into other functions, because it may introduce a stack where none is needed. With this change, Clang now generates IR for the more optimistic choice ("r"). All but the fast register allocator are able to fold registers if it turns out that register pressure is too high. This leaves the fast register allocator. The fast register allocator, as the name suggests, is built for execution speed, not code quality. Thus, we add special processing to convert the "optimistic" IR into the "conservative" choice (again at the IR level), which we know it can handle. We focus on "rm" for the initial commit, but that can be expanded in the future for other constraints where Clang generates ++ungood code (like "g"). Fixes: https://github.com/llvm/llvm-project/issues/20571	2026-02-14 05:02:24 -08:00
Wei Wang	9dde0a803b	[SampleProf][OMP] Handle OMP helper function name canonicalization (#178339 ) Fix an issue where `FunctionSamples::getCanonicalFnName` incorrectly canonicalizes OMP helper functions so that they collide with the original function itself. This causes the sample loader to annotate the wrong functions. Canonicalization strips everything that comes after the first dot (.), unless the function attribute "sample-profile-suffix-elision-policy" is set to "selected", in which case it only strips after the known suffixes. The helper function names have suffixes like `.omp_outlined`. After canonicalization, the name becomes the same as the original function. Add the attribute to helper functions so that the suffixes are not stripped. This is the same fix applied previously to coroutine await suspend wrapper functions (#174881).	2026-01-30 11:43:10 -08:00
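The name collision described in the entry above can be illustrated with a small standalone sketch. This is not the LLVM implementation: `ElisionPolicy`, `kKnownSuffixes`, and `canonicalName` are hypothetical names, and the suffix list is an assumption chosen only for illustration.

```c++
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the "sample-profile-suffix-elision-policy"
// function attribute: All is the default, Selected is the "selected" value.
enum class ElisionPolicy { All, Selected };

// Assumed list of suffixes that the "selected" policy is allowed to strip.
constexpr std::array<std::string_view, 2> kKnownSuffixes = {".llvm.", ".part."};

std::string canonicalName(std::string_view name, ElisionPolicy policy) {
  if (policy == ElisionPolicy::All) {
    // Default: drop everything from the first dot onward.
    // find() returns npos when there is no dot, so the whole name is kept.
    return std::string(name.substr(0, name.find('.')));
  }
  // "selected": cut only at the earliest occurrence of a known suffix.
  std::size_t cut = name.size();
  for (std::string_view suffix : kKnownSuffixes) {
    std::size_t pos = name.find(suffix);
    if (pos != std::string_view::npos && pos < cut)
      cut = pos;
  }
  return std::string(name.substr(0, cut));
}

// Usage: under the default policy the helper collides with "foo";
// under "selected" its distinguishing suffix survives.
void ompHelperExample() {
  assert(canonicalName("foo.omp_outlined", ElisionPolicy::All) == "foo");
  assert(canonicalName("foo.omp_outlined", ElisionPolicy::Selected) ==
         "foo.omp_outlined");
}
```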
NAKAMURA Takumi	f86fab6105	[Coverage][Single] Enable Branch coverage for IfStmt (#113111 ) Depends on: #112730 #113114 https://discourse.llvm.org/t/rfc-integrating-singlebytecoverage-with-branch-coverage/82492	2026-01-29 13:30:49 +09:00
NAKAMURA Takumi	ea509d2857	[Coverage][Single] Enable Branch coverage for SwitchStmt (#113112 ) Depends on: #112730 #113114 https://discourse.llvm.org/t/rfc-integrating-singlebytecoverage-with-branch-coverage/82492	2026-01-29 10:33:58 +09:00
NAKAMURA Takumi	599c2a0063	[Coverage][Single] Enable Branch coverage for loop statements (#113109 ) Depends on: #112730 #113114 https://discourse.llvm.org/t/rfc-integrating-singlebytecoverage-with-branch-coverage/82492	2026-01-29 07:46:19 +09:00
Erich Keane	04c83c3498	[NFCI] Extract out the addVariableConstraints CGASM Function (#175261 ) This function is needed in identical form for CIR codegen, and pulling it out into AsmStmt is effectively trivial. The only thing that actually needs the codegen in it is the ability to diagnose, so this patch adds that as a callback. AsmStmt seems to be the most logical place for this to happen, as it does other similar things. However, unlike the other similar things, this is the same between MS and GCC, so it doesn't need separate implementations.	2026-01-14 06:31:13 -08:00
Sirraide	71bfdd1304	[Clang] Add support for the C `_Defer` TS (#162848 ) This implements WG14 N3734 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3734.pdf), aka `_Defer`; it is currently only supported in C if `-fdefer-ts` is passed.	2025-12-11 05:54:09 +01:00
KaiWeng	d9c7c76269	Revert "Ignore trailing NullStmts in StmtExprs for GCC compatibility." (#166036 ) This reverts commit b1e511bf5a4c702ace445848b30070ac2e021241. https://github.com/llvm/llvm-project/issues/160243 Reverting because the GCC C front end is incorrect. --------- Co-authored-by: Jim Lin <jim@andestech.com>	2025-11-07 09:30:53 -05:00
anoopkg6	6712e20c52	Add support for flag output operand "=@cc" for SystemZ. (#125970 ) Added support for the flag output operand "=@cc" inline assembly constraint for SystemZ. - Clang now accepts "=@cc" assembly operands, and sets a 2-bit condition code for the output operand for SystemZ. - Clang currently emits an assertion that flag output operands are boolean values, i.e. in the range [0, 2). Generalize this mechanism to allow targets to specify arbitrary range assertions for any inline assembly output operand. This will be used to assert that SystemZ two-bit condition-code values are in the range [0, 4). - SystemZ backend lowers "@cc" targets by using an ipm sequence to extract the condition code from the PSW. - DAGCombine tries to optimize the lowered ipm sequence by combining CCReg and computing effective CCMask and CCValid in combineCCMask for select_ccmask and br_ccmask. - Cost computation is done for merging conditionals for branch instructions in SelectionDAG, as a split may cause evaluation of branch conditions to span basic blocks, making them difficult to combine. --------- Co-authored-by: anoopkg6 <anoopkg6@github.com> Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>	2025-10-14 11:53:42 +02:00
Walter J.T.V	cd4c5280c7	[Clang][OpenMP][LoopTransformations] Implement "#pragma omp fuse" loop transformation directive and "looprange" clause (#139293 ) This change implements the fuse directive, `#pragma omp fuse`, as specified in the OpenMP 6.0, along with the `looprange` clause in clang. This change also adds minimal stubs so flang keeps compiling (a full implementation in flang of this directive is still pending). --------- Co-authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>	2025-09-29 07:48:18 +02:00
Iris Shi	ddfbfd6b58	[NFC][clang] Move simplifyConstraint to TargetInfo.cpp (#154905 ) Co-authored-by: Andy Kaylor <akaylor@nvidia.com>	2025-09-28 10:07:27 +02:00
Jongmyeong Choi	60b3cc69af	[CodeGen] Fix cleanup attribute for C89 for-loop init variables (#156643 ) In C89, for-init variables have function scope, so cleanup should occur at function exit, not loop exit. This implements deferred cleanup registration for C89 mode while preserving C99+ behavior. Fixes #154624	2025-09-23 20:35:43 -07:00
Sirraide	e4a1b5f36e	[Clang] [C2y] Implement N3355 ‘Named Loops’ (#152870 ) This implements support for [named loops](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3355.htm) for C2y. When parsing a `LabelStmt`, we create the `LabelDecl` early before we parse the substatement; this label is then passed down to `ParseWhileStatement()` and friends, which then store it in the loop’s (or switch statement’s) `Scope`; when we encounter a `break/continue` statement, we perform a lookup for the label (and error if it doesn’t exist), and then walk the scope stack and check if there is a scope whose preceding label is the target label, which identifies the jump target. The feature is only supported in C2y mode, though a cc1-only option exists for testing (`-fnamed-loops`), which is mostly intended to try and make sure that we don’t have to refactor this entire implementation when/if we start supporting it in C++. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2025-09-02 16:37:19 +00:00
Matheus Izvekov	91cdd35008	[clang] Improve nested name specifier AST representation (#147835 ) This is a major change on how we represent nested name qualifications in the AST. * The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy. * The ElaboratedType node is removed, all type nodes in which it could previously apply to can now store the elaborated keyword and name qualifier, tail allocating when present. * TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size. This patch offers a great performance benefit. It greatly improves compilation time for [stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for `test_on2.cpp` in that project, which is the slowest compiling test, this patch improves `-c` compilation time by about 7.2%, with the `-fsyntax-only` improvement being at ~12%. 
This has great results on compile-time-tracker as well: ![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831) This patch also further enables other optimizations in the future, and will reduce the performance impact of template specialization resugaring when that lands. It has some other miscellaneous drive-by fixes. About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started with the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact. There are also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work. How to review: The important changes are all in `clang/include/clang/AST` and `clang/lib/AST`, with also important changes in `clang/lib/Sema/TreeTransform.h`. The rest and bulk of the changes are mostly consequences of the changes in API. PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just for easier rebasing. I plan to rename it back after this lands. Fixes #136624 Fixes https://github.com/llvm/llvm-project/issues/43179 Fixes https://github.com/llvm/llvm-project/issues/68670 Fixes https://github.com/llvm/llvm-project/issues/92757	2025-08-09 05:06:53 -03:00
Orlando Cazalet-Hyams	bbe912f1e7	[KeyInstr] Inline asm atoms (#149076 )	2025-07-22 17:19:58 +01:00
Orlando Cazalet-Hyams	5c7c8558c8	[KeyInstr] goto stmt atoms (#149101 )	2025-07-21 11:09:40 +01:00
Kazu Hirata	ae372bfca8	[CodeGen] Use range-based for loops (NFC) (#145142 )	2025-06-21 08:20:57 -07:00
Orlando Cazalet-Hyams	54d544b831	[KeyInstr][Clang] Ret atom (#134652 ) This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. When returning a value, stores to the `retval` allocas and branches to the `return` block are put in the same atom group. They are both rank 1, which could in theory introduce an extra step in some optimized code. This low risk currently feels acceptable in exchange for keeping the code a bit simpler (as opposed to adding scaffolding to make the store rank 2). In the case of a single return (no control flow) the return instruction inherits the atom group of the branch to the return block when the blocks get folded together. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.	2025-06-04 15:43:49 +01:00
Orlando Cazalet-Hyams	ac42923c2d	Reapply "[KeyInstr][Clang] For range stmt atoms" (#142630 ) This reverts commit e6529dcedb3955706a8af5710591f1ac1bac26a3 with crash fixed. Original PR https://github.com/llvm/llvm-project/pull/134647 This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.	2025-06-04 10:53:29 +01:00
Orlando Cazalet-Hyams	e6529dcedb	Revert "[KeyInstr][Clang] For range stmt atoms" (#142630 ) Reverts llvm/llvm-project#134647 Bot failure: https://lab.llvm.org/buildbot/#/builders/144/builds/26730/steps/6/logs/FAIL__Clang__terminate-statements_cpp	2025-06-03 16:15:46 +01:00
Orlando Cazalet-Hyams	10024363dd	[KeyInstr][Clang] For range stmt atoms (#134647 ) This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.	2025-06-03 15:44:15 +01:00
Orlando Cazalet-Hyams	8e50e882a8	[KeyInstr][Clang] Break and Continue stmt atoms [KeyInstr][Clang] For stmt atom (#134646) This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.	2025-06-03 14:25:48 +01:00
Orlando Cazalet-Hyams	0555594195	[KeyInstr][Clang] For stmt atom (#134646 ) This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.	2025-06-03 13:47:32 +01:00
Nikita Popov	e2b536431d	[CodeGen] Move CodeGenPGO behind unique_ptr (NFC) (#142155 ) The InstrProf headers are very expensive. Avoid including them in all of CodeGen/ by moving the CodeGenPGO member behind a unique_ptr. This reduces clang build time by 0.8%.	2025-06-02 09:51:54 +02:00
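The technique named in the entry above (holding a member through `std::unique_ptr` so that a widely included header needs only a forward declaration) can be sketched in a single file. `HeavyProfiler` and `FunctionEmitter` are hypothetical stand-ins, not the actual Clang classes; the comments mark which part would live in the header and which in the one source file that includes the expensive header.

```c++
#include <cassert>
#include <memory>

// --- Part that would live in the widely included header ---
// Only a forward declaration is needed here, so the expensive header that
// defines the full type does not have to be included by every client.
class HeavyProfiler;

class FunctionEmitter {
public:
  FunctionEmitter();
  ~FunctionEmitter(); // must be defined where HeavyProfiler is complete
  void countRegion();
  unsigned regionCount() const;

private:
  std::unique_ptr<HeavyProfiler> Profiler;
};

// --- Part that would live in the single source file ---
// Here the full definition (and its expensive includes) is visible.
class HeavyProfiler {
public:
  unsigned Regions = 0;
};

FunctionEmitter::FunctionEmitter()
    : Profiler(std::make_unique<HeavyProfiler>()) {}

// Defaulted out of line: unique_ptr's deleter needs the complete type.
FunctionEmitter::~FunctionEmitter() = default;

void FunctionEmitter::countRegion() {
  assert(Profiler && "profiler is created in the constructor");
  ++Profiler->Regions;
}

unsigned FunctionEmitter::regionCount() const { return Profiler->Regions; }
```

The cost of this layout is one extra heap allocation and an indirection per access, which the commit evidently judged acceptable against the build-time saving.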
Orlando Cazalet-Hyams	dd8eb1e673	[KeyInstr][Clang] Switch stmt atom (#134643 ) This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.	2025-05-27 11:26:40 +01:00
Orlando Cazalet-Hyams	6bd3543a4d	[KeyInstr][Clang] While stmt atom (#134645 ) See test comment for possible future improvement. This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.	2025-05-23 14:42:28 +01:00
Orlando Cazalet-Hyams	189d5dad36	[KeyInstr][Clang] Do stmt atom (#134644 ) See test comment for possible future improvement. This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.	2025-05-23 14:31:18 +01:00
joaosaffran	567b0f8923	[HLSL] Add support to branch/flatten attributes to switch (#131739 ) closes: [#125754](https://github.com/llvm/llvm-project/issues/125754) --------- Co-authored-by: joaosaffran <joao.saffran@microsoft.com>	2025-03-24 16:17:19 -07:00
cor3ntin	911b200ce3	[Clang] Constant Expressions inside of GCC' asm strings (#131003 ) Implements GCC's constexpr string ASM extension https://gcc.gnu.org/onlinedocs/gcc/Asm-constexprs.html	2025-03-17 20:10:46 +01:00
Younan Zhang	f4218753ad	[Clang] Implement P0963R3 "Structured binding declaration as a condition" (#130228 ) This implements the R2 semantics of P0963. The R1 semantics, as outlined in the paper, were introduced in Clang 6. In addition to that, the paper proposes swapping the evaluation order of condition expressions and the initialization of binding declarations (i.e. std::tuple-like decompositions).	2025-03-11 15:41:56 +08:00
erichkeane	d5cec386c1	[OpenACC] Implement 'cache' construct AST/Sema This statement level construct takes no clauses and has no associated statement, and simply labels a number of array elements as valid for caching. The implementation here is pretty simple, but it is a touch of a special case for parsing, so the parsing code reflects that.	2025-03-03 13:57:23 -08:00
Yaxun (Sam) Liu	240f2269ff	Add clang atomic control options and attribute (#114841 ) Add option and statement attribute for controlling the emission of target-specific metadata to atomicrmw instructions in IR. The RFC for this attribute and option is https://discourse.llvm.org/t/rfc-add-clang-atomic-control-options-and-pragmas/80641. Originally a pragma was proposed; it was later changed to a Clang attribute. This attribute allows users to specify one, two, or all three options and must be applied to a compound statement. The attribute can also be nested, with inner attributes overriding the options specified by outer attributes or the target's default options. These options will then determine the target-specific metadata added to atomic instructions in the IR. In addition to the attribute, three new compiler options are introduced: `-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`, `-f[no-]atomic-ignore-denormal-mode`. These compiler options allow users to override the default options through the Clang driver and front end. `-m[no-]unsafe-fp-atomics` is aliased to `-f[no-]atomic-ignore-denormal-mode`. In terms of implementation, the atomic attribute is represented in the AST by the existing AttributedStmt, with minimal changes to AST and Sema. During code generation in Clang, the CodeGenModule maintains the current atomic options, which are used to emit the relevant metadata for atomic instructions. RAII is used to manage the saving and restoring of atomic options when entering and exiting nested AttributedStmt.	2025-02-27 10:41:04 -05:00
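The RAII save-and-restore of nested options mentioned in the entry above can be sketched as follows. This is a minimal illustration, not Clang's code: `AtomicOptions`, `AtomicOverrides`, and `AtomicOptionsScope` are hypothetical names, and the three boolean fields are assumed to mirror the three options named in the commit message.

```c++
#include <cassert>
#include <optional>

// Hypothetical record of the three options currently in effect.
struct AtomicOptions {
  bool RemoteMemory = false;
  bool FineGrainedMemory = false;
  bool IgnoreDenormalMode = false;
};

// Hypothetical attribute payload: each option may be left unspecified,
// since the attribute may set one, two, or all three.
struct AtomicOverrides {
  std::optional<bool> RemoteMemory;
  std::optional<bool> FineGrainedMemory;
  std::optional<bool> IgnoreDenormalMode;
};

// Applies the overrides on entry and restores the previous options on exit,
// so an inner scope overrides an outer one only for its own lifetime.
class AtomicOptionsScope {
public:
  AtomicOptionsScope(AtomicOptions &Current, const AtomicOverrides &Overrides)
      : Current(Current), Saved(Current) {
    if (Overrides.RemoteMemory)
      Current.RemoteMemory = *Overrides.RemoteMemory;
    if (Overrides.FineGrainedMemory)
      Current.FineGrainedMemory = *Overrides.FineGrainedMemory;
    if (Overrides.IgnoreDenormalMode)
      Current.IgnoreDenormalMode = *Overrides.IgnoreDenormalMode;
  }
  ~AtomicOptionsScope() { Current = Saved; }

  AtomicOptionsScope(const AtomicOptionsScope &) = delete;
  AtomicOptionsScope &operator=(const AtomicOptionsScope &) = delete;

private:
  AtomicOptions &Current; // the live options being managed
  AtomicOptions Saved;    // snapshot taken on entry
};

// Usage: the inner scope overrides one option and inherits the rest.
void nestedScopesExample() {
  AtomicOptions Opts;
  AtomicOverrides Outer;
  Outer.RemoteMemory = true;
  AtomicOptionsScope OuterScope(Opts, Outer);
  {
    AtomicOverrides Inner;
    Inner.RemoteMemory = false;
    AtomicOptionsScope InnerScope(Opts, Inner);
    assert(!Opts.RemoteMemory);
  }
  assert(Opts.RemoteMemory); // outer setting is back in effect
}
```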
Zahira Ammarguellat	cf69b4c668	[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#126927 ) This patch was reviewed and approved here: https://github.com/llvm/llvm-project/pull/119891 However it has been reverted here: `083df25dc2` due to a build issue here: https://lab.llvm.org/buildbot/#/builders/51/builds/10694 This patch is reintroducing the support.	2025-02-13 07:14:36 -05:00
Sameer Sahasrabuddhe	b85e71b9f2	[llvm] Create() functions for ConvergenceControlInst (#125627 )	2025-02-05 11:41:26 +05:30
erichkeane	99a9133a68	[OpenACC] Implement Sema/AST for 'atomic' construct The atomic construct is a particularly complicated one. The directive itself is pretty simple, it has 5 options for the 'atomic-clause'. However, the associated statement is fairly complicated. 'read' accepts: v = x; 'write' accepts: x = expr; 'update' (or no clause) accepts: x++; x--; ++x; --x; x binop= expr; x = x binop expr; x = expr binop x; 'capture' accepts either a compound statement, or: v = x++; v = x--; v = ++x; v = --x; v = x binop= expr; v = x = x binop expr; v = x = expr binop x; IF 'capture' has a compound statement, it accepts: {v = x; x binop= expr; } {x binop= expr; v = x; } {v = x; x = x binop expr; } {v = x; x = expr binop x; } {x = x binop expr ;v = x; } {x = expr binop x; v = x; } {v = x; x = expr; } {v = x; x++; } {v = x; ++x; } {x++; v = x; } {++x; v = x; } {v = x; x--; } {v = x; --x; } {x--; v = x; } {--x; v = x; } While these are all quite complicated, there is a significant amount of similarity between the 'capture' and 'update' lists, so this patch reuses a lot of the same functions. This patch implements the entirety of 'atomic', creating a new Sema file for the sema for it, as it is fairly sizable.	2025-02-03 07:22:22 -08:00
Tom Honermann	8fb42300a0	[SYCL] AST support for SYCL kernel entry point functions. (#122379 ) A SYCL kernel entry point function is a non-member function or a static member function declared with the `sycl_kernel_entry_point` attribute. Such functions define a pattern for an offload kernel entry point function to be generated to enable execution of a SYCL kernel on a device. A SYCL library implementation orchestrates the invocation of these functions with corresponding SYCL kernel arguments in response to calls to SYCL kernel invocation functions specified by the SYCL 2020 specification. The offload kernel entry point function (sometimes referred to as the SYCL kernel caller function) is generated from the SYCL kernel entry point function by a transformation of the function parameters followed by a transformation of the function body to replace references to the original parameters with references to the transformed ones. Exactly how parameters are transformed will be explained in a future change that implements non-trivial transformations. For now, it suffices to state that a given parameter of the SYCL kernel entry point function may be transformed to multiple parameters of the offload kernel entry point as needed to satisfy offload kernel argument passing requirements. Parameters that are decomposed in this way are reconstituted as local variables in the body of the generated offload kernel entry point function. 
For example, given the following SYCL kernel entry point function definition: ``` template<typename KernelNameType, typename KernelType> [[clang::sycl_kernel_entry_point(KernelNameType)]] void sycl_kernel_entry_point(KernelType kernel) { kernel(); } ``` and the following call: ``` struct Kernel { int dm1; int dm2; void operator()() const; }; Kernel k; sycl_kernel_entry_point<class kernel_name>(k); ``` the corresponding offload kernel entry point function that is generated might look as follows (assuming `Kernel` is a type that requires decomposition): ``` void offload_kernel_entry_point_for_kernel_name(int dm1, int dm2) { Kernel kernel{dm1, dm2}; kernel(); } ``` Other details of the generated offload kernel entry point function, such as its name and calling convention, are implementation details that need not be reflected in the AST and may differ across target devices. For that reason, only the transformation described above is represented in the AST; other details will be filled in during code generation. These transformations are represented using new AST nodes introduced with this change. `OutlinedFunctionDecl` holds a sequence of `ImplicitParamDecl` nodes and a sequence of statement nodes that correspond to the transformed parameters and function body. `SYCLKernelCallStmt` wraps the original function body and associates it with an `OutlinedFunctionDecl` instance. 
For the example above, the AST generated for the `sycl_kernel_entry_point<kernel_name>` specialization would look as follows: ``` FunctionDecl 'sycl_kernel_entry_point<kernel_name>(Kernel)' TemplateArgument type 'kernel_name' TemplateArgument type 'Kernel' ParmVarDecl kernel 'Kernel' SYCLKernelCallStmt CompoundStmt <original statements> OutlinedFunctionDecl ImplicitParamDecl 'dm1' 'int' ImplicitParamDecl 'dm2' 'int' CompoundStmt VarDecl 'kernel' 'Kernel' <initialization of 'kernel' with 'dm1' and 'dm2'> <transformed statements with redirected references of 'kernel'> ``` Any ODR-use of the SYCL kernel entry point function will (with future changes) suffice for the offload kernel entry point to be emitted. An actual call to the SYCL kernel entry point function will result in a call to the function. However, evaluation of a `SYCLKernelCallStmt` statement is a no-op, so such calls will have no effect other than to trigger emission of the offload kernel entry point. Additionally, as a related change inspired by code review feedback, these changes disallow use of the `sycl_kernel_entry_point` attribute with functions defined with a _function-try-block_. The SYCL 2020 specification prohibits the use of C++ exceptions in device functions. Even if exceptions were not prohibited, it is unclear what the semantics would be for an exception that escapes the SYCL kernel entry point function; the boundary between host and device code could be an implicit noexcept boundary that results in program termination if violated, or the exception could perhaps be propagated to host code via the SYCL library. Pending support for C++ exceptions in device code and clear semantics for handling them at the host-device boundary, this change makes use of the `sycl_kernel_entry_point` attribute with a function defined with a _function-try-block_ an error.	2025-01-22 16:39:08 -05:00
CHANDRA GHALE	30f9a4f754	[OpenMP] codegen support for masked combined construct parallel masked taskloop simd. (#121746 ) Added codegen support for combined masked constructs `Parallel masked taskloop simd`. Added implementation for `EmitOMPParallelMaskedTaskLoopSimdDirective`. Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-01-14 18:26:46 +05:30
joaosaffran	380bb51b70	[HLSL] Adding Flatten and Branch if attributes with test fixes (#122157 ) - Adding the changes from PRs: - #116331 - #121852 - Fixes test `tools/dxil-dis/debug-info.ll` - Address some missed comments in the previous PR --------- Co-authored-by: joaosaffran <joao.saffran@microsoft.com>	2025-01-13 10:31:25 -08:00
CHANDRA GHALE	6f558e0e12	[OpenMP] codegen support for masked combined construct masked taskloop (#121914 ) Added codegen support for combined masked constructs `masked taskloop.` Added implementation for `EmitOMPMaskedTaskLoopDirective`. --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-01-13 11:42:13 +05:30
CHANDRA GHALE	1d2eea962a	[OpenMP] codegen support for masked combined construct masked taskloop simd (#121916 ) Added codegen support for combined masked constructs `masked taskloop simd`. Added implementation for `EmitOMPMaskedTaskLoopSimdDirective`. Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-01-12 23:38:00 +05:30
CHANDRA GHALE	aedb30fdc7	[OpenMP] codegen support for masked combined construct parallel masked taskloop (#121741 ) Added codegen support for combined masked constructs Parallel masked taskloop. Added implementation for EmitOMPParallelMaskedTaskLoopDirective. --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>	2025-01-09 16:38:36 +05:30
NAKAMURA Takumi	397ac44f62	[Coverage] Introduce the type `CounterPair` for RegionCounterMap. NFC. (#112724 ) `CounterPair` can hold `<uint32_t, uint32_t>` instead of the current `unsigned`, so that it can also hold the counter number of SkipPath. For now, this change provides the skeleton and only `CounterPair::Executed` is used. Each counter number can have `None` to suppress emitting the counter increment. The 2nd element `Skipped` is initialized as `None` by default, since most `Stmt`s don't have a pair of counters. This change also provides stubs for the verifier. I'll provide the impl of the verifier for `+Asserts` later. `markStmtAsUsed(bool, Stmt)` may be used to inform that the other side's counter may not be emitted. `markStmtMaybeUsed(S)` may be used to mark that the `Stmt` and its inner statements may be excluded from emission when skipped by constant folding. I put it in the places I found. `verifyCounterMap()` will check the coverage map and the counter map, and can be used to report inconsistency. These verifier methods shall be eliminated in `-Asserts` builds. https://discourse.llvm.org/t/rfc-integrating-singlebytecoverage-with-branch-coverage/82492	2025-01-09 17:11:07 +09:00
Chris B	b66f6b25cb	Revert #116331 & #121852 (#122105 )	2025-01-08 08:55:02 -06:00
erichkeane	db81e8c42e	[OpenACC] Initial sema implementation of 'update' construct This executable construct has a larger list of clauses than some of the others, plus has some additional restrictions. This patch implements the AST node, plus the 'cannot be the body of an if, while, do, switch, or label' statement restriction. Future patches will handle the rest of the restrictions, which are based on clauses.	2025-01-07 08:20:20 -08:00
erichkeane	21c785d7bd	[OpenACC] Implement 'set' construct sema The 'set' construct is another fairly simple one: it doesn't have an associated statement and has only a handful of allowed clauses. This patch implements it and all the rules for it, allowing 3 of its four clauses. The only exception is default_async, which will be implemented in a future patch, because it isn't just being enabled, it needs a complete new implementation.	2025-01-06 11:03:18 -08:00
joaosaffran	0d5c07285f	[HLSL] Adding Flatten and Branch if attributes (#116331 ) - adding Flatten and Branch to if stmt. - adding dxil control flow hint metadata generation - modifying spirv OpSelectMerge to account for the specific attributes. Closes #70112 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com> Co-authored-by: joaosaffran <joao.saffran@microsoft.com>	2025-01-06 10:27:02 -08:00
Sameer Sahasrabuddhe	df67e37e37	[clang][NFC] clean up the handling of convergence control tokens (#121738 )	2025-01-06 21:34:11 +05:30
erichkeane	4bbdb018a6	[OpenACC] Implement 'init' and 'shutdown' constructs These two constructs are very simple and similar, and only support 3 different clauses, two of which are already implemented. This patch adds AST nodes for both constructs, and leaves the device_num clause unimplemented, but enables the other two.	2024-12-19 12:21:50 -08:00


765 Commits