llvm-project

Author	SHA1	Message	Date
Bill Wendling	712d7dba4f	[Clang] Improve testing for the flexible array member (#89462 ) Testing for the name of the flexible array member isn't as robust as testing the FieldDecl pointers.	2024-04-24 19:39:33 +00:00
Utkarsh Saxena	9d8be24087	Revert "[codegen] Emit missing cleanups for stmt-expr and coro suspensions" and related commits (#88884 ) The original change caused widespread breakages in msan/ubsan tests and causes `use-after-free`. Most likely we are adding more cleanups than necessary.	2024-04-16 15:30:32 +02:00
Utkarsh Saxena	5a46123ddf	Fix missing dtor in function calls accepting trivial ABI structs (#88751 ) Fixes https://github.com/llvm/llvm-project/issues/88478 Promoting the `EHCleanup` to `NormalAndEHCleanup` in `EmitCallArgs` surfaced another bug with deactivation of normal cleanups. Here we missed emitting CPP scope ends for deactivated normal cleanups. This patch also fixes that bug. We missed emitting CPP scope ends because we remove the `fallthrough` (clears the insertion point) before deactivating normal cleanups. This is to make the emitted "normal" cleanup code unreachable. But we still need to emit CPP scope ends in the original basic block even for a deactivated normal cleanup. (This worked correctly before we did not remove `fallthrough` for `EHCleanup`s).	2024-04-16 11:01:03 +02:00
Utkarsh Saxena	89ba7e183e	[codegen] Emit missing cleanups for stmt-expr and coro suspensions [take-2] (#85398 ) Fixes https://github.com/llvm/llvm-project/issues/63818 for control flow out of an expressions. #### Background A control flow could happen in the middle of an expression due to stmt-expr and coroutine suspensions. Due to branch-in-expr, we missed running cleanups for the temporaries constructed in the expression before the branch. Previously, these cleanups were only added as `EHCleanup` during the expression and as normal expression after the full expression. Examples of such deferred cleanups include: `ParenList/InitList`: Cleanups for fields are performed by the destructor of the object being constructed. `Array init`: Cleanup for elements of an array is included in the array cleanup. `Lifetime-extended temporaries`: reference-binding temporaries in braced-init are lifetime extended to the parent scope. `Lambda capture init`: init in the lambda capture list is destroyed by the lambda object. --- #### In this PR In this PR, we change some of the `EHCleanups` cleanups to `NormalAndEHCleanups` to make sure these are emitted when we see a branch inside an expression (through statement expressions or coroutine suspensions). These are supposed to be deactivated after full expression and destroyed later as part of the destructor of the aggregate or array being constructed. To simplify deactivating cleanups, we add two utilities as well: * `DeferredDeactivationCleanupStack`: A stack to remember cleanups with deferred deactivation. * `CleanupDeactivationScope`: RAII for deactivating cleanups added to the above stack. --- #### Deactivating normal cleanups These were previously `EHCleanups` and not `Normal` and deactivation of required `Normal` cleanups had some bugs. These specifically include deactivating `Normal` cleanups which are not the top of `EHStack` [source1](`92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L1319)`), [2](`92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L722-L746)`). This has not been part of our test suite (maybe it was never required before statement expressions). In this PR, we also fix the emission of required-deactivated-normal cleanups.	2024-04-10 12:59:24 +02:00
Axel Lundberg	708c8cd743	Fix "[clang][UBSan] Add implicit conversion check for bitfields" (#87761 ) Fix since #75481 got reverted. - Explicitly set BitfieldBits to 0 to avoid uninitialized field member for the integer checks: ```diff - llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)}; + llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first), + llvm::ConstantInt::get(Builder.getInt32Ty(), 0)}; ``` - `Value *Previous` was erroneously `Value Previous` in `CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now. - Update following: ```diff - if (Kind == CK_IntegralCast) { + if (Kind == CK_IntegralCast \|\| Kind == CK_LValueToRValue) { ``` CK_LValueToRValue when going from, e.g., char to char, and CK_IntegralCast otherwise. - Make sure that `Value *Previous = nullptr;` is initialized (see `1189e87951`) - Add another extensive testcase `ubsan/TestCases/ImplicitConversion/bitfield-conversion.c` --------- Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>	2024-04-08 12:30:27 -07:00
Vitaly Buka	029e1d7515	Revert "Revert "Revert "[clang][UBSan] Add implicit conversion check for bitfields""" (#87562 ) Reverts llvm/llvm-project#87529 Reverts #87518 https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken	2024-04-03 15:19:03 -07:00
Vitaly Buka	8a5a1b7704	Revert "Revert "[clang][UBSan] Add implicit conversion check for bitfields"" (#87529 ) Reverts llvm/llvm-project#87518 Revert is not needed as the regression was fixed with 1189e87951e59a81ee097eae847c06008276fef1. I assumed the crash and warning are different issues, but according to https://lab.llvm.org/buildbot/#/builders/240/builds/26629 fixing warning resolves the crash.	2024-04-03 10:58:39 -07:00
Vitaly Buka	5822ca5a01	Revert "[clang][UBSan] Add implicit conversion check for bitfields" (#87518 ) Reverts llvm/llvm-project#75481 Breaks multiple bots, see #75481	2024-04-03 10:27:09 -07:00
Axel Lundberg	450f1952ac	[clang][UBSan] Add implicit conversion check for bitfields (#75481 ) This patch implements the implicit truncation and implicit sign change checks for bitfields using UBSan. E.g., `-fsanitize=implicit-bitfield-truncation` and `-fsanitize=implicit-bitfield-sign-change`.	2024-04-03 08:55:03 -04:00
Nathan Gauër	0f61051f54	[clang][HLSL][SPRI-V] Add convergence intrinsics (#80680 ) HLSL has wave operations and other kind of function which required the control flow to either be converged, or respect certain constraints as where and how to re-converge. At the HLSL level, the convergence are mostly obvious: the control flow is expected to re-converge at the end of a scope. Once translated to IR, HLSL scopes disapear. This means we need a way to communicate convergence restrictions down to the backend. For this, the SPIR-V backend uses convergence intrinsics. So this commit adds some code to generate convergence intrinsics when required. --------- Signed-off-by: Nathan Gauër <brioche@google.com>	2024-03-28 17:18:05 +01:00
Akira Hatanaka	84780af4b0	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was reverted because it broke ubsan bots. There seems to be a bug in coroutine code-gen, which is causing EmitTypeCheck to use the wrong alignment. For now, pass alignment zero to EmitTypeCheck so that it can compute the correct alignment based on the passed type (see function EmitCXXMemberOrOperatorMemberCallExpr).	2024-03-28 06:54:36 -07:00
Akira Hatanaka	f75eebab88	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 )" (#86898 ) This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c. The commit broke ubsan bots.	2024-03-27 18:14:04 -07:00
Akira Hatanaka	d9a685a9dd	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit broke msan bots because LValue::IsKnownNonNull was uninitialized.	2024-03-27 12:24:49 -07:00
Akira Hatanaka	b311756450	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 )" (#86674 ) This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6. It appears that the commit broke msan bots.	2024-03-26 07:37:57 -07:00
Akira Hatanaka	8bd1f9116a	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects.	2024-03-25 18:05:42 -07:00
mikaoP	4e3310a813	[clang] Fix OMPT ident flag in combined distribute parallel for pragma (#80987 ) Authored-by: Raúl Peñacoba Veigas <rpenacob@bsc.es>	2024-03-12 07:50:35 -04:00
fpasserby	f786881340	[coroutine] Implement llvm.coro.await.suspend intrinsic (#79712 ) Implement `llvm.coro.await.suspend` intrinsics, to deal with performance regression after prohibiting `.await_suspend` inlining, as suggested in #64945. Actually, there are three new intrinsics, which directly correspond to each of three forms of `await_suspend`: ``` void llvm.coro.await.suspend.void(ptr %awaiter, ptr %frame, ptr @wrapperFunction) i1 llvm.coro.await.suspend.bool(ptr %awaiter, ptr %frame, ptr @wrapperFunction) ptr llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction) ``` There are three different versions instead of one, because in `bool` case it's result is used for resuming via a branch, and in `coroutine_handle` case exceptions from `await_suspend` are handled in the coroutine, and exceptions from the subsequent `.resume()` are propagated to the caller. Await-suspend block is simplified down to intrinsic calls only, for example for symmetric transfer: ``` %id = call token @llvm.coro.save(ptr null) %handle = call ptr @llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction) call void @llvm.coro.resume(%handle) %result = call i8 @llvm.coro.suspend(token %id, i1 false) switch i8 %result, ... ``` All await-suspend logic is moved out into a wrapper function, generated for each suspension point. The signature of the function is `<type> wrapperFunction(ptr %awaiter, ptr %frame)` where `<type>` is one of `void` `i1` or `ptr`, depending on the return type of `await_suspend`. Intrinsic calls are lowered during `CoroSplit` pass, right after the split. Because I'm new to LLVM, I'm not sure if the helper function generation, calls to them and lowering are implemented in the right way, especially with regard to various metadata and attributes, i. e. for TBAA. All things that seemed questionable are marked with `FIXME` comments. There is another detail: in case of symmetric transfer raw pointer to the frame of coroutine, that should be resumed, is returned from the helper function and a direct call to `@llvm.coro.resume` is generated. C++ standard demands, that `.resume()` method is evaluated. Not sure how important is this, because code has been generated in the same way before, sans helper function.	2024-03-11 10:00:00 +08:00
gulfemsavrun	23f895f656	[InstrProf] Single byte counters in coverage (#75425 ) This patch inserts 1-byte counters instead of an 8-byte counters into llvm profiles for source-based code coverage. The origial idea was proposed as block-cov for PGO, and this patch repurposes that idea for coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4 The current 8-byte counters mechanism add counters to minimal regions, and infer the counters in the remaining regions via adding or subtracting counters. For example, it infers the counter in the if.else region by subtracting the counters between if.entry and if.then regions in an if statement. Whenever there is a control-flow merge, it adds the counters from all the incoming regions. However, we are not going to be able to infer counters by subtracting two execution counts when using single-byte counters. Therefore, this patch conservatively inserts additional counters for the cases where we need to add or subtract counters. RFC: https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685	2024-02-26 14:44:55 -08:00
Farzon Lotfi	82acec15af	[HLSL] Implementation of dot intrinsic (#81190 ) This change implements https://github.com/llvm/llvm-project/issues/70073 HLSL has a dot intrinsic defined here: https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-dot The intrinsic itself is defined as a HLSL_LANG LangBuiltin in Builtins.td. This is used to associate all the dot product typdef defined hlsl_intrinsics.h with a single intrinsic check in CGBuiltin.cpp & SemaChecking.cpp. In IntrinsicsDirectX.td we define the llvmIR for the dot product. A few goals were in mind for this IR. First it should operate on only vectors. Second the return type should be the vector element type. Third the second parameter vector should be of the same size as the first parameter. Finally `a dot b` should be the same as `b dot a`. In CGBuiltin.cpp hlsl has built on top of existing clang intrinsics via EmitBuiltinExpr. Dot product though is language specific intrinsic and so is guarded behind getLangOpts().HLSL. The call chain looks like this: EmitBuiltinExpr -> EmitHLSLBuiltinExp EmitHLSLBuiltinExp dot product intrinsics makes a destinction between vectors and scalars. This is because HLSL supports dot product on scalars which simplifies down to multiply. Sema.h & SemaChecking.cpp saw the addition of CheckHLSLBuiltinFunctionCall, a language specific semantic validation that can be expanded for other hlsl specific intrinsics. Fixes #70073	2024-02-26 10:08:59 -06:00
Pavel Iliin	568babab7e	[AArch64] Implement __builtin_cpu_supports, compiler-rt tests. (#82378 ) The patch complements https://github.com/llvm/llvm-project/pull/68919 and adds AArch64 support for builtin `__builtin_cpu_supports("feature1+...+featureN")` which return true if all specified CPU features in argument are detected. Also compiler-rt aarch64 native run tests for features detection mechanism were added and 'cpu_model' check was fixed after its refactor merged https://github.com/llvm/llvm-project/pull/75635 Original RFC was https://reviews.llvm.org/D153153	2024-02-22 23:33:54 +00:00
Erich Keane	f655778300	[OpenACC] Implement AST for OpenACC Compute Constructs (#81188 ) 'serial', 'parallel', and 'kernel' constructs are all considered 'Compute' constructs. This patch creates the AST type, plus the required infrastructure for such a type, plus some base types that will be useful in the future for breaking this up. The only difference between the three is the 'kind'( plus some minor clause legalization rules, but those can be differentiated easily enough), so rather than representing them as separate AST nodes, it seems to make sense to make them the same. Additionally, no clause AST functionality is being implemented yet, as that fits better in a separate patch, and this is enough to get the 'naked' constructs implemented. This is otherwise an 'NFC' patch, as it doesn't alter execution at all, so there aren't any tests. I did this to break up the review workload and to get feedback on the layout.	2024-02-13 06:02:13 -08:00
Vlad Serebrennikov	35737beaef	[clang][NFC] Annotate `CodeGenFunction.h` with `preferred_type` This helps debuggers to display values in bit-fields in a more helpful way.	2024-02-11 12:11:49 +03:00
Bill Wendling	00b6d032a2	[Clang] Implement the 'counted_by' attribute (#76348 ) The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. For example: ``` struct bar; struct foo { int count; struct inner { struct { int count; /* The 'count' referenced by 'counted_by' / }; struct { / ... / struct bar array[] __attribute__((counted_by(count))); }; } baz; }; ``` This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': ``` struct bar; struct foo { size_t count; /* ... / struct bar array[] __attribute__((counted_by(count))); }; ``` This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have at least 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: ``` struct foo p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count sizeof(struct bar ))); p->count = count + 42; } ``` The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: ``` void use_foo(int index, int val) { p->count += 42; p->array[index] = val; / The sanitizer can't properly check this access */ } ``` In this example, an update to 'p->count' maintains the relationship requirement: ``` void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; } ```	2024-01-16 14:26:12 -08:00
Rashmi Mudduluru	a511c1a9ec	Revert "[Clang] Implement the 'counted_by' attribute (#76348 )" This reverts commit 164f85db876e61cf4a3c34493ed11e8f5820f968.	2024-01-15 18:37:52 -08:00
Bill Wendling	164f85db87	[Clang] Implement the 'counted_by' attribute (#76348 ) The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. For example: ``` struct bar; struct foo { int count; struct inner { struct { int count; /* The 'count' referenced by 'counted_by' / }; struct { / ... / struct bar array[] __attribute__((counted_by(count))); }; } baz; }; ``` This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': ``` struct bar; struct foo { size_t count; /* ... / struct bar array[] __attribute__((counted_by(count))); }; ``` This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have at least 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: ``` struct foo p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count sizeof(struct bar ))); p->count = count + 42; } ``` The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: ``` void use_foo(int index, int val) { p->count += 42; p->array[index] = val; / The sanitizer can't properly check this access */ } ``` In this example, an update to 'p->count' maintains the relationship requirement: ``` void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; } ```	2024-01-10 22:20:31 -08:00
Nico Weber	2dce77201c	Revert "[Clang] Implement the 'counted_by' attribute (#76348 )" This reverts commit fefdef808c230c79dca2eb504490ad0f17a765a5. Breaks check-clang, see https://github.com/llvm/llvm-project/pull/76348#issuecomment-1886029515 Also revert follow-on "[Clang] Update 'counted_by' documentation" This reverts commit 4a3fb9ce27dda17e97341f28005a28836c909cfc.	2024-01-10 21:05:19 -05:00
Bill Wendling	fefdef808c	[Clang] Implement the 'counted_by' attribute (#76348 ) The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. For example: ``` struct bar; struct foo { int count; struct inner { struct { int count; /* The 'count' referenced by 'counted_by' / }; struct { / ... / struct bar array[] __attribute__((counted_by(count))); }; } baz; }; ``` This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': ``` struct bar; struct foo { size_t count; /* ... / struct bar array[] __attribute__((counted_by(count))); }; ``` This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have at least 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: ``` struct foo p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count sizeof(struct bar ))); p->count = count + 42; } ``` The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: ``` void use_foo(int index, int val) { p->count += 42; p->array[index] = val; / The sanitizer can't properly check this access */ } ``` In this example, an update to 'p->count' maintains the relationship requirement: ``` void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; } ```	2024-01-10 15:21:10 -08:00
Alan Phipps	8b2bdfbca7	[Coverage][clang] Enable MC/DC Support in LLVM Source-based Code Coverage (3/3) Part 3 of 3. This includes the MC/DC clang front-end components. Differential Revision: https://reviews.llvm.org/D138849	2024-01-04 12:29:18 -06:00
Bill Wendling	cca4d6cfd2	Revert counted_by attribute feature (#75857 ) There are many issues that popped up with the counted_by feature. The patch #73730 has grown too large and approval is blocking Linux testing. Includes reverts of: commit 769bc11f684d ("[Clang] Implement the 'counted_by' attribute (#68750)") commit bc09ec696209 ("[CodeGen] Revamp counted_by calculations (#70606)") commit 1a09cfb2f35d ("[Clang] counted_by attr can apply only to C99 flexible array members (#72347)") commit a76adfb992c6 ("[NFC][Clang] Refactor code to calculate flexible array member size (#72790)") commit d8447c78ab16 ("[Clang] Correct handling of negative and out-of-bounds indices (#71877)") Partial commit b31cd07de5b7 ("[Clang] Regenerate test checks (NFC)") Closes #73168 Closes #75173	2023-12-18 15:16:09 -08:00
Bill Wendling	a76adfb992	[NFC][Clang] Refactor code to calculate flexible array member size (#72790 ) The code that calculates the flexible array member size is big enough to warrant its own method.	2023-11-19 19:25:10 -08:00
Joseph Huber	237adfca4e	[OpenMP] Rework handling of global ctor/dtors in OpenMP (#71739 ) Summary: This patch reworks how we handle global constructors in OpenMP. Previously, we emitted individual kernels that were all registered and called individually. In order to provide more generic support, this patch moves all handling of this to the target backend and the runtime plugin. This has the benefit of supporting the GNU extensions for constructors an destructors, removing a class of failures related to shared library destruction order, and allows targets other than OpenMP to use the same support without needing to change the frontend. This is primarily done by calling kernels that the backend emits to iterate a list of ctor / dtor functions. For x64, this is automatic and we get it for free with the standard `dlopen` handling. For AMDGPU, we emit `amdgcn.device.init` and `amdgcn.device.fini` functions which handle everything atuomatically and simply need to be called. For NVPTX, a patch https://github.com/llvm/llvm-project/pull/71549 provides the kernels to call, but the runtime needs to set up the array manually by pulling out all the known constructor / destructor functions. One concession that this patch requires is the change that for GPU targets in OpenMP offloading we will use `llvm.global_dtors` instead of using `atexit`. This is because `atexit` is a separate runtime function that does not mesh well with the handling we're trying to do here. This should be equivalent in all cases except for cases where we would need to destruct manually such as: ``` struct S { ~S() { foo(); } }; void foo() { static S s; } ``` However this is broken in many other ways on the GPU, so it is not regressing any support, simply increasing the scope of what we can handle. This changes the handling of ctors / dtors. This patch now outputs a information message regarding the deprecation if the old format is used. This will be completely removed in a later release. Depends on: https://github.com/llvm/llvm-project/pull/71549	2023-11-10 14:53:53 -06:00
Bill Wendling	bc09ec6962	[CodeGen] Revamp counted_by calculations (#70606 ) Break down the counted_by calculations so that they correctly handle anonymous structs, which are specified internally as IndirectFieldDecls. Improves the calculation of __bdos on a different field member in the struct. And also improves support for __bdos in an index into the FAM. If the index is further out than the length of the FAM, then we return __bdos's "can't determine the size" value (zero or negative one, depending on type). Also simplify the code to use helper methods to get the field referenced by counted_by and the flexible array member itself, which also had some issues with FAMs in sub-structs.	2023-11-09 10:18:17 -08:00
Pravin Jagtap	1f21e49870	Revert "Revert "[AMDGPU] const-fold imm operands of (#71669 ) amdgcn_update_dpp intrinsic (#71139)"" This reverts commit d1fb9307951319eea3e869d78470341d603c8363 and fixes the lit test clang/test/CodeGenHIP/dpp-const-fold.hip --------- Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>	2023-11-09 10:09:22 +05:30
Mitch Phillips	d1fb930795	Revert "[AMDGPU] const-fold imm operands of amdgcn_update_dpp intrinsic (#71139 )" This reverts commit 32a3f2afe6ea7ffb02a6a188b123ded6f4c89f6c. Reason: Broke the sanitizer buildbots. More details at `32a3f2afe6`	2023-11-08 12:50:53 +01:00
Pravin Jagtap	32a3f2afe6	[AMDGPU] const-fold imm operands of amdgcn_update_dpp intrinsic (#71139 ) Operands of `__builtin_amdgcn_update_dpp` need to evaluate to constant to match the intrinsic requirements. Fixes: SWDEV-426822, SWDEV-431138 --------- Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>	2023-11-08 15:09:10 +05:30
Kerry McLaughlin	48fb8ee081	[Clang][SME2] Add multi-vector add/sub builtins (#69725 ) Adds the following SME2 builtins: - sv(add\|sub) - sv(add\|sub)_za32/za64, - sv(add\|sub)_write_za32/za64 Other changes in this patch: - CGBuiltin.cpp: The GetAArch64SMEProcessedOperands function is created to avoid duplicating existing code from EmitAArch64SVEBuiltinExpr. - arm_sve.td: The add/sub SME2 builtins which do not operate on ZA have been added to arm_sve.td, matching the corrosponding LLVM IR intrinsic names which start with @llvm.aarch64.sve for this reason. - SveEmitter.cpp: Adds the createCoreHeaderIntrinsics function to remove duplicated code in createHeader & createSMEHeader. Uses a new enum (ACLEKind) to choose either "__builtin_sme_" or "__builtin_sve_" when emitting the intrinsics. See https://github.com/ARM-software/acle/pull/217/files	2023-11-07 15:42:43 +00:00
Kerry McLaughlin	8f59c168a9	[AArch64][Clang] Refactor code to emit SVE & SME builtins (#70959 ) This patch removes duplicated code in EmitAArch64SVEBuiltinExpr and EmitAArch64SMEBuiltinExpr by creating a new function called GetAArch64SVEProcessedOperands which handles splitting up multi-vector arguments using vector extracts. These changes are non-functional.	2023-11-02 15:47:37 +00:00
Kerry McLaughlin	e2550b7aa0	Revert "[AArch64][Clang] Refactor code to emit SVE & SME builtins (#70662 )" This reverts commit c34efe3c2734629b925d9411b3c86a710911a93a.	2023-11-01 15:57:14 +00:00
Kerry McLaughlin	c34efe3c27	[AArch64][Clang] Refactor code to emit SVE & SME builtins (#70662 ) This patch removes duplicated code in EmitAArch64SVEBuiltinExpr and EmitAArch64SMEBuiltinExpr by creating a new function called GetAArch64SVEProcessedOperands which handles splitting up multi-vector arguments using vector extracts. These changes are non-functional.	2023-11-01 15:21:08 +00:00
Caroline Concatto	7cad5a9eb4	[Clang][SVE2.1] Add svpext builtins As described in: https://github.com/ARM-software/acle/pull/257 Reviewed By: hassnaa-arm Differential Revision: https://reviews.llvm.org/D151081	2023-10-17 16:15:22 +00:00
Sam Tebbs	4b8f23e93d	[AArch64][SME] Remove immediate argument restriction for svldr and svstr (#68908 ) The svldr_vnum_za and svstr_vnum_za builtins/intrinsics currently require that the vnum argument be an immediate, but since vnum is used to modify the base register via a mul and add, that restriction is not necessary. This patch removes that restriction.	2023-10-17 16:02:36 +01:00
Bill Wendling	769bc11f68	[Clang] Implement the 'counted_by' attribute (#68750 ) The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member in the same structure holding the count of elements in the flexible array. This information can be used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. This example specifies the that the flexible array member 'array' has the number of elements allocated for it in 'count': struct bar; struct foo { size_t count; /* ... / struct bar array[] __attribute__((counted_by(count))); }; This establishes a relationship between 'array' and 'count', specifically that 'p->array' must have at least 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained through changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: struct foo p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count sizeof(struct bar ))); p->count = count + 42; } The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: struct foo p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar ))); p->count = count + 42; } void use_foo(int index) { p->count += 42; p->array[index] = 0; / The sanitizer cannot properly check this access */ } Reviewed By: nickdesaulniers, aaron.ballman Differential Revision: https://reviews.llvm.org/D148381	2023-10-14 04:18:02 -07:00
alexfh	67b675ee55	Revert "[Clang] Implement the 'counted_by' attribute" (#68603 ) This reverts commit 9a954c693573281407f6ee3f4eb1b16cc545033d, which causes clang crashes when compiling with `-fsanitize=bounds`. See `9a954c6935 (commitcomment-129529574)` for details.	2023-10-09 20:53:48 +02:00
Bill Wendling	9a954c6935	[Clang] Implement the 'counted_by' attribute The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member in the same structure holding the count of elements in the flexible array. This information can be used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. This example specifies the that the flexible array member 'array' has the number of elements allocated for it in 'count': struct bar; struct foo { size_t count; /* ... / struct bar array[] __attribute__((counted_by(count))); }; This establishes a relationship between 'array' and 'count', specifically that 'p->array' must have at least 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained through changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: struct foo p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count sizeof(struct bar ))); p->count = count + 42; } The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: struct foo p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar ))); p->count = count + 42; } void use_foo(int index) { p->count += 42; p->array[index] = 0; / The sanitizer cannot properly check this access */ } Reviewed By: nickdesaulniers, aaron.ballman Differential Revision: https://reviews.llvm.org/D148381	2023-10-04 18:26:15 -07:00
Corentin Jabot	af4751738d	[C++] Implement "Deducing this" (P0847R7) This patch implements P0847R7 (partially), CWG2561 and CWG2653. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D140828	2023-10-02 14:33:02 +02:00
Chuanqi Xu	572cc8d38f	Revert "[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty" This reverts commit 9d9c25f81456aace2bec4b58498a420e650007d9. This reverts commit 19ab2664ad3182ffa8fe3a95bb19765e4ae84653. This reverts commit c4672454743e942f148a1aff1e809dae73e464f6. As the issue https://github.com/llvm/llvm-project/issues/65018 shows, the previous fix introduce a regression actually. So this commit reverts the fix by our policies.	2023-08-28 13:21:17 +08:00
Kazu Hirata	5ab7c285fb	[CodeGen] Modernize PeepholeProtection (NFC)	2023-08-27 09:24:28 -07:00
Fangrui Song	7a41af8604	[X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute GCC 12 (https://gcc.gnu.org/PR101696) allows `arch=x86-64` `arch=x86-64-v2` `arch=x86-64-v3` `arch=x86-64-v4` in the target_clones function attribute. This patch ports the feature. * Set KeyFeature to `x86-64{,-v2,-v3,-v4}` in `Processors[]`, to be used by X86TargetInfo::multiVersionSortPriority * builtins: change `__cpu_features2` to an array like libgcc. Define `FEATURE_X86_64_{BASELINE,V2,V3,V4}` and depended ISA feature bits. * CGBuiltin.cpp: update EmitX86CpuSupports to handle `arch=x86-64*`. Close https://github.com/llvm/llvm-project/issues/55830 Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D158329	2023-08-23 22:08:55 -07:00
Manna, Soumi	30c60ec52f	[NFC][CLANG] Fix static analyzer bugs about large copy by values Static Analyzer Tool complains about a large function call parameter which is is passed by value in CGBuiltin.cpp file. 1. In CodeGenFunction::EmitSMELdrStr(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value > &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value. 2. In CodeGenFunction::EmitSMEZero(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value > &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value. 3. In CodeGenFunction::EmitSMEReadWrite(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value > &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value. 4. In CodeGenFunction::EmitSMELd1St1(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value > &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value. I see many places in CGBuiltin.cpp file, we are passing parameter TypeFlags of type clang::SVETypeFlags by reference. clang::SVETypeFlags inherits several other types. This patch passes parameter TypeFlags by reference instead of by value in the function. Reviewed By: tahonermann, sdesmalen Differential Revision: https://reviews.llvm.org/D158522	2023-08-23 07:57:04 -07:00
Chuanqi Xu	c467245474	[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty Close https://github.com/llvm/llvm-project/issues/56301 Close https://github.com/llvm/llvm-project/issues/64151 See the summary and the discussion of https://reviews.llvm.org/D157070 to get the full context. As @rjmccall pointed out, the key point of the root cause is that currently we didn't implement the semantics for '@llvm.coro.save' well ("after the await-ready returns false, the coroutine is considered to be suspended ") well. Since the semantics implies that we (the compiler) shouldn't write the spills into the coroutine frame in the await_suspend. But now it is possible due to some combinations of the optimizations so the semantics are broken. And the inlining is the root optimization of such optimizations. So in this patch, we tried to add the `noinline` attribute to the await_suspend call. Also as an optimization, we don't add the `noinline` attribute to the await_suspend call if the awaiter is an empty class. This should be correct since the programmers can't access the local variables in await_suspend if the awaiter is empty. I think this is necessary for the performance since it is pretty common. Another potential optimization is: call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle, ptr @awaitSuspendFn) Then it is much easier to perform the safety analysis in the middle end. If it is safe to inline the call to awaitSuspend, we can replace it in the CoroEarly pass. Otherwise we could replace it in the CoroSplit pass. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D157833	2023-08-22 09:56:44 +08:00

1 2 3 4 5 ...

1650 Commits