llvm-project

Author	SHA1	Message	Date
Nick Sarnie	93f61ceadb	[clang][NFC] Fix some more incorrectly formatted comments (#138342 ) More fixes based on https://github.com/llvm/llvm-project/pull/138036 --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>	2025-05-07 14:25:08 +00:00
Craig Topper	123758b1f4	[IRBuilder] Add versions of createInsertVector/createExtractVector that take a uint64_t index. (#138324 ) Most callers want a constant index. Instead of making every caller create a ConstantInt, we can do it in IRBuilder. This is similar to createInsertElement/createExtractElement.	2025-05-02 16:10:18 -07:00
Nick Sarnie	aa4b44e699	[clang][NFC] Fix some clang-format mistakes (#138036 ) Fixes for https://github.com/llvm/llvm-project/pull/138000 Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>	2025-05-02 14:12:35 +00:00
Nick Sarnie	92b03e4f04	[clang][NFC] Format two files with CallingConv switches (#138000 ) I'm planning on modifying this code so format it so we can pass the formatting check. Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>	2025-04-30 20:52:07 +00:00
Stephen Tozer	2d00c73003	[Clang][CodeGen] Emit fake uses before musttail calls (#136867 ) Fixes the issue reported following the merge of #118026. When a valid `musttail` call is made, the function it is made from must return immediately after the call; if there are any cleanups left in the function, then an error is triggered. This is not necessary for fake uses however - it is perfectly valid to simply emit the fake use "cleanup" code before the tail call, and indeed LLVM will automatically move any fake uses following a tail call to come before the tail call. Therefore, this patch specifically choose to handle fake use cleanups when a musttail call is present by simply emitting them immediately before the call.	2025-04-25 11:47:38 +01:00
Oliver Hunt	5b16941f57	[clang] Ensure correct copying of records with authenticated fields (#136783 ) When records contain fields with pointer authentication, even simple copies can require additional work be performed. This patch contains the core functionality required to handle user defined structs, as well as the implicitly constructed structs for blocks, etc. Co-authored-by: Ahmed Bougacha Co-authored-by: Akira Hatanaka Co-authored-by: John Mccall	2025-04-24 16:22:50 -07:00
Tom Honermann	0348ff5158	[SYCL] Basic code generation for SYCL kernel caller offload entry point functions. (#133030 ) A function declared with the `sycl_kernel_entry_point` attribute, sometimes called a SYCL kernel entry point function, specifies a pattern from which the parameters and body of an offload entry point function, sometimes called a SYCL kernel caller function, are derived. SYCL kernel caller functions are emitted during SYCL device compilation. Their parameters and body are derived from the `SYCLKernelCallStmt` statement and `OutlinedFunctionDecl` declaration associated with their corresponding SYCL kernel entry point function. A distinct SYCL kernel caller function is generated for each SYCL kernel entry point function defined as a non-inline function or ODR-used in the translation unit. The name of each SYCL kernel caller function is parameterized by the SYCL kernel name type specified by the `sycl_kernel_entry_point` attribute attached to the corresponding SYCL kernel entry point function. For the moment, the Itanium ABI mangled name for typeinfo data (`_ZTS<type>`) is used to name these functions; a future change will switch to a more appropriate naming scheme. The calling convention used for a SYCL kernel caller function is target dependent. Support for AMDGCN, NVPTX, and SPIR targets is currently provided. These functions are required to observe the language restrictions for SYCL devices as specified by the SYCL 2020 specification; this includes a forward progress guarantee and prohibits recursion. Only SYCL kernel caller functions, functions declared as `SYCL_EXTERNAL`, and functions directly or indirectly referenced from those functions should be emitted during device compilation. Pruning of other declarations has not yet been implemented. --------- Co-authored-by: Elizabeth Andrews <elizabeth.andrews@intel.com>	2025-04-17 09:14:45 -04:00
Tom Honermann	aca710ac36	[NFC][Clang] Introduce type aliases to replace use of auto in clang/lib/CodeGen/CGCall.cpp. (#135861 ) CGCall.cpp declares several functions with a return type that is an explicitly spelled out specialization of `SmallVector`. Previously, `auto` was used in several places to avoid repeating the long type name; a use that Clang maintainers find unjustified. This change introduces type aliases and replaces the existing uses of `auto` with the corresponding alias name.	2025-04-16 11:28:09 -04:00
Aniket Lal	642481a428	[Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (#115821 ) This feature is currently not supported in the compiler. To facilitate this we emit a stub version of each kernel function body with different name mangling scheme, and replaces the respective kernel call-sites appropriately. Fixes https://github.com/llvm/llvm-project/issues/60313 D120566 was an earlier attempt made to upstream a solution for this issue. --------- Co-authored-by: anikelal <anikelal@amd.com>	2025-04-08 10:29:30 +05:30
Nikita Popov	b384d6d6cc	[CodeGen] Don't include CGDebugInfo.h in CodeGenFunction.h (NFC) (#134100 ) This is an expensive header, only include it where needed. Move some functions out of line to achieve that. This reduces time to build clang by ~0.5% in terms of instructions retired.	2025-04-03 08:04:19 +02:00
Alan Zhao	c5b3fe2094	[clang] Automatically add the `returns_twice` attribute to certain functions even if `-fno-builtin` is set (#133511 ) Certain functions require the `returns_twice` attribute in order to produce correct codegen. However, `-fno-builtin` removes all knowledge of functions that require this attribute, so this PR modifies Clang to add the `returns_twice` attribute even if `-fno-builtin` is set. This behavior is also consistent with what GCC does. It's not (easily) possible to get the builtin information from `Builtins.td` because `-fno-builtin` causes Clang to never initialize any builtins, so functions never get tokenized as functions/builtins that require `returns_twice`. Therefore, the most straightforward solution is to explicitly hard code the function names that require `returns_twice`. Fixes #122840	2025-03-31 09:42:34 -07:00
Kazu Hirata	d3c10a3897	[CodeGen] Use llvm::reverse (NFC) (#133550 )	2025-03-28 19:55:32 -07:00
Liberty	c4ed0ad1f5	[Clang] Fix typo 'dereferencable' to 'dereferenceable' (#116761 ) This patch corrects the typo 'dereferencable' to 'dereferenceable' in CGCall.cpp. The typo is located within a comment inside the `void CodeGenModule::ConstructAttributeList` function.	2025-03-08 19:35:20 +00:00
Brandon Wu	c804e86f55	[RISCV][VLS] Support RISCV VLS calling convention (#100346 ) This patch adds a function attribute `riscv_vls_cc` for RISCV VLS calling convention which takes 0 or 1 argument, the argument is the `ABI_VLEN` which is the `VLEN` for passing the fixed-vector arguments, it wraps the argument as a scalable vector(VLA) using the `ABI_VLEN` and uses the corresponding mechanism to handle it. The range of `ABI_VLEN` is [32, 65536], if not specified, the default value is 128. Here is an example of VLS argument passing: Non-VLS call: ``` void original_call(__attribute__((vector_size(16))) int arg) {} => define void @original_call(i128 noundef %arg) { entry: ... ret void } ``` VLS call: ``` void __attribute__((riscv_vls_cc(256))) vls_call(__attribute__((vector_size(16))) int arg) {} => define riscv_vls_cc void @vls_call(<vscale x 1 x i32> %arg) { entry: ... ret void } } ``` The first Non-VLS call passes generic vector argument of 16 bytes by flattened integer. On the contrary, the VLS call uses `ABI_VLEN=256` which wraps the vector to <vscale x 1 x i32> where the number of scalable vector elements is calaulated by: `ORIG_ELTS * RVV_BITS_PER_BLOCK / ABI_VLEN`. Note: ORIG_ELTS = Vector Size / Type Size = 128 / 32 = 4. PsABI PR: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/418 C-API PR: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/68	2025-03-03 12:39:35 +08:00
Alois Klink	79a28aa0a4	[clang] Ignore GCC 11 [[malloc(x)]] attribute Ignore the `[[malloc(x)]]` or `[[malloc(x, 1)]]` function attribute syntax added in [GCC 11][1] and print a warning instead of an error. Unlike `[[malloc]]` with no arguments (which is supported by Clang), GCC uses the one or two argument form to specify a deallocator for GCC's static analyzer. Code currently compiled with `[[malloc(x)]]` or `__attribute((malloc(x)))` fails with the following error: `'malloc' attribute takes no arguments`. [1]: https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;f=gcc/doc/extend.texi;h=dce6c58db87ebf7f4477bd3126228e73e4eeee97#patch6 Fixes: https://github.com/llvm/llvm-project/issues/51607 Partial-Bug: https://github.com/llvm/llvm-project/issues/53152	2025-02-27 08:06:58 -08:00
Alex Voicu	c8f70d7286	[clang][CodeGen] Additional fixes for #114062 (#128166 ) This addresses two issues introduced by moving indirect args into an explicit AS (please see <https://github.com/llvm/llvm-project/pull/114062#issuecomment-2659829790> and <https://github.com/llvm/llvm-project/pull/114062#issuecomment-2661158477>): 1. Unconditionally stripping casts from a pre-allocated return slot was incorrect / insufficient (this is illustrated by the `amdgcn_sret_ctor.cpp` test); 2. Putting compiler manufactured sret args in a non default AS can lead to a C-cast (surprisingly) requiring an AS cast (this is illustrated by the `sret_cast_with_nonzero_alloca_as.cpp test). The way we handle (2), by subverting CK_BitCast emission iff a sret arg is involved, is quite naff, but I couldn't think of any other way to use a non default indirect AS and make this case work (hopefully this is a failure of imagination on my part).	2025-02-27 09:03:17 +07:00
Alex Voicu	a7a356833d	[NFC][Clang][CodeGen] Remove vestigial assertion (#127528 ) This removes a vestigial assertion, which would erroneously trigger even though we now correctly handle valid arg mismatches (<`2dda529838/clang/lib/CodeGen/CGCall.cpp (L5397)`>), after #114062 went in.	2025-02-17 22:05:22 +02:00
Alex Voicu	39ec9de7c2	[clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (#114062 ) `sret` arguments are always going to reside in the stack/`alloca` address space, which makes the current formulation where their AS is derived from the pointee somewhat quaint. This patch ensures that `sret` ends up pointing to the `alloca` AS in IR function signatures, and also guards agains trying to pass a casted `alloca`d pointer to a `sret` arg, which can happen for most languages, when compiled for targets that have a non-zero `alloca` AS (e.g. AMDGCN) / map `LangAS::default` to a non-zero value (SPIR-V). A target could still choose to do something different here, by e.g. overriding `classifyReturnType` behaviour. In a broader sense, this patch extends non-aliased indirect args to also carry an AS, which leads to changing the `getIndirect()` interface. At the moment we're only using this for (indirect) returns, but it allows for future handling of indirect args themselves. We default to using the AllocaAS as that matches what Clang is currently doing, however if, in the future, a target would opt for e.g. placing indirect returns in some other storage, with another AS, this will require revisiting. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>	2025-02-14 11:20:45 +00:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Wolfgang Pieb	4424c44c8c	[Clang] Add fake use emission to Clang with -fextend-lifetimes (#110102 ) Following the previous patch which adds the "extend lifetimes" flag without (almost) any functionality, this patch adds the real feature by allowing Clang to emit fake uses. These are emitted as a new form of cleanup, set for variable addresses, which just emits a fake use intrinsic when the variable falls out of scope. The code for achieving this is simple, with most of the logic centered on determining whether to emit a fake use for a given address, and on ensuring that fake uses are ignored in a few cases. Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2025-01-28 12:30:31 +00:00
Kazu Hirata	35f9d2ac49	[CodeGen] Migrate away from PointerUnion::dyn_cast (NFC) (#122778 ) Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect Prototype.P to be nonnull.	2025-01-13 20:53:13 -08:00
Sander de Smalen	b4ce29ab31	[AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (#121788 ) This adds support for parsing the attribute and codegen to map it to "aarch64_za_state_agnostic" LLVM IR attribute. This attribute is described in the Arm C Language Extensions (ACLE) document: https://github.com/ARM-software/acle/blob/main/main/acle.md#__arm_agnostic	2025-01-12 21:35:44 +00:00
Timm Baeder	cfe26358e3	Reapply "[clang] Avoid re-evaluating field bitwidth" (#122289 )	2025-01-11 07:12:37 +01:00
Thurston Dang	55b587506e	[ubsan][NFCI] Use SanitizerOrdinal instead of SanitizerMask for EmitCheck (exactly one sanitizer is required) (#122511 ) The `Checked` parameter of `CodeGenFunction::EmitCheck` is of type `ArrayRef<std::pair<llvm::Value *, SanitizerMask>>`, which is overly generalized: SanitizerMask can denote that zero or more sanitizers are enabled, but `EmitCheck` requires that exactly one sanitizer is specified in the SanitizerMask (e.g., `SanitizeTrap.has(Checked[i].second)` enforces that). This patch replaces SanitizerMask with SanitizerOrdinal in the `Checked` parameter of `EmitCheck` and code that transitively relies on it. This should not affect the behavior of UBSan, but it has the advantages that: - the code is clearer: it avoids ambiguity in EmitCheck about what to do if multiple bits are set - specifying the wrong number of sanitizers in `Checked[i].second` will be detected as a compile-time error, rather than a runtime assertion failure Suggested by Vitaly in https://github.com/llvm/llvm-project/pull/122392 as an alternative to adding an explicit runtime assertion that the SanitizerMask contains exactly one sanitizer.	2025-01-10 12:40:57 -08:00
Timm Bäder	59bdea24b0	Revert "[clang] Avoid re-evaluating field bitwidth (#117732 )" This reverts commit 81fc3add1e627c23b7270fe2739cdacc09063e54. This breaks some LLDB tests, e.g. SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp: lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.	2025-01-08 15:09:52 +01:00
Timm Baeder	81fc3add1e	[clang] Avoid re-evaluating field bitwidth (#117732 ) Save the bitwidth value as a `ConstantExpr` with the value set. Remove the `ASTContext` parameter from `getBitWidthValue()`, so the latter simply returns the value from the `ConstantExpr` instead of constant-evaluating the bitwidth expression every time it is called.	2025-01-08 14:45:19 +01:00
天音あめ	ca5fd06366	[clang] Fix crashes when passing VLA to va_arg (#119563 ) Closes #119360. This bug occurs when passing a VLA to `va_arg`. Since the return value is inferred to be an array, it triggers `ScalarExprEmitter::VisitCastExpr`, which converts it to a pointer and subsequently calls `CodeGenFunction::EmitAggExpr`. At this point, because the inferred type is an `AggExpr` instead of a `ScalarExpr`, `ScalarExprEmitter::VisitVAArgExpr` is not invoked, and as a result, `CodeGenFunction::EmitVariablyModifiedType` is also not called, leading to the size of the VLA not being retrieved. The solution is to move the call to `CodeGenFunction::EmitVariablyModifiedType` into `CodeGenFunction::EmitVAArg`, ensuring that the size of the VLA is correctly obtained regardless of whether the expression is an `AggExpr` or a `ScalarExpr`.	2025-01-07 07:49:43 -05:00
Sameer Sahasrabuddhe	df67e37e37	[clang][NFC] clean up the handling of convergence control tokens (#121738 )	2025-01-06 21:34:11 +05:30
Brandon Wu	8e7f1bee84	[clang][RISCV] Remove unneeded RISCV tuple code (#121024 ) These code are no longer needed because we've modeled tuple type using target extension type rather than structure of scalable vectors.	2024-12-25 22:48:54 +08:00
Pedro Lobo	f28e52274c	[Clang] Change two placeholders from `undef` to `poison` [NFC] (#119141 ) - Use `poison` instead of `undef` as a phi operand for an unreachable path (the predecessor will not go the BB that uses the value of the phi). - Call `@llvm.vector.insert` with a `poison` subvec when performing a `bitcast` from a fixed vector to a scalable vector.	2024-12-10 15:57:55 +00:00
Kazu Hirata	91d6e10cca	[CodeGen] Migrate away from PointerUnion::{is,get} (NFC) (#118600 ) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.	2024-12-06 01:45:56 -08:00
Sarah Spall	46de3a7064	[HLSL] get inout/out ABI for array parameters working (#111047 ) Get inout/out parameters working for HLSL Arrays. Utilizes the fix from #109323, and corrects the assignment behavior slightly to allow for Non-LValues on the RHS. Closes #106917 --------- Co-authored-by: Chris B <beanz@abolishcrlf.org>	2024-12-03 17:43:36 -08:00
Pedro Lobo	98e747ba56	[clang] Use poison instead of undef as the placeholder when creating a new vector [NFC] (#117064 ) Call `@llvm.vector.insert` with a `poison` vector when coercing a fixed vector to a scalable vector with the same element type.	2024-12-02 09:00:39 +00:00
Benjamin Maxwell	db6f627f3f	[clang][SME] Ignore flatten/clang::always_inline statements for callees with mismatched streaming attributes (#116391 ) If `__attribute__((flatten))` is used on a function, or `[[clang::always_inline]]` on a statement, don't inline any callees with incompatible streaming attributes. Without this check, clang may produce incorrect code when these attributes are used in code with streaming functions. Note: The docs for flatten say it can be ignored when inlining is impossible: "causes calls within the attributed function to be inlined unless it is impossible to do so". Similarly, the (clang-only) `[[clang::always_inline]]` statement attribute is more relaxed than the GNU `__attribute__((always_inline))` (which says it should error it if it can't inline), saying only "If a statement is marked [[clang::always_inline]] and contains calls, the compiler attempts to inline those calls.". The docs also go on to show an example of where `[[clang::always_inline]]` has no effect.	2024-11-26 14:26:34 +00:00
Kazu Hirata	e8a6624325	[CodeGen] Remove unused includes (NFC) (#116459 ) Identified with misc-include-cleaner.	2024-11-16 07:37:13 -08:00
joaosaffran	481bce018e	Adding splitdouble HLSL function (#109331 ) - Adding hlsl `splitdouble` intrinsics - Adding DXIL lowering - Adding SPIRV lowering - Adding test Fixes: #108901 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com>	2024-10-28 13:26:59 -07:00
Momchil Velikov	53f7f8ecca	[Clang][AArch64] Fix Pure Scalables Types argument passing and return (#112747 ) Pure Scalable Types are defined in AAPCS64 here: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#pure-scalable-types-psts And should be passed according to Rule C.7 here: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#682parameter-passing-rules This part of the ABI is completely unimplemented in Clang, instead it treats PSTs sometimes as HFAs/HVAs, sometime as general composite types. This patch implements the rules for passing PSTs by employing the `CoerceAndExpand` method and extending it to: * allow array types in the `coerceToType`; Now only `[N x i8]` are considered padding. * allow mismatch between the elements of the `coerceToType` and the elements of the `unpaddedCoerceToType`; AArch64 uses this to map fixed-length vector types to SVE vector types. Corectly passing a PST argument needs a decision in Clang about whether to pass it in memory or registers or, equivalently, whether to use the `Indirect` or `Expand/CoerceAndExpand` method. It was considered relatively harder (or not practically possible) to make that decision in the AArch64 backend. Hence this patch implements the register counting from AAPCS64 (cf. `NSRN`, `NPRN`) to guide the Clang's decision.	2024-10-28 15:43:14 +00:00
Kiran	a96c14eeb8	[Clang] Always forward sret parameters to musttail calls If a call using the musttail attribute returns it's value through an sret argument pointer, we must forward an incoming sret pointer to it, instead of creating a new alloca. This is always possible because the musttail attribute requires the caller and callee to have the same argument and return types.	2024-10-25 09:34:08 +01:00
Jay Foad	4dd55c567a	[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399 ) Follow up to #109133.	2024-10-24 10:23:40 +01:00
Jonas Paulsson	14120227a3	Target ABI: improve call parameters extensions handling (#100757 ) For the purpose of verifying proper arguments extensions per the target's ABI, introduce the NoExt attribute that may be used by a target when neither sign- or zeroextension is required (e.g. with a struct in register). The purpose of doing so is to be able to verify that there is always one of these attributes present and by this detecting cases where sign/zero extension is actually missing. As a first step, this patch has the verification step done for the SystemZ backend only, but left off by default until all known issues have been addressed. Other targets/front-ends can now also add NoExt attribute where needed and do this check in the backend.	2024-09-19 16:59:31 +02:00
Chris B	89fb8490a9	[HLSL] Implement output parameter (#101083 ) HLSL output parameters are denoted with the `inout` and `out` keywords in the function declaration. When an argument to an output parameter is constructed a temporary value is constructed for the argument. For `inout` pamameters the argument is initialized via copy-initialization from the argument lvalue expression to the parameter type. For `out` parameters the argument is not initialized before the call. In both cases on return of the function the temporary value is written back to the argument lvalue expression through an implicit assignment binary operator with casting as required. This change introduces a new HLSLOutArgExpr ast node which represents the output argument behavior. The OutArgExpr has three defined children: - An OpaqueValueExpr of the argument lvalue expression. - An OpaqueValueExpr of the copy-initialized parameter. - A BinaryOpExpr assigning the first with the value of the second. Fixes #87526 --------- Co-authored-by: Damyan Pepper <damyanp@microsoft.com> Co-authored-by: John McCall <rjmccall@gmail.com>	2024-08-31 10:59:08 -05:00
Kiran	c50d11e6d9	Revert "[ARM] musttail fixes" committed by accident, see #104795 This reverts commit a2088a24dad31ebe44c93751db17307fdbe1f0e2.	2024-08-27 11:17:17 +01:00
Kiran	ad468da038	Revert "Seperate frontend changes, add debug directives, remove redundant stuff from tests" This reverts commit 1a908c6be3317bbbac73e6a6fc52cabefbdebf7d.	2024-08-27 10:46:18 +01:00
Kiran	1a908c6be3	Seperate frontend changes, add debug directives, remove redundant stuff from tests	2024-08-27 10:44:06 +01:00
Kiran	a2088a24da	[ARM] musttail fixes Backend: - Caller and callee arguments no longer have to match, just to take up the same space, as they can be changed before the call - Allowed tail calls if callee and callee both (or neither) use sret, wheras before it would be dissalowed if either used sret - Allowed tail calls if byval args are used - Added debug trace for IsEligibleForTailCallOptimisation Frontend (clang): - Do not generate extra alloca if sret is used with musttail, as the space for the sret is allocated already Change-Id: Ic7f246a7eca43c06874922d642d7dc44bdfc98ec	2024-08-27 10:44:06 +01:00
eddyz87	64e464349b	[BPF] introduce __attribute__((bpf_fastcall)) (#105417 ) This commit introduces attribute bpf_fastcall to declare BPF functions that do not clobber some of the caller saved registers (R0-R5). The idea is to generate the code complying with generic BPF ABI, but allow compatible Linux Kernel to remove unnecessary spills and fills of non-scratched registers (given some compiler assistance). For such functions do register allocation as-if caller saved registers are not clobbered, but later wrap the calls with spill and fill patterns that are simple to recognize in kernel. For example for the following C code: #define __bpf_fastcall __attribute__((bpf_fastcall)) void bar(void) __bpf_fastcall; void buz(long i, long j, long k); void foo(long i, long j, long k) { bar(); buz(i, j, k); } First allocate registers as if: foo: call bar # note: no spills for i,j,k (r1,r2,r3) call buz exit And later insert spills fills on the peephole phase: foo: (u64 )(r10 - 8) = r1; # Such call pattern is (u64 )(r10 - 16) = r2; # correct when used with (u64 )(r10 - 24) = r3; # old kernels. call bar r3 = (u64 )(r10 - 24); # But also allows new r2 = (u64 )(r10 - 16); # kernels to recognize the r1 = (u64 )(r10 - 8); # pattern and remove spills/fills. call buz exit The offsets for generated spills/fills are picked as minimal stack offsets for the function. Allocated stack slots are not used for any other purposes, in order to simplify in-kernel analysis.	2024-08-22 03:40:56 +03:00
Vassil Vassilev	6c62ad446b	[clang-repl] [codegen] Reduce the state in TBAA. NFC for static compilation. (#98138 ) In incremental compilation clang works with multiple `llvm::Module`s. Our current approach is to create a CodeGenModule entity for every new module request (via StartModule). However, some of the state such as the mangle context needs to be preserved to keep the original semantics in the ever-growing TU. Fixes: llvm/llvm-project#95581. cc: @jeaye	2024-08-21 07:22:31 +02:00
Eduard Zingerman	5e8f4618ce	Revert "[BPF] introduce `__attribute__((bpf_fastcall))` (#101228 )" This reverts commit e9b2e16dc98345bb1b91b1a6dacb3cec85f49e31. Reverting because of the test failure: https://lab.llvm.org/buildbot/#/builders/187/builds/509	2024-08-19 11:30:27 -07:00
eddyz87	e9b2e16dc9	[BPF] introduce `__attribute__((bpf_fastcall))` (#101228 ) This commit introduces attribute bpf_fastcall to declare BPF functions that do not clobber some of the caller saved registers (R0-R5). The idea is to generate the code complying with generic BPF ABI, but allow compatible Linux Kernel to remove unnecessary spills and fills of non-scratched registers (given some compiler assistance). For such functions do register allocation as-if caller saved registers are not clobbered, but later wrap the calls with spill and fill patterns that are simple to recognize in kernel. For example for the following C code: #define __bpf_fastcall __attribute__((bpf_fastcall)) void bar(void) __bpf_fastcall; void buz(long i, long j, long k); void foo(long i, long j, long k) { bar(); buz(i, j, k); } First allocate registers as if: foo: call bar # note: no spills for i,j,k (r1,r2,r3) call buz exit And later insert spills fills on the peephole phase: foo: (u64 )(r10 - 8) = r1; # Such call pattern is (u64 )(r10 - 16) = r2; # correct when used with (u64 )(r10 - 24) = r3; # old kernels. call bar r3 = (u64 )(r10 - 24); # But also allows new r2 = (u64 )(r10 - 16); # kernels to recognize the r1 = (u64 )(r10 - 8); # pattern and remove spills/fills. call buz exit The offsets for generated spills/fills are picked as minimal stack offsets for the function. Allocated stack slots are not used for any other purposes, in order to simplify in-kernel analysis. Corresponding functionality had been merged in Linux Kernel as [this](https://lore.kernel.org/bpf/172179364482.1919.9590705031832457529.git-patchwork-notify@kernel.org/) patch set (the patch assumed that `no_caller_saved_regsiters` attribute would be used by LLVM, naming does not matter for the Kernel).	2024-08-19 19:49:11 +03:00
Daniel Kiss	9e9fa00dcb	[Arm][AArch64][Clang] Respect function's branch protection attributes. (#101978 ) Default attributes assigned to all functions according to the command line parameters. Some functions might have their own attributes and we need to set or remove attributes accordingly. Tests are updated to test this scenarios too.	2024-08-09 17:51:38 +02:00

1 2 3 4 5 ...

1235 Commits