llvm-project

Author	SHA1	Message	Date
Thurston Dang	928cad49be	Revert "[ubsan] Connect -fsanitize-skip-hot-cutoff to LowerAllowCheckPass<cutoffs>" (#125032 ) Reverts llvm/llvm-project#124857 due to buildbot breakage (https://lab.llvm.org/buildbot/#/builders/46/builds/11310)	2025-01-29 22:03:05 -08:00
Thurston Dang	dccd271127	[ubsan] Connect -fsanitize-skip-hot-cutoff to LowerAllowCheckPass<cutoffs> (#124857 ) This adds the plumbing between -fsanitize-skip-hot-cutoff (introduced in https://github.com/llvm/llvm-project/pull/121619) and LowerAllowCheckPass<cutoffs> (introduced in https://github.com/llvm/llvm-project/pull/124211). The net effect is that -fsanitize-skip-hot-cutoff now combines the functionality of -ubsan-guard-checks and -lower-allow-check-percentile-cutoff (though this patch does not remove those yet), and generalizes the latter to allow per-sanitizer cutoffs. Note: this patch replaces Intrinsic::allow_ubsan_check's SanitizerHandler parameter with SanitizerOrdinal; this is necessary because the hot cutoffs are specified in terms of SanitizerOrdinal (e.g., null, alignment), not SanitizerHandler (e.g., TypeMismatch). Likewise, CodeGenFunction::EmitCheck is changed to emit allow_ubsan_check() for each individual check. --------- Co-authored-by: Vitaly Buka <vitalybuka@gmail.com> Co-authored-by: Vitaly Buka <vitalybuka@google.com>	2025-01-29 21:03:26 -08:00
Jason Rice	abc8812df0	[Clang][P1061] Add structured binding packs (#121417 ) This is an implementation of P1061 Structure Bindings Introduce a Pack without the ability to use packs outside of templates. There are a couple of ways the AST could have been sliced so let me know what you think. The only part of this change that I am unsure of is the serialization/deserialization stuff. I followed the implementation of other Exprs, but I do not really know how it is tested. Thank you for your time considering this. --------- Co-authored-by: Yanzuo Liu <zwuis@outlook.com>	2025-01-29 21:43:52 +01:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Oliver Stannard	e9c2e0acd7	[AArch64] Match GCC behaviour for zero-size structs (#124760 ) We had a test claiming that this empty struct type consumes a register slot when passing it to a function with GCC, but that does not appear to be the case, at least with GCC versions going back to 4.8. This also caused a miscompilation when passing one of these structs to a variadic function, but it turned out that our implementation of `va_arg` matches GCC's ABI, so the one change fixes both bugs.	2025-01-29 15:02:37 +00:00
nerix	381218950e	[clang-cl]: generate debug info when `novtable` is specified (#124643 ) When no vtable is emitted in the debug info because a record was marked `__declspec(novtable)`, only a forward declaration of that type will be emitted. This PR fixes that by not omitting the definition for the `RecordDecl` in this case. Fixes #124638.	2025-01-28 15:02:33 -08:00
Stephen Tozer	822f74a911	[Clang] Cleanup docs and comments relating to -fextend-variable-liveness (#124767 ) This patch contains a number of changes relating to the above flag; primarily it updates comment references to the old flag names, "-fextend-lifetimes" and "-fextend-this-ptr" to refer to the new names, "-fextend-variable-liveness[={all,this}]". These changes are all NFC. This patch also removes the explicit -fextend-this-ptr-liveness flag alias, and shortens the help-text for the main flag; these are both changes that were meant to be applied in the initial PR (#110000), but due to some user-error on my part they were not included in the merged commit.	2025-01-28 18:25:32 +00:00
Joseph Huber	13dcc95dcd	[Offload] Rework offloading entry type to be more generic (#124018 ) Summary: The previous offloading entry type did not fit the current use-cases very well. This widens it and adds a version to prevent further annoyances. It also includes the kind to better sort who's using it. The first 64 bytes are reserved as zero so the OpenMP runtime can detect the old format for binary compatibility.	2025-01-28 07:26:13 -06:00
Stephen Tozer	8ad9e1ecb7	[Clang] Fix use of deprecated method and missing triple Fixes two buildbot errors caused by 4424c44c (#110102): The first error, seen on some sanitizer bots: https://lab.llvm.org/buildbot/#/builders/51/builds/9901 The initial commit used the deprecated getDeclaration intrinsic instead of the non-deprecated getOrInsert- equivalent. This patch trivially updates the code in question to use the new intrinsic. The second error, seen on the clang-armv8-quick bot: https://lab.llvm.org/buildbot/#/builders/154/builds/10983 One of the tests depends on a particular triple to get the exact output expected by the test, but did not specify this triple; this patch adds the triple in question.	2025-01-28 13:21:41 +00:00
Wolfgang Pieb	4424c44c8c	[Clang] Add fake use emission to Clang with -fextend-lifetimes (#110102 ) Following the previous patch which adds the "extend lifetimes" flag without (almost) any functionality, this patch adds the real feature by allowing Clang to emit fake uses. These are emitted as a new form of cleanup, set for variable addresses, which just emits a fake use intrinsic when the variable falls out of scope. The code for achieving this is simple, with most of the logic centered on determining whether to emit a fake use for a given address, and on ensuring that fake uses are ignored in a few cases. Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2025-01-28 12:30:31 +00:00
Nikita Popov	1295aa2e81	[Clang] Add -fwrapv-pointer flag (#122486 ) GCC supports three flags related to overflow behavior: * `-fwrapv`: Makes signed integer overflow well-defined. * `-fwrapv-pointer`: Makes pointer overflow well-defined. * `-fno-strict-overflow`: Implies `-fwrapv -fwrapv-pointer`, making both signed integer overflow and pointer overflow well-defined. Clang currently only supports `-fno-strict-overflow` and `-fwrapv`, but not `-fwrapv-pointer`. This PR proposes to introduce `-fwrapv-pointer` and adjust the semantics of `-fwrapv` to match GCC. This allows signed integer overflow and pointer overflow to be controlled independently, while `-fno-strict-overflow` still exists to control both at the same time (and that option is consistent across GCC and Clang).	2025-01-28 09:57:00 +01:00
Adam Yang	aab25f20f6	[HLSL][SPIRV][DXIL] Implement `WaveActiveMax` intrinsic (#123428 ) ``` - add clang builtin to Builtins.td - link builtin in hlsl_intrinsics - add codegen for spirv intrinsic and two directx intrinsics to retain signedness information of the operands in CGBuiltin.cpp - add semantic analysis in SemaHLSL.cpp - add lowering of spirv intrinsic to spirv backend in SPIRVInstructionSelector.cpp - add lowering of directx intrinsics to WaveActiveOp dxil op in DXIL.td - add test cases to illustrate passespendent pr merges. ``` Resolves #99170	2025-01-27 23:26:56 -08:00
Momchil Velikov	f75860f895	[AArch64] Implement NEON FP8 intrinsics for fused multiply-add (#123615 ) This patch adds the following intrinsics: * Fused multiply-add non-indexed float16x8_t vmlalbq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t, fpm_t) float16x8_t vmlaltq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t, fpm_t) float32x4_t vmlallbbq_f32_mf8_fpm(float32x4_t, mfloat8x16_t, mfloat8x16_t, fpm_t) float32x4_t vmlallbtq_f32_mf8_fpm(float32x4_t, mfloat8x16_t, mfloat8x16_t, fpm_t) float32x4_t vmlalltbq_f32_mf8_fpm(float32x4_t, mfloat8x16_t, mfloat8x16_t, fpm_t) float32x4_t vmlallttq_f32_mf8_fpm(float32x4_t, mfloat8x16_t, mfloat8x16_t, fpm_t) * Floating-point multiply-add long to half-precision (vector, by element) float16x8_t vmlalbq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float16x8_t vmlalbq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm) float16x8_t vmlaltq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float16x8_t vmlaltq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm) * Floating-point multiply-add long-long to single-precision (vector, by element) float32x4_t vmlallbbq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vmlallbbq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vmlallbtq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vmlallbtq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vmlalltbq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vmlalltbq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vmlallttq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vmlallttq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)	2025-01-28 00:38:44 +00:00
Momchil Velikov	804b81d39f	[AArch64] Add FP8 Neon intrinsics for dot-product (#123613 ) This patch adds the following intrinsics: float16x4_t vdot_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn, mfloat8x8_t vm, fpm_t fpm) float16x8_t vdotq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn, mfloat8x16_t vm, fpm_t fpm) float16x4_t vdot_lane_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float16x4_t vdot_laneq_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm) float16x8_t vdotq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float16x8_t vdotq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x2_t vdot_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn, mfloat8x8_t vm, fpm_t fpm) float32x4_t vdotq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x16_t vm, fpm_t fpm) float32x2_t vdot_lane_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x2_t vdot_laneq_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vdotq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm) float32x4_t vdotq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn, mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)	2025-01-27 21:14:16 +00:00
Jeremy Morse	285009f202	[NFC][DebugInfo] Rewrite more call-sites to insert with iterators (#124288 ) As part of the "RemoveDIs" work to eliminate debug intrinsics, we're replacing methods that use Instruction*'s as positions with iterators. The call-sites updated in this patch are those where the dyn_cast_or_null cast utility doesn't compose well with iterator insertion. It can distinguish between nullptr and a "present" (non-null) Instruction pointer, but not between a legal and illegal instruction iterator. This can lead to end-iterator dereferences and thus crashes. We can improve this in the future (as parent-pointers can now be accessed from ilist nodes), but for the moment, add explicit tests for end() iterators at the five call sites affected by this.	2025-01-27 20:30:45 +00:00
cor3ntin	a85b2dc45a	[Clang] only inherit the parent eval context inside of lambdas (#124426 ) As we create default constructors lazily, we should not inherit from the parent evaluation context. However, we need to make an exception for lambdas (in particular their conversion operators, which are also implicitly defined). As a drive-by, we introduce a generic way to query whether a function is a member of a lambda. This fixes a regression introduced by baf6bd3. Fixes #118000	2025-01-27 21:30:29 +01:00
Momchil Velikov	99bd2e3f12	[AArch64] Add Neon FP8 conversion intrinsics (#123612 ) The patch adds the following intrinsics: bfloat16x8_t vcvt1_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) bfloat16x8_t vcvt1_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt2_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) bfloat16x8_t vcvt2_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt1_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) bfloat16x8_t vcvt2_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt1_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) float16x8_t vcvt1_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt2_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm) float16x8_t vcvt2_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt1_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) float16x8_t vcvt2_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm) mfloat8x8_t vcvt_mf8_f32_fpm(float32x4_t vn, float32x4_t vm, fpm_t fpm) mfloat8x16_t vcvt_high_mf8_f32_fpm(mfloat8x8_t vd, float32x4_t vn, float32x4_t vm, fpm_t fpm) mfloat8x8_t vcvt_mf8_f16_fpm(float16x4_t vn, float16x4_t vm, fpm_t fpm) mfloat8x16_t vcvtq_mf8_f16_fpm(float16x8_t vn, float16x8_t vm, fpm_t fpm) Co-Authored-By: Caroline Concatto <caroline.concatto@arm.com>	2025-01-27 17:32:47 +00:00
Jeremy Morse	e14962a39c	[NFC][DebugInfo] Use iterators for instruction insertion in more places (#124291 ) As part of the "RemoveDIs" work to eliminate debug intrinsics, we're replacing methods that use Instruction*'s as positions with iterators. This patch changes some more complex call-sites, those crossing file boundaries and where I've had to perform some minor rewrites.	2025-01-27 15:25:17 +00:00
Momchil Velikov	f95a8bde34	[AArch64] Refactor implementation of FP8 types (NFC) (#123604 ) - The FP8 scalar type (`__mfp8`) was described as a vector type - The FP8 vector types were described/assumed to have integer element type (the element type ought to be `__mfp8`) - Add support for `m` type specifier (denoting `__mfp8`) in `DecodeTypeFromStr` and create builtin function prototypes using that specifier, instead of `int8_t`	2025-01-27 14:31:41 +00:00
Momchil Velikov	87103a016f	[AArch64] Implement NEON FP8 vectors as VectorType (#123603 ) Reimplement Neon FP8 vector types using attribute `neon_vector_type` instead of having them as builtin types. This allows implementing FP8 Neon intrinsics without the need to add special cases for these types when using `__builtin_shufflevector` or bitcast (using C-style cast operator) between vectors, both extensively used in the generated code in `arm_neon.h`.	2025-01-27 10:41:53 +00:00
Helena Kotas	d92bac8a3e	[HLSL] Introduce address space `hlsl_constant(2)` for constant buffer declarations (#123411 ) Introduces a new address space `hlsl_constant(2)` for constant buffer declarations. This address space is applied to declarations inside `cbuffer` block. Later on, it will also be applied to `ConstantBuffer<T>` syntax and the default `$Globals` constant buffer. Clang codegen translates constant buffer declarations to global variables and loads from `hlsl_constant(2)` address space. More work coming soon will include addition of metadata that will map these globals to individual constant buffers and enable their transformation to appropriate constant buffer load intrinsics later on in an LLVM pass. Fixes #123406	2025-01-24 16:48:35 -08:00
Jeremy Morse	6292a808b3	[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737 ) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety, we'd like to have all calls to getFirstNonPHI use the iterator-returning version. This patch changes a bunch of call-sites calling getFirstNonPHI to use getFirstNonPHIIt, which returns an iterator. All these call sites are where it's obviously safe to fetch the iterator then dereference it. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer getFirstNonPHI, but not before adding concise documentation of what considerations are needed (very few). --------- Co-authored-by: Stephen Tozer <Melamoto@gmail.com>	2025-01-24 13:27:56 +00:00
Jeremy Morse	8e70273509	[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583 ) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).	2025-01-24 10:53:11 +00:00
Phoebe Wang	ee2722fc88	[X86][AVX10.2-BF16] Remove [NE]P from intrinsic and instruction name (#123335 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2025-01-24 15:49:28 +08:00
Kazu Hirata	113e1fdc8c	[CodeGen] Migrate away from PointerUnion::dyn_cast (NFC) (#124076 ) Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect Pos to be nonnull.	2025-01-23 08:45:26 -08:00
Finn Plummer	0fe8e70c66	Revert "Reland "[HLSL] Implement the `reflect` HLSL function"" (#124046 ) Reverts llvm/llvm-project#123853 The introduction of `reflect-error.ll` surfaced a bug with the use of `report_fatal_error` in `SPIRVInstructionSelector` that was propagated into the pr. This has caused a build-bot breakage, and the work to solve the underlying issue is tracked here: https://github.com/llvm/llvm-project/issues/124045. We can re-apply this commit when the underlying issue is resolved.	2025-01-22 18:22:03 -08:00
Ben Langmuir	9fbf5cfebc	[clang][modules] Partially revert 48d0eb518 to fix -gmodules output (#124003 ) With the changes in 48d0eb518, the CodeGenOptions used to emit .pcm files with -fmodule-format=obj (-gmodules) were the ones from the original invocation, rather than the ones specifically crafted for outputting the pcm. This was causing the pcm to be written with only the debug info and without the __clangast section in some cases (e.g. -O2). This unfortunately was not covered by existing tests, because compiling and loading a module within a single compilation loads the ast content from the in-memory module cache rather than reading it from the pcm file that was written. This broke bootstrapping a build of clang with modules enabled on Darwin. rdar://143418834	2025-01-22 16:24:56 -08:00
Tom Honermann	8fb42300a0	[SYCL] AST support for SYCL kernel entry point functions. (#122379 ) A SYCL kernel entry point function is a non-member function or a static member function declared with the `sycl_kernel_entry_point` attribute. Such functions define a pattern for an offload kernel entry point function to be generated to enable execution of a SYCL kernel on a device. A SYCL library implementation orchestrates the invocation of these functions with corresponding SYCL kernel arguments in response to calls to SYCL kernel invocation functions specified by the SYCL 2020 specification. The offload kernel entry point function (sometimes referred to as the SYCL kernel caller function) is generated from the SYCL kernel entry point function by a transformation of the function parameters followed by a transformation of the function body to replace references to the original parameters with references to the transformed ones. Exactly how parameters are transformed will be explained in a future change that implements non-trivial transformations. For now, it suffices to state that a given parameter of the SYCL kernel entry point function may be transformed to multiple parameters of the offload kernel entry point as needed to satisfy offload kernel argument passing requirements. Parameters that are decomposed in this way are reconstituted as local variables in the body of the generated offload kernel entry point function. 
For example, given the following SYCL kernel entry point function definition: ``` template<typename KernelNameType, typename KernelType> [[clang::sycl_kernel_entry_point(KernelNameType)]] void sycl_kernel_entry_point(KernelType kernel) { kernel(); } ``` and the following call: ``` struct Kernel { int dm1; int dm2; void operator()() const; }; Kernel k; sycl_kernel_entry_point<class kernel_name>(k); ``` the corresponding offload kernel entry point function that is generated might look as follows (assuming `Kernel` is a type that requires decomposition): ``` void offload_kernel_entry_point_for_kernel_name(int dm1, int dm2) { Kernel kernel{dm1, dm2}; kernel(); } ``` Other details of the generated offload kernel entry point function, such as its name and calling convention, are implementation details that need not be reflected in the AST and may differ across target devices. For that reason, only the transformation described above is represented in the AST; other details will be filled in during code generation. These transformations are represented using new AST nodes introduced with this change. `OutlinedFunctionDecl` holds a sequence of `ImplicitParamDecl` nodes and a sequence of statement nodes that correspond to the transformed parameters and function body. `SYCLKernelCallStmt` wraps the original function body and associates it with an `OutlinedFunctionDecl` instance. 
For the example above, the AST generated for the `sycl_kernel_entry_point<kernel_name>` specialization would look as follows: ``` FunctionDecl 'sycl_kernel_entry_point<kernel_name>(Kernel)' TemplateArgument type 'kernel_name' TemplateArgument type 'Kernel' ParmVarDecl kernel 'Kernel' SYCLKernelCallStmt CompoundStmt <original statements> OutlinedFunctionDecl ImplicitParamDecl 'dm1' 'int' ImplicitParamDecl 'dm2' 'int' CompoundStmt VarDecl 'kernel' 'Kernel' <initialization of 'kernel' with 'dm1' and 'dm2'> <transformed statements with redirected references of 'kernel'> ``` Any ODR-use of the SYCL kernel entry point function will (with future changes) suffice for the offload kernel entry point to be emitted. An actual call to the SYCL kernel entry point function will result in a call to the function. However, evaluation of a `SYCLKernelCallStmt` statement is a no-op, so such calls will have no effect other than to trigger emission of the offload kernel entry point. Additionally, as a related change inspired by code review feedback, these changes disallow use of the `sycl_kernel_entry_point` attribute with functions defined with a _function-try-block_. The SYCL 2020 specification prohibits the use of C++ exceptions in device functions. Even if exceptions were not prohibited, it is unclear what the semantics would be for an exception that escapes the SYCL kernel entry point function; the boundary between host and device code could be an implicit noexcept boundary that results in program termination if violated, or the exception could perhaps be propagated to host code via the SYCL library. Pending support for C++ exceptions in device code and clear semantics for handling them at the host-device boundary, this change makes use of the `sycl_kernel_entry_point` attribute with a function defined with a _function-try-block_ an error.	2025-01-22 16:39:08 -05:00
Deric Cheung	2656928d0c	Reland "[HLSL] Implement the `reflect` HLSL function" (#123853 ) This PR relands [#122992](https://github.com/llvm/llvm-project/pull/122992). Some machines were failing to run the `reflect-error.ll` test due to the RUN lines ```llvm ; RUN: not %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o /dev/null 2>&1 -filetype=obj %} ; RUN: not %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o /dev/null 2>&1 -filetype=obj %} ``` which failed when `spirv-tools` was not present on the machine due to running the command `not` without any arguments. These RUN lines have been removed since they don't actually test anything new compared to the other two RUN lines due to the expected error during instruction selection. ```llvm ; RUN: not llc -verify-machineinstrs -O0 -mtriple=spirv64-unknown-unknown %s -o /dev/null 2>&1 \| FileCheck %s ; RUN: not llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown %s -o /dev/null 2>&1 \| FileCheck %s ```	2025-01-22 13:29:19 -08:00
Helena Kotas	719f0d9253	[HLSL] Fix global resource initialization (#123394 ) Create separate resource initialization function for each resource and add them to CodeGenModule's `CXXGlobalInits` list. Fixes #120636 and addresses this [comment](https://github.com/llvm/llvm-project/pull/119755/files#r1894093603).	2025-01-22 12:39:35 -08:00
Andy Kaylor	7bf188fa99	[NFC] Minor fix to tryEmitAbstract type in EmitCXXNewAllocSize (#123433 ) In EmitCXXNewAllocSize, when handling a constant array size, we were calling tryEmitAbstract with the type of the object being allocated rather than the expected type of the array size. This worked out because the allocated type was always a pointer and tryEmitAbstract only ends up using the size of the type to extend or truncate the constant, and in this case the destination type should be size_t, which is usually the same width as the pointer. This change fixes the type, but it makes no functional difference with the current constant emitter implementation.	2025-01-22 10:11:53 -08:00
Thurston Dang	2476417232	Reapply "[sanitizer][NFCI] Add Options parameter to LowerAllowCheckPass" (#122833 ) (#122994 ) This reverts commit 1515caf7a59dc20cb932b724b2ef5c1d1a593427 (https://github.com/llvm/llvm-project/pull/122833) i.e., relands 7d8b4eb0ead277f41ff69525ed807f9f6e227f37 (https://github.com/llvm/llvm-project/pull/122765), with LowerAllowCheckPass::Options moved inside the callback to fix a stack use-after-scope error. --------- Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>	2025-01-22 09:32:12 -08:00
Joseph Huber	70a16b90ff	[HIP] Support managed variables using the new driver (#123437 ) Summary: Previously, managed variables didn't work in rdc mode using the new driver because we just didn't register them. This was previously ignored because we didn't have enough space in the current struct format. This patch amends that by just emitting a struct pair for the two variables and using the single pointer. In the future, a more extensible entry format would be nice, but that can be done later.	2025-01-22 09:13:14 -06:00
Oliver Stannard	c4ef805b0b	[Clang] Re-write codegen for atomic_test_and_set and atomic_clear (#121943 ) Re-write the sema and codegen for the atomic_test_and_set and atomic_clear builtin functions to go via AtomicExpr, like the other atomic builtins do. This simplifies the code, because AtomicExpr already handles things like generating code to dynamically select the memory ordering, which was duplicated for these builtins. This also fixes a few crash bugs, one when passing an integer to the pointer argument, and one when using an array. This also adds diagnostics for the memory orderings which are not valid for atomic_clear according to https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html, which were missing before. Fixes https://github.com/llvm/llvm-project/issues/111293. This is a re-land of #120449, modified to allow any non-const pointer type for the first argument.	2025-01-22 10:48:04 +00:00
schittir	65df99c208	[NFC] Avoid potential nullptr deref by using castAs<> (#123395 ) Use castAs<> instead of getAs<>	2025-01-21 21:41:37 -08:00
Finn Plummer	4c91263045	Revert "[HLSL] Implement the `reflect` HLSL function" (#123846 ) Reverts llvm/llvm-project#122992 Due to an included failing test-case the commit causes build failures.	2025-01-21 15:12:58 -08:00
Deric Cheung	dd860bcfb5	[HLSL] Implement the `reflect` HLSL function (#122992 ) Fixes #99152 Tasks completed: - Implement `reflect` in `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Implement the `reflect` SPIR-V target built-in in `clang/include/clang/Basic/BuiltinsSPIRV.td` - Add a SPIR-V fast path in `clang/lib/Headers/hlsl/hlsl_detail.h` in the form ```c++ #if (__has_builtin(__builtin_spirv_reflect)) return __builtin_spirv_reflect(...); #else return ...; // regular behavior #endif ``` - Add codegen for the SPIR-V `reflect` built-in to `EmitSPIRVBuiltinExpr` in `clang/lib/CodeGen/CGBuiltin.cpp` - Add HLSL codegen tests to `clang/test/CodeGenHLSL/builtins/reflect.hlsl` - Add SPIR-V built-in codegen tests to `clang/test/CodeGenSPIRV/Builtins/reflect.c` - Add sema tests to `clang/test/SemaHLSL/BuiltIns/reflect-errors.hlsl` - Add SPIR-V sema tests to `clang/test/CodeGenSPIRV/Builtins/reflect-errors.c` - Create the `int_spv_reflect` intrinsic in `llvm/include/llvm/IR/IntrinsicsSPIRV.td` - In `llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp` create the `reflect` lowering and map it to `int_spv_reflect` in `SPIRVInstructionSelector::selectIntrinsic` - Create a SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/reflect.ll` Additional tasks completed: - Implement sema check for the `reflect` SPIR-V built-in in `clang/lib/Sema/SemaSPIRV.cpp` - Required for HLSL codegen to work via the SPIR-V fast path, because the types defined in `clang/include/clang/Basic/BuiltinsSPIRV.td` are being overridden - Create SPIR-V backend error test case in `llvm/test/CodeGen/SPIRV/opencl/reflect-error.ll` - Since `reflect` is only available in the GLSL extended instruction set, using it in OpenCL should result in an error Incomplete tasks: - Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/opencl/reflect.ll` - An OpenCL test is not applicable in this case because the [OpenCL SPIR-V extended instruction set](https://registry.khronos.org/SPIR-V/specs/unified1/OpenCL.ExtendedInstructionSet.100.html) does not include a `reflect` function	2025-01-21 14:30:29 -08:00
Shilei Tian	59dffce8c8	[FIX] Include `<numeric>` in `clang/lib/CodeGen/CGExpr.cpp` It uses `std::iota` but the header was not included.	2025-01-21 09:31:35 -05:00
Shilei Tian	03744d2aaf	[Clang] Remove 3-element vector load and store special handling (#104661 ) Clang uses a long-time special handling of the case where 3 element vector loads and stores are performed as 4 element, and then a shufflevector is used to extract the used elements. Odd sized vector codegen should now work reasonably well. This patch removes the compiler argument `-fpreserve-vec3-type` and adds a target hook to determine if the special handling of vector type is needed. --------- Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>	2025-01-21 09:18:16 -05:00
David Green	6dc356d698	[Clang] Add numeric for iota. Hopefully fixes MSVC build after 547bfda56b2e3f3a4c6d2357d3566dcd3fa996ad.	2025-01-21 10:36:58 +00:00
Sergey Kozub	616979ebd7	[NVPTX] Add support for PTX 8.6 and CUDA 12.6 (12.8) (#123398 ) Add CUDA versions 12.7, 12.8, 12.9 which support PTX8.6+ (enables using Blackwell-specific instructions).	2025-01-21 11:00:24 +01:00
David Green	547bfda56b	[AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (#120363 ) This started out as trying to combine bf16 fpround to BFCVT2 instructions, but ended up removing the aarch64.neon.bfcvt intrinsics in favour of generating fpround instructions directly. This simplifies the patterns and can lead to other optimizations. The BFCVT2 instruction is adjusted to make sure the types are valid, and a bfcvt2 is now generated in more places. The old intrinsics are auto-upgraded to fptrunc instructions too.	2025-01-21 09:16:04 +00:00
Ulrich Weigand	8424bf207e	[SystemZ] Add support for new cpu architecture - arch15 This patch adds support for the next-generation arch15 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Detection of arch15 as host processor. - Assembler/disassembler support for new instructions. - Exploitation of new instructions for code generation. - New vector (signed\|unsigned\|bool) __int128 data types. - New LLVM intrinsics for certain new instructions. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10305. Note: No currently available Z system supports the arch15 architecture. Once new systems become available, the official system name will be added as supported -march name.	2025-01-20 19:30:21 +01:00
Hervé Poussineau	71d6287f5b	[Clang][MIPS] Create correct linker arguments for Windows toolchains (#121041 )	2025-01-20 15:11:26 +08:00
Michael Buch	a5fb2bbb2a	Reapply "[clang][DebugInfo] Emit DW_AT_object_pointer on function declarations with explicit `this`" (#123455 ) This reverts commit c3a935e3f967f8f22f5db240d145459ee621c1e0. The only change to the reverted commit is that this also updates the OCaml bindings according to the C debug-info API changes. The build failure originally introduced was: ``` FAILED: bindings/ocaml/debuginfo/debuginfo_ocaml.o /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.o cd /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo && /usr/bin/ocamlfind ocamlc -c /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.c -ccopt "-I/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/bindings/ocaml/debuginfo/../llvm -D_GNU_SOURCE -D_DEBUG -D_GLIBCXX_ASSERTIONS -DEXPENSIVE_CHECKS -D_GLIBCXX_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/b/1/llvm-clang-x86_64-expensive-checks-debian/build/include -I/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/include -DNDEBUG " /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.c: In function ‘llvm_dibuild_create_object_pointer_type’: /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.c:620:30: error: too few arguments to function ‘LLVMDIBuilderCreateObjectPointerType’ 620 \| LLVMMetadataRef Metadata = LLVMDIBuilderCreateObjectPointerType( \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.c:23: /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/include/llvm-c/DebugInfo.h:880:17: note: declared here 880 \| LLVMMetadataRef LLVMDIBuilderCreateObjectPointerType(LLVMDIBuilderRef Builder, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ```	2025-01-18 18:03:41 +00:00
Michał Górny	c3a935e3f9	Revert "[clang][DebugInfo] Emit DW_AT_object_pointer on function declarations with explicit `this`" (#123455 ) Reverts llvm/llvm-project#122928	2025-01-18 07:59:30 +00:00
Michael Buch	10fdd09c3b	[clang][DebugInfo] Emit DW_AT_object_pointer on function declarations with explicit `this` (#122928 ) In https://github.com/llvm/llvm-project/pull/122897 we started attaching `DW_AT_object_pointer` to function definitions. This patch does the same but for function declarations (which we do for implicit object pointers already). Fixes https://github.com/llvm/llvm-project/issues/120974	2025-01-17 19:51:14 +00:00
Farzon Lotfi	eddeb36cf1	[SPIRV] add pre legalization instruction combine (#122839 ) - Add the boilerplate to support instcombine in SPIRV - instcombine length(X-Y) to distance(X,Y) - switch HLSL's distance intrinsic to not special case for SPIRV. - fixes #122766 - In this RFC we were asked to add the infra for pattern matching: https://discourse.llvm.org/t/rfc-add-targetbuiltins-for-spirv-to-support-hlsl/83329/13	2025-01-17 14:46:14 -05:00
Michael Buch	30e276d06d	[clang][PCH] Don't try to create standalone debug-info for types marked nodebug (#123253 ) Fixes one of the crashes uncovered by https://github.com/llvm/llvm-project/pull/118710 `getOrCreateStandaloneType` asserts that a `DIType` was created for the requested type. If the `Decl` was marked `nodebug`, however, we can't generate debug-info for it, so we would previously trigger the assert. For now keep the assertion around and check the `nodebug` at the callsite.	2025-01-17 09:35:02 +00:00
Florian Mayer	a98df67614	[NFC] [BoundsSan] use structured bindings (#123228 ) This slightly simplifies the code.	2025-01-16 14:00:42 -08:00

17625 Commits