llvm-project

Author	SHA1	Message	Date
Daniel Thornburgh	fecf609998	Reland "[LTO][LLD] Prevent invalid LTO libfunc transforms (#164916 )" (#190642 ) This reverts commit 1ec7e86b3a779df2a0af3f37e58c8f5b3a398d7f after issue #190072 was fixed.	2026-04-06 19:20:45 +00:00
Timm Baeder	66483dfe34	[clang][AST][NFC] Add default value to `Expr::isConstantInitializer()` parameter (#190313 ) Almost every caller passes `false` for `ForRef`, or rather, doesn't care what the value is. Use a default value instead.	2026-04-05 06:54:06 +02:00
Eli Friedman	9471fabf8a	[clang] Fix issues with const/pure on varargs function. (#190252 ) There are two related issues here. On the declaration/definition side, we need to make sure the markings are conservative. Then on the caller side, we need to make sure we don't access parameters that don't exist. Fixes #187535.	2026-04-03 13:57:35 -07:00
Florian Hahn	6476619f30	[Matrix] Use matrix element type for TBAA nodes. (#190029 ) Matrix loads and stores are accesses of their element types. Emit TBAA nodes using their element type to allow more precise TBAA alias analysis. PR: https://github.com/llvm/llvm-project/pull/190029	2026-04-03 20:11:04 +00:00
Amr Hesham	2108252f0e	[clang] Fixed a crash when explicitly casting to atomic complex (#172163 ) Fixed a crash when explicitly casting a scalar to an atomic complex. resolve: #114885	2026-04-03 19:28:20 +02:00
theRonShark	00aede8f19	Revert "[Clang][OpenMP] Implement Loop splitting `#pragma omp split` directive " (#190335 ) Reverts llvm/llvm-project#183261 15 new lit tests failing in openmp	2026-04-03 12:27:07 +00:00
Amit Tiwari	1972cf64fd	[Clang][OpenMP] Implement Loop splitting `#pragma omp split` directive (#183261 ) OpenMP 6.0 Loop-splitting directive `#pragma omp split` construct with `counts` clause	2026-04-03 10:42:31 +05:30
Weibo He	bc11c85b6b	[clang][CodeGen] Emit coro.dead intrinsic to improve coroutine allocation elision (#190295 ) Part 4/4: Implement HALO for coroutines that flow off final suspend. Parent PR: #185336	2026-04-03 02:06:10 +00:00
Amr Hesham	f2dff15995	[clang] Fixed a crash when explicitly casting between atomic complex types (#172210 ) Fixed a crash when explicitly casting between atomic complex types resolve: #172208	2026-04-02 22:55:43 +02:00
Steven Perron	6331bfa41a	[HLSL] Add GetDimensions to Texture2D. (#189991 ) This commit adds the GetDimensions methods to Texture2D. For DXIL, it requires intrinsics that are not yet available. They are added, but not implemented. Assisted-by: Gemini Co-authored-by: Helena Kotas <hekotas@microsoft.com>	2026-04-02 18:26:02 +00:00
Justin Stitt	43233b8aae	[Clang] Add missing __ob_trap check for sign change (#188340 ) Add a missing OBTrapInvolved check before EmitIntegerSignChangeCheck(). This is considered "missing" as a previous attempt (https://github.com/llvm/llvm-project/pull/185772) to properly add an `__ob_trap` backdoor missed this particular instance. This backdoor is needed because we want `__ob_trap` types to be picky about implicit conversions (including implicit sign change): ```c unsigned int __ob_trap big = 4294967295; (signed int)big; // should trap! ``` Move the `OBTrapInvolved` setup to the top of the function so it can be used in all the places we need it.	2026-04-02 10:51:46 -07:00
PiJoules	bd9e0e8fcf	[clang] Consolidate the relative vtable layout getters (#139315 ) We have 3 different getters to get the vtable component type. This consolidates them into just the one in LangOpts.	2026-04-02 10:33:05 -07:00
Steven Perron	905f23c9f8	[HLSL] Add CalculateLevelOfDetail methods to Texture2D (#188574 ) This adds the CalculateLevelOfDetail and CalculateLevelOfDetailUnclamped methods to Texture2D using the established pattern used for other methods. Assisted-by: Gemini	2026-04-02 08:58:11 -04:00
Mariya Podchishchaeva	329af7d2b7	[clang] Fix array filler lowering for _BitInt arrays (#189954 ) Sometimes we use array of bytes to represent `_BitInt` types in memory. When this is the case the lowered array filler expression reaches `ConstantEmitter::emitForMemory` already with memory type which will be array of i8 instead of a single iN, so `cast<llvm::ConstantInt>` was failing within `ConstantEmitter::emitForMemory`. This patch fixes the assertion failure by not attempting any type changes if the type is right already. Fixes https://github.com/llvm/llvm-project/issues/189643 Assisted-by: claude in FileCheck CHECK lines fixing	2026-04-02 14:01:45 +02:00
Henrich Lauko	57ee29a2a1	[CIR] Implement isMemcpyEquivalentSpecialMember for trivial copy/move ctors (#186700 ) Implements isMemcpyEquivalentSpecialMember in CIR codegen so that trivial copy/move constructors and defaulted union copy/move ops emit a cir.copy directly instead of making a real constructor call. The logic is shared with OG codegen by moving the implementation into ASTContext, where it also gains the pointer field protection (PFP) check that was previously missing in CIR.	2026-04-02 12:31:53 +02:00
Diego Novillo	06aae40c6d	[HLSL][SPIRV] Restore support for -g to generate NSDI (#190007 ) The original attempt (#187051) produced a regression for `intel-sycl-gpu` because `SPIRVEmitNonSemanticDI` will now self-activate whenever `llvm.dbg.cu` is present. This removed the need for the explicit `--spv-emit-nonsemantic-debug-info` flag. The pass is now entered unconditionally for all SPIR-V targets, but `NonSemantic.Shader.DebugInfo.100` requires the `SPV_KHR_non_semantic_info` extension. Targets like `spirv64-intel` do not enable that extension by default. When `checkSatisfiable()` ran on those targets, it issued a fatal error rather than silently skipping. Adds an early-out from `emitGlobalDI()`: if `SPV_KHR_non_semantic_info` is not available for the current target, the pass returns without emitting anything.	2026-04-01 21:00:36 -07:00
Aiden Grossman	09c4d8ce0f	[Clang] Fix miscompile with custom operator delete (#190017 ) See discussion in #183347. Added a separate test case rather than reusing destructor-dead-on-return.cpp as we need to test functionality of the deleting destructor which update_cc_test_checks.py does not add check lines for.	2026-04-01 19:25:43 +00:00
Jameson Nash	6c7c575556	[clang] fix OutputSemantic list in HLSL (#185550 ) Normally sane front-ends with the common calling-conventions avoid having multiple sret with a return value, so this is NFCI. However, multiple can be valid. This rewrites an odd looking DenseMap of one element that was needed for iteration into a more sensible vector. Noted in https://github.com/llvm/llvm-project/pull/181740 review.	2026-04-01 14:11:16 -04:00
Mirko Brkušanin	5d9eb0c76a	[AMDGPU] Define new targets gfx1171 and gfx1172 (#187735 )	2026-04-01 18:16:11 +02:00
Matt Arsenault	28efe7b554	OpenMP: Reimplement getOffloadArch (#189561 ) This function made no sense at all. It was scanning through the feature map looking for something that parsed as an OffloadArch. Directly compute the arch from the target device. I don't know why there isn't just an OffloadArch in TargetOpts, this shouldn't really require parsing.	2026-03-31 19:42:15 +02:00
Alexis Engelke	74e84c0cf5	[Clang] Fix getTerminator() use for -fasync-exceptions (#189644 )	2026-03-31 12:50:25 +00:00
Kewen Meng	1ec7e86b3a	Revert "[LTO][LLD] Prevent invalid LTO libfunc transforms (#164916 )" This reverts commit 8b21fe60b43fe358321bca904ae307406725c002. to unblock bot: https://lab.llvm.org/buildbot/#/builders/67/builds/1196	2026-03-30 22:25:25 -05:00
Aiden Grossman	12319b373a	[Clang] More aggressively mark this* dead_on_return in destructors (#183347 ) Now also mark the this pointer dead_on_return for classes with a non-zero number of base classes. We saw a limited number of failures internally due to this change, so it doesn't seem like there are too many problems with real world deployment.	2026-03-30 16:34:37 -07:00
Alex Voicu	18e6958903	[SPIRV][AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (#134016 ) This change adds two builtins for AMDGPU: - `__builtin_amdgcn_processor_is`, which is similar in observable behaviour with `__builtin_cpu_is`, except that it is never "evaluated" at run time; - `__builtin_amdgcn_is_invocable`, which is behaviourally similar with `__has_builtin`, except that it is not a macro (i.e. not evaluated at preprocessing time). Neither of these are `constexpr`, even though when compiling for concrete (i.e. `gfxXXX` / `gfxXXX-generic`) targets they get evaluated in Clang, so they shouldn't tear the AST too badly / at all for multi-pass compilation cases like HIP. They can only be used in specific contexts (as args to control structures). The motivation for adding these is two-fold: - as a nice to have, it provides an AST-visible way to incorporate architecture specific code, rather than having to rely on macros and the preprocessor, which burn in the choice quite early; - as a must have, it allows featureful AMDGCN flavoured SPIR-V to be produced, where target specific capability is guarded and chosen or discarded when finalising compilation for a concrete target; this is built atop the Specialization Constant concept which is described in the SPIR-V specification under section [2.12 Specialization](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_specialization_2) I've tried to keep the overall footprint of the change small. The changes to Sema are a bit unpleasant, but there was a strong desire to have Clang validate these, and to constrain their uses, and this was the most compact solution I could come up with (suggestions welcome). --------- Co-authored-by: Juan Manuel Martinez Caamaño <jmartinezcaamao@gmail.com> Co-authored-by: Voicu <avoicu@amd.com>	2026-03-30 23:02:26 +01:00
Daniel Thornburgh	8b21fe60b4	[LTO][LLD] Prevent invalid LTO libfunc transforms (#164916 ) In LTO, part of LLVM's middle-end runs after linking has finished. LTO's semantics depend on the complete set of extracted bitcode files being known at this time. If the middle-end inserts new calls to library functions (libfuncs) that are implemented in bitcode, this could extract new bitcode object files into the link. These cannot be compiled, leading to undefined symbol references. Additionally, the middle-end in LTO may reason that such library functions have no references, and it may internalize them, then manipulate their API or even delete them. Afterwards, it may emit a call to them, again producing undefined symbol references. This patch resolves the former issue by ensuring that the middle end emits no new references to symbols defined in bitcode, and it resolves the latter issue by ensuring that extracted bitcode for libfuncs is considered external, since new calls may be emitted to them at any time. The new semantics are not yet established for MachO LLD, which does not yet appear to have any special handling for libcalls in LTO. It also does not yet support distributed ThinLTO; doing so would require additional (de)serialization work. This is the patch referenced in @ilovepi's and my talk at the last LLVM devmeeting: "LT-Uh-Oh" Gemini 3.1 was used in porting to COFF and WASM LLDs.	2026-03-30 14:44:52 -07:00
Stanislav Mekhanoshin	5f99854d01	[AMDGPU] Drop A and B neg modifier from amdgcn_wmma_bf16_16x16x32_bf16 (#189468 ) Fixes: LCOMPILER-1673	2026-03-30 14:14:22 -07:00
Peter Rong	3e2f0bce95	[ObjCDirectPreconditionThunk] precondition check thunk generation (#170618 ) ## TL;DR This is a stack of PRs implementing features to expose direct methods ABI. You can see the RFC, design, and discussion [here](https://discourse.llvm.org/t/rfc-optimizing-code-size-of-objc-direct-by-exposing-function-symbols-and-moving-nil-checks-to-thunks/88866). https://github.com/llvm/llvm-project/pull/170616 Flag `-fobjc-direct-precondition-thunk` set up https://github.com/llvm/llvm-project/pull/170617 Code refactoring to ease later reviews https://github.com/llvm/llvm-project/pull/170618 Thunk generation https://github.com/llvm/llvm-project/pull/170619 Optimizations, some class objects can be known to be realized ## Implementation details ### Dispatching - `GetDirectMethodCallee` handles the dispatching logic. Previously we only need to call `GenerateDirectMethod` to get the declaration of a direct method. - `GenerateDirectMethod` first attempts to acquire the declaration of the implementation, and return it if the flag is not set. - Generate and return thunk if we can't dispatch to true implementation (i.e. we can't reason receiver is def not null or class object is not realized) ### Precondition check thunk generation - `GenerateObjCDirectThunk` generates the thunk, it is called on demand by `GetDirectMethodCallee` - Thunk inherits all attributes from the true implementation, see `StartObjCDirectThunk` for more detail. - `StartObjCDirectThunk` and `FinishObjCDirectThunk` follows the design pattern of `StartThunk` and `FinishThunk` in CGVTable. 
### Precondition check inline generation - If the function need to have precondition check inlined (`shouldHaveNilCheckInline`), caller will emit the nil check during `EmitMessageSend` - Class realization is generated inline - No extra nil check is generated - we reuse `NullReturnState` to emit the nil check for us, it already emits nil check at caller side to handle `ns_consumed`, we just need to tell `NullReturnState` to do the work by setting the flag `RequiresNullCheck \|= ReceiverCanBeNull;` ### Visibility and linkage - Visibility is still by default `Hidden`. But `StartObjCMethod` will now respect source level visibility attributes so methods with `__attribute((visibility("default"))` can be used in other linking units - Linkage is by default `External` ## Tests - `expose-direct-method.m` follow the example of `direct-method.m` - `direct-method-ret-mismatch.m` make sure we can handle the corner case - `expose-direct-method-consumed.m ` and `expose-direct-method-linkedlist.m` executable test on Mac only to validate ARC correctness - `expose-direct-method-varargs.m` - `expose-direct-method-visibility-linkage.m`	2026-03-30 10:32:09 -07:00
Alexis Engelke	7581430722	[IR] Require well-formed IR for BasicBlock::getTerminator (#189416 ) BasicBlock::getTerminator() is frequently called on valid IR, yet the function has to check that the last instruction is in fact a terminator, even in release builds. This check can only be optimized away when the instruction is dereferenced. Therefore, introduce the functions hasTerminator() and getTerminatorOrNull() as replacement and require (assert) that getTerminator() always returns a valid terminator. As a side effect, this forces explicit expression of intent at call sites when unfinished basic blocks should be supported.	2026-03-30 18:57:37 +02:00
Owen Anderson	3f2e24726a	[CHERI] Allow @llvm.clear_cache to accept pointers in address spaces other than 0. (#189283 ) Co-Authored-by: Jessica Clarke <jrtc27@jrtc27.com>	2026-03-30 09:20:49 +02:00
Henrik G. Olsson	bc12c38af9	[Clang] remove redundant uses of dyn_cast (NFC) (#189106 ) This removes dyn_cast invocations where the argument is already of the target type (including through subtyping). This was created by adding a static assert in dyn_cast and letting an LLM iterate until the code base compiled. I then went through each example and cleaned it up. This does not commit the static assert in dyn_cast, because it would prevent a lot of uses in templated code. To prevent backsliding we should instead add an LLVM aware version of https://clang.llvm.org/extra/clang-tidy/checks/readability/redundant-casting.html (or expand the existing one).	2026-03-27 17:11:23 -07:00
Kamran Yousafzai	1264ffc4cc	[clang][RISC-V] fixed fp calling convention for fpcc eligible structs for risc-v (#110690 ) The code generated for calls with FPCC eligible structs as arguments doesn't consider the bitfield, which results in a store crossing the boundary of the memory allocated using alloca, e.g. For the code: ``` struct __attribute__((packed, aligned(1))) S { const float f0; unsigned f1 : 1; }; unsigned func(struct S arg) { return arg.f1; } ``` The generated IR is: ``` define dso_local signext i32 @func( float [[TMP0:%.]], i32 [[TMP1:%.]]) #[[ATTR0:[0-9]+]] { [[ENTRY:.:]] [[ARG:%.]] = alloca [[STRUCT_S:%.]], align 1 [[TMP2:%.]] = getelementptr inbounds nuw { float, i32 }, ptr [[ARG]], i32 0, i32 0 store float [[TMP0]], ptr [[TMP2]], align 1 [[TMP3:%.]] = getelementptr inbounds nuw { float, i32 }, ptr [[ARG]], i32 0, i32 1 store i32 [[TMP1]], ptr [[TMP3]], align 1 [[F1:%.]] = getelementptr inbounds nuw [[STRUCT_S]], ptr [[ARG]], i32 0, i32 1 [[BF_LOAD:%.]] = load i8, ptr [[F1]], align 1 [[BF_CLEAR:%.]] = and i8 [[BF_LOAD]], 1 [[BF_CAST:%.]] = zext i8 [[BF_CLEAR]] to i32 ret i32 [[BF_CAST]] ``` Where, `store i32 [[TMP1]], ptr [[TMP3]], align 1` can be seen crossing the boundary of the allocated memory. If, the IR is seen after optimizations (EarlyCSEPass), the IR left is: ``` define dso_local noundef signext i32 @func( float [[TMP0:%.]], i32 [[TMP1:%.]]) local_unnamed_addr #[[ATTR0:[0-9]+]] { [[ENTRY:.:]] ret i32 0 ``` The patch trims the second member of the struct after taking into consideration the bitwidth to decide the appropriate integer type and the test shows the results of this patch. Note that the bug is seen only when `f` extension is enabled for FPCC eligibility. Co-authored-by: muhammad.kamran4 <muhammad.kamran@esperantotech.com>	2026-03-27 16:55:11 -07:00
Stanislav Mekhanoshin	a2d84b5d8d	[AMDGPU] Remove neg support from 4 more gfx1250 WMMA (#189115 ) These are previously covered by AMDGPUWmmaIntrinsicModsAllReuse.	2026-03-27 15:20:14 -07:00
Justin Stitt	f9ad232421	[Clang] Show inlining hints for __attribute__((warning/error)) (#174892 ) When functions marked with `[[gnu::warning/error]]` are called through inlined functions, we now show the inlining chain that led to the call when ``-fdiagnostics-show-inlining-chain`` is enabled. With this flag, two modes are possible: - heuristic mode: Uses `srcloc` and `inlined.from` metadata to reconstruct the inlining chain. Functions that are `inline`, `static`, `always_inline`, or in anonymous namespaces get `srcloc` metadata attached. This mode emits a note suggesting `-gline-directives-only` for more accurate locations. - debug mode: Automatically used instead of heuristic when building with at least `-gline-directives-only` (implied by `-g1` or higher). Leverages `DILocation` debug info for reliable source locations. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1571	2026-03-27 13:52:31 -07:00
Justin Stitt	8b395a7755	[Clang] Ensure pattern exclusion priority over OBT (#188390 ) Make sure pattern exclusions have priority over the overflow behavior types when deciding whether or not to emit truncation checks. Accomplish this by carrying an extra field through `ScalarConversionOpts` which we later check before emitting instrumentation.	2026-03-27 13:51:21 -07:00
Takashi Idobe	cbe9891b44	[Clang] fix bad codegen from constexpr structured bindings (#186594 ) Resolves: https://github.com/llvm/llvm-project/issues/164150 C++26 allows for constexpr packs in structured bindings. This is a new feature (the code doesn't compile with anything lower than -std=c++26) and so was previously unhandled in clang. This makes clang aware of packs and handles them as one constant unit instead of materializing them as separate mutable reference temporaries, allowing llvm to optimize them. This turns the example code from the issue into this, as you would expect, without compiling for zen 5 (the good codegen described). ```asm movq %rdi, %rax movups (%rsi), %xmm0 movups %xmm0, (%rdi) movups (%rdx), %xmm0 movups %xmm0, 16(%rdi) retq ```	2026-03-27 12:07:23 +08:00
Nick Sarnie	09951fd475	Revert "[HLSL][SPIRV] Add support for -g to generate NonSemantic Debug Info" (#188771 ) Reverts llvm/llvm-project#187051 Breaks some OpenMP offload tests	2026-03-26 18:58:47 +00:00
Sietze Riemersma	593f82ab9d	[HLSL][DXIL][SPRIV] Added `GroupMemoryBarrier()` (#185383 ) Adds the `GroupMemoryBarrier()` HLSL function to SPIRV and DirectX with additional tests for the different backends. When this moves in, will create another PR with this as a template for the other Barriers: - `AllMemoryBarrier()` #99076 - `AllMemoryBarrierWithGroupSync()` #99090 - `DeviceMemoryBarrier()` #99105 - `DeviceMemoryBarrierWithGroupSync()` #99106 `Barrier()` does not have support for SPIRV, so I will exclude that from the next PR. - [x] Implement GroupMemoryBarrier clang builtin, - [x] Link GroupMemoryBarrier clang builtin with hlsl_intrinsics.h - [x] Add sema checks for GroupMemoryBarrier to CheckHLSLBuiltinFunctionCall in SemaChecking.cpp - [x] Add codegen for GroupMemoryBarrier to EmitHLSLBuiltinExpr in CGBuiltin.cpp - [x] Add codegen tests to clang/test/CodeGenHLSL/builtins/GroupMemoryBarrier.hlsl - [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/GroupMemoryBarrier-errors.hlsl - [x] Create the int_dx_GroupMemoryBarrier intrinsic in IntrinsicsDirectX.td - [x] Create the DXILOpMapping of int_dx_GroupMemoryBarrier to 80 in DXIL.td - [x] Create the GroupMemoryBarrier.ll and GroupMemoryBarrier_errors.ll tests in llvm/test/CodeGen/DirectX/ - [x] Create the int_spv_GroupMemoryBarrier intrinsic in IntrinsicsSPIRV.td - [x] In SPIRVInstructionSelector.cpp create the GroupMemoryBarrier lowering and map it to int_spv_GroupMemoryBarrier in SPIRVInstructionSelector::selectIntrinsic. - [x] Create SPIR-V backend test case in llvm/test/CodeGen/SPIRV/hlsl-intrinsics/GroupMemoryBarrier.ll	2026-03-26 13:24:27 -04:00
Ivan R. Ivanov	19420c0e77	[OpenMP] Fix non-contiguous array omp target update (#156889 ) The existing implementation has three issues which this patch addresses. 1. The last dimension which represents the bytes in the type, has the wrong stride and count. For example, for a 4 byte int, count=1 and stride=4. The correct representation here is count=4 and stride=1 because there are 4 bytes (count=4) that we need to copy and we do not skip any bytes (stride=1). 2. The size of the data copy was computed using the last dimension. However, this is incorrect in cases where some of the final dimensions get merged into one. In this case we need to take the combined size of the merged dimensions, which is (Count * Stride) of the first merged dimension. 3. The Offset into a dimension was computed as a multiple of its Stride. However, this Stride which is in bytes, already includes the stride multiplier given by the user. This means that when the user specified 1:3:2, i.e. elements 1, 3, 5, the runtime incorrectly copied elements 2, 4, 6. Fix this by precomputing at compile time the Offset to be in bytes by correctly multiplying the offset by the stride of the dimension without the user-specified multiplier.	2026-03-26 15:55:31 +01:00
Dmitry Sidorov	82d0173f72	[HIP][CUDA] Apply protected visibility to kernels and globals (#187784 ) Add the visibility override in setGlobalVisibility(), following the existing OpenMP precedent. Unlike the AMDGPU post-hoc override, this check respects explicit [[gnu::visibility("hidden")]] attributes via isVisibilityExplicit().	2026-03-26 13:57:41 +00:00
Stanislav Mekhanoshin	e69c7312f3	[AMDGPU] Disable neg_lo[0:1] and neg_hi[0:1] on wmma_f32_16x16x32_bf16 (#188649 ) This is the pilot change, the rest will follow the same idea.	2026-03-26 00:37:05 -07:00
Akira Hatanaka	154d2267b8	[ObjC] Emit number, array, and dictionary literals as constants (#185130 ) When targeting runtimes that support constant literal classes, emit ObjC literal expressions @(number), @[], and @{} as compile-time constant data structures rather than runtime msgSend calls. This reduces code size and runtime overhead at the cost of increased data segment size, and avoids repeated heap allocation of equivalent literal objects. The feature is not supported with the fragile ABI or GNU runtimes, where it is automatically disabled. The feature can be disabled altogether with -fno-objc-constant-literals, or individually per literal kind: -fno-constant-nsnumber-literals -fno-constant-nsarray-literals -fno-constant-nsdictionary-literals Custom backing class names can be specified via: -fconstant-array-class=<name> -fconstant-dictionary-class=<name> -fconstant-integer-number-class=<name> -fconstant-float-number-class=<name> -fconstant-double-number-class=<name> rdar://45380392 rdar://168106035 --------- Co-authored-by: Ben D. Jones <bendjones@apple.com>	2026-03-25 15:03:14 -07:00
Kai	a546c77478	[HLSL][DXIL][SPIRV] QuadReadAcrossY intrinsic support (#187440 ) This PR adds QuadReadAcrossY intrinsic support in HLSL with codegen for both DirectX and SPIRV backends. Resolves https://github.com/llvm/llvm-project/issues/99176. - [x] Implement `QuadReadAcrossY` clang builtin, - [x] Link `QuadReadAcrossY` clang builtin with `hlsl_intrinsics.h` - [x] Add sema checks for `QuadReadAcrossY` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp` - [x] Add codegen for `QuadReadAcrossY` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp` - [x] Add codegen tests to `clang/test/CodeGenHLSL/builtins/QuadReadAcrossY.hlsl` - [x] Add sema tests to `clang/test/SemaHLSL/BuiltIns/QuadReadAcrossY-errors.hlsl` - [x] Create the `int_dx_QuadReadAcrossY` intrinsic in `IntrinsicsDirectX.td` - [x] Create the `DXILOpMapping` of `int_dx_QuadReadAcrossY` to `123` in `DXIL.td` - [x] Create the `QuadReadAcrossY.ll` and `QuadReadAcrossY_errors.ll` tests in `llvm/test/CodeGen/DirectX/` - [x] Create the `int_spv_QuadReadAcrossY` intrinsic in `IntrinsicsSPIRV.td` - [x] In SPIRVInstructionSelector.cpp create the `QuadReadAcrossY` lowering and map it to `int_spv_QuadReadAcrossY` in `SPIRVInstructionSelector::selectIntrinsic`. - [x] Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/QuadReadAcrossY.ll`	2026-03-25 11:32:35 -07:00
Diego Novillo	85049fc357	[HLSL][SPIRV] Add support for -g to generate NonSemantic Debug Info (#187051 ) This adds two related changes to HLSL debug info support in the SPIR-V backend. It's a first small step towards the plan I described in https://discourse.llvm.org/t/hlsl-spirv-nsdi-debug-info-support-for-clang-dxc/90149. ## Tag HLSL shaders with `DW_LANG_HLSL` in the front-end `GetSourceLanguage()` in `clang/lib/CodeGen/CGDebugInfo.cpp` checked `LO.CPlusPlus` before `LO.HLSL`. Since HLSL is compiled as C++, the HLSL check was never reached. Shaders compiled with `-g` were tagged with `DW_LANG_C_plus_plus_14` instead of `DW_LANG_HLSL`. The NSDI pass already had the correct mapping for `DW_LANG_HLSL` but it was never triggered. This fixes #136929 and #136995. ## Make `SPIRVEmitNonSemanticDI` activate automatically when `-g` is used `SPIRVPassConfig::addPreEmitPass()` only scheduled `SPIRVEmitNonSemanticDI` when `--spv-emit-nonsemantic-debug-info` was set or the target vendor was AMD. Passing `-g` to clang had no effect on the SPIR-V backend pass. The pass is now added unconditionally and self-activates by checking for `llvm.dbg.cu` in the module. When no debug metadata is present it exits early with no effect. This avoids the need to inspect module metadata at pass-configuration time, which is not reliably available. `--spv-emit-nonsemantic-debug-info` is now a deprecated synonym for `-g`. The alternative to the unconditional pass approach is to check at pass-configuration time whether the module was compiled with debug info (e.g. via `TargetOptions::DebugInfoForProfiling` or a similar flag forwarded from the driver). I went with the unconditional approach because it is simpler and the pass is cheap to enter and exit when no `llvm.dbg.cu` is present. I'm not sure whether adding a pass unconditionally is acceptable. Does this sound reasonable, or would it be better to implement the flag-forwarding approach? 
Changes to tests: - `clang/test/CodeGenHLSL/` (new): verifies that `-g` on an HLSL SPIR-V target produces `DebugCompilationUnit` with language code 5 (`DW_LANG_HLSL`). - `llvm/test/CodeGen/SPIRV/debug-info/hlsl-debug-info-auto-activation.ll` (new): verifies that a module with `llvm.dbg.cu` and `DW_LANG_HLSL` produces `DebugCompilationUnit` without `--spv-emit-nonsemantic-debug-info`. - Existing `debug-compilation-unit.ll`, `debug-type-basic.ll`, `debug-type-pointer.ll`: updated to verify NSDI is emitted whenever debug metadata is present. - `llc-pipeline.ll`: updated to reflect that `SPIRVEmitNonSemanticDI` is now always in the pipeline. --------- Co-authored-by: Eric Christopher <echristo@gmail.com>	2026-03-25 11:17:09 -07:00
Jonas Devlieghere	4fe46edf8d	[lldb] Support arm64e C++ vtable pointer signing (#187611 ) When targeting arm64e, vtable pointers are signed with a discriminator that incorporates the object's address (PointerAuthVTPtrAddressDiscrimination) and class type (PointerAuthVTPtrTypeDiscrimination). I had to make a small change to clang, specifically in getPointerAuthDeclDiscriminator(). Previously, that was computing the discriminator based on getMangledName(). The latter returns the AsmLabelAttr, which for functions imported by lldb, is prefixed with `$__lldb_func`, causing a different discriminator to be generated.	2026-03-25 12:08:02 -05:00
Gang Chen	32720fba18	[clang] Pragma for llvm.loop.licm.disable (#188108 ) llvm.loop.licm.disable is already available at LLVM-IR level to disable LICM per loop. This PR simply exposes that capability to the developers at clang level.	2026-03-25 09:12:21 -07:00
Nathan Gauër	18250fd47b	[HLSL][SPIR-V] Add vk::ext_builtin_output attribute (#188268 ) This attribute is similar to the already implemented ext_builtin_input attribute. One important bit is the `static` storage class: HLSL uses static differently than C/C++. This is a known weirdness: See https://github.com/microsoft/hlsl-specs/issues/350 In C/C++, when we declare a variable as 'extern', we often expect another module to declare the symbol. In HLSL, the pipeline will 'declare' the symbol. Hence in this case, we need to emit the global variable. Related WG-HLSL: https://github.com/llvm/wg-hlsl/blob/main/proposals/0031-semantics.md --------- Co-authored-by: Steven Perron <stevenperron@google.com>	2026-03-25 16:07:35 +00:00
Owen Anderson	ca9ac0e24a	[CHERI] Allow @llvm.returnaddress to return a pointer in any address space. (#188464 ) Clang now constructs calls to it using the default program address space from the DataLayout. Co-authored-by: Alex Richardson <alexrichardson@google.com>	2026-03-25 13:59:38 +00:00
Craig Topper	6c6b4c154c	[RISCV] Disable rounding of aggregate return/arguments to iXLen. (#184736 ) If the type is rounded to iXLen, an additional zext instruction is generated. For example, https://godbolt.org/z/bG7vG4dvM	2026-03-23 19:39:21 -07:00
Joshua Batista	54a3518fc3	[HLSL] Add WaveActiveBitAnd builtin function (#187149 ) This PR adds the WaveActiveBitAnd HLSL function. Fixes https://github.com/llvm/llvm-project/issues/99166	2026-03-23 10:19:18 -07:00
Alex MacLean	a9775221ae	[NVPTX] Canonicalize NVVM attribute strings and refactor property queries (NFC) (#187752 )	2026-03-23 08:07:09 -07:00

18859 Commits