llvm-project

Author	SHA1	Message	Date
Jean-Didier PAILLEUX	0625467c63	[Flang] Fix lowering failure for some constructs inside a CHANGE TEAM (#184342 ) This PR is here to fix the `CHANGE_TEAM` construct if it contains an IF/ELSE (construct with a body too) in its body, for example.	2026-04-02 15:54:13 +02:00
Dhruv Chauhan	b87be02cc7	Revert "[mlir][tensor] Forward concat insert_slice destination into DPS provider" (#190143 ) This reverts commit 1418f80. The change can cause an infinite rewrite loop when ForwardConcatInsertSliceDest interacts with FoldEmptyTensorWithExtractSliceOp.	2026-04-02 14:48:44 +01:00
Alexey Bataev	c2f97c5917	[SLP] Do not skip tiny trees with gathered loads to vectorize The isTreeTinyAndNotFullyVectorizable check for 2-node trees (insertelement root + gather child) was too aggressive: it rejected trees even when LoadEntriesToVectorize was non-empty, preventing gathered loads from being vectorized into masked loads/strided loads, etc. Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/190181	2026-04-02 09:47:01 -04:00
Charles Zablit	aff601aed8	[lldb][windows] fix duplicate OnLoadModule events (#189376 )	2026-04-02 14:46:00 +01:00
Stefan Gränitz	60efb5c0a3	[llvm] Fix SupportHTTP linkage with libLLVM in unit-tests (#190097 ) Since libSupportHTTP is part of the LLVM dylib, we must link it as a component now. Fixes https://github.com/llvm/llvm-project/issues/189978	2026-04-02 15:37:08 +02:00
Ryotaro Kasuga	682a217d74	[DA] Extract the logic shared by the Exact SIV/RDIV test (#189951 ) The Exact SIV test and the Exact RDIV test behave almost identically, except that the Exact SIV test also explores the directions in the final step. This patch consolidates the two duplicate implementations into a single function that can be used by both tests. While this change slightly affects things like debug output and metrics, it is not intended to alter the actual test results.	2026-04-02 13:30:42 +00:00
Alexey Bataev	dc2d25f80b	Revert "[SLP] Do not skip tiny trees with gathered loads to vectorize" This reverts commit 94ec7ffa46d351b86fbbe3a445ceef37f331c4a2 to fix reported issue https://github.com/llvm/llvm-project/pull/190040#issuecomment-4177827078 Reviewers: Pull Request: https://github.com/llvm/llvm-project/pull/190176	2026-04-02 09:26:31 -04:00
Rahul Joshi	99786f20ee	[LLVM][Intrinsics] Refactor `IITDescriptor` (#190011 ) The main change is to eliminate the use of "Argument" terminology when dealing with overloaded types since overloaded types can be either argument or return values, and some additional renaming for clarity. 1. Rename `Tys` argument to various intrinsic APIs to `OverloadTys` to better reflect its meaning. 2. Rename `IITDescriptorKind::Argument` to `IITDescriptorKind::Overloaded` to better convey that it's an overloaded type. Removed "Argument" suffix for other kinds for dependent types. 3. Rename `ArgKind` to `AnyKind`, `getArgumentNumber` to `getOverloadIndex`, `getArgumentKind` to `getOverloadKind`, `getRefArgNumber` to `getRefOverloadIndex`, and `IIT_ARG` to `IIT_ANY`. 4. Rename `IIT_ANYPTR` (used to represent a pointer qualified with address space) to `IIT_PTR_AS` to clearly distinguish it from `llvm_anyptr_ty` 5. Change the packing of [ref overload index & overload index] for `VecOfAnyPtrsToElt` to pack the overload index into the lower bits, so we can use the `getOverloadIndex` function to get the overload index.	2026-04-02 06:19:01 -07:00
Matt Arsenault	862ceaa793	clang/AMDGPU: Use f64 exp10 builtin in hip math headers (#185947 )	2026-04-02 15:04:12 +02:00
Erich Keane	58b719660c	[CIR] Implement union aggregate init (#190057 ) This ends up being a pretty trivial amount of work, since we just have to forward the initialization for a union on to the 'active' field, which this patch does.	2026-04-02 06:00:36 -07:00
Erich Keane	710d647586	[CIR] Implement 'null' function-pointer vtable entries (#190013 ) This functionality is described in the Itanium C++ABI 2.5.2 (and is also where the test comes from). See also VTableBuilder.cpp's documentation on the declaration of IsOverriderUsed for further details. However, the explanation is: When B and C are declared, A is a primary base in each case, so although vcall offsets are allocated in the A-in-B and A-in-C vtables, no this adjustment is required and no thunk is generated. However, inside D objects, A is no longer a primary base of C, so if we allowed calls to C::f() to use the copy of A's vtable in the C subobject, we would need to adjust this from C* to B::A, which would require a third-party thunk. Since we require that a call to C::f() first convert to A, C-in-D's copy of A's vtable is never referenced, so this is not necessary. The short of that is: there is no way to call these, so we just emit a nullptr rather than the required thunk.	2026-04-02 06:00:23 -07:00
Steven Perron	905f23c9f8	[HLSL] Add CalculateLevelOfDetail methods to Texture2D (#188574 ) This adds the CalculateLevelOfDetail and CalculateLevelOfDetailUnclamped methods to Texture2D using the established pattern used for other methods. Assisted-by: Gemini	2026-04-02 08:58:11 -04:00
Ramkumar Ramachandra	b0230f5996	[VPlan] Cleanup and generalize VPPhiAccessors CastInfo (NFC) (#190027 )	2026-04-02 13:47:44 +01:00
Balázs Benics	3200d64c33	[clang][ssaf] Fix nondeterministic test failures of clang/test/Analysis/Scalable/call-graph.cpp (#190155 ) fixup! [clang][ssaf] Implement JSON format for CallGraph summary The order of checks was unspecified in the test. Consequently, "polymorphic" may have appeared before "caller", failing the test. This patch splits the test into separate files so the checks no longer interfere with each other. Example bot failures: https://lab.llvm.org/buildbot/#/builders/225/builds/4974 https://lab.llvm.org/buildbot/#/builders/144/builds/50732 https://lab.llvm.org/buildbot/#/builders/190/builds/39773 https://lab.llvm.org/buildbot/#/builders/46/builds/33213 Fixes up #189681	2026-04-02 13:11:16 +01:00
Mariya Podchishchaeva	329af7d2b7	[clang] Fix array filler lowering for _BitInt arrays (#189954 ) Sometimes we use array of bytes to represent `_BitInt` types in memory. When this is the case the lowered array filler expression reaches `ConstantEmitter::emitForMemory` already with memory type which will be array of i8 instead of a single iN, so `cast<llvm::ConstantInt>` was failing within `ConstantEmitter::emitForMemory`. This patch fixes the assertion failure by not attempting any type changes if the type is right already. Fixes https://github.com/llvm/llvm-project/issues/189643 Assisted-by: claude in FileCheck CHECK lines fixing	2026-04-02 14:01:45 +02:00
dibrinsofor	eaa3ef9ddc	[DAG] Propagate OrZero and DemandedElts for min/max in isKnownToBeAPowerOfTwo (#182369 ) Fixes #181643 For queries like `isKnownToBeAPowerOfTwo(V, OrZero=true)`, if an operand is known to be "pow2-or-zero" but not strictly a non-zero power of two, the min/max case currently returns false even when the result remains pow2-or-zero. For instance: - `A = select cond, 4, 0` (A is pow2-or-zero) - `R = umin(A, 16)` `R` is always in `{0, 4}`, so querying `isKnownToBeAPowerOfTwo(R, OrZero=true)` should return true. Added unit tests for the baseline and failing cases; `OrZero` and `DemandedElts` are now propagated correctly.	2026-04-02 12:50:11 +01:00
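The pow2-or-zero reasoning in the entry above can be checked with a small standalone sketch. It is not the SelectionDAG code: `isPow2OrZero`, `isStrictPow2`, and `minMaxPreservePow2OrZero` are hypothetical helper names chosen for illustration, and the check runs on plain 32-bit integers rather than on DAG nodes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// True when v is zero or has exactly one bit set (the "OrZero=true" notion).
constexpr bool isPow2OrZero(std::uint32_t v) { return (v & (v - 1)) == 0; }

// True only for a non-zero power of two (the "OrZero=false" notion).
constexpr bool isStrictPow2(std::uint32_t v) { return v != 0 && isPow2OrZero(v); }

// Checks every pair of pow2-or-zero 32-bit values: unsigned min and max
// always return one of their operands, so the result stays pow2-or-zero.
inline bool minMaxPreservePow2OrZero() {
  std::vector<std::uint32_t> values{0};
  for (unsigned k = 0; k < 32; ++k)
    values.push_back(std::uint32_t(1) << k);
  for (std::uint32_t a : values)
    for (std::uint32_t b : values)
      if (!isPow2OrZero(std::min(a, b)) || !isPow2OrZero(std::max(a, b)))
        return false;
  return true;
}
```

With `A` in `{0, 4}`, `std::min(A, 16u)` is either 0 or 4: pow2-or-zero in both cases, but not a strict power of two when `A` is 0, which is why the `OrZero` flag has to be carried through the min/max case.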
Simon Pilgrim	7410a81fbd	[X86] LowerShiftByScalarImmediate - vXi8 shl(X,2) - prefer PADDB+PADDB pair over PSLLW+PAND (#186095 ) For all targets, (V)PADDB is always as fast as (V)PSLLW (usually faster) - and usually as fast as (V)PAND, and avoids having to load a mask - so for shift lefts by 2, a pair of (V)PADDB is a better choice vs (V)PSLLW+(V)PAND This is only necessary if we're avoiding a (V)PAND mask - otherwise we just need a single (V)PSLLW.	2026-04-02 12:48:18 +01:00
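The equivalence behind the entry above can be modelled in scalar C++. This is only a sketch of the arithmetic, not the X86 lowering: one 16-bit lane holding two bytes stands in for a vector register, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Model of the PADDB + PADDB pair: doubling a byte twice is a left shift by 2,
// with overflow discarded per byte.
constexpr std::uint8_t shl2ViaAdds(std::uint8_t x) {
  std::uint8_t doubled = static_cast<std::uint8_t>(x + x);
  return static_cast<std::uint8_t>(doubled + doubled);
}

// Apply the byte-wise doubling to both bytes of a 16-bit lane.
constexpr std::uint16_t shl2LaneViaAdds(std::uint16_t lane) {
  std::uint8_t lo = shl2ViaAdds(static_cast<std::uint8_t>(lane & 0xFF));
  std::uint8_t hi = shl2ViaAdds(static_cast<std::uint8_t>(lane >> 8));
  return static_cast<std::uint16_t>((hi << 8) | lo);
}

// Model of PSLLW + PAND: shift the whole 16-bit lane, then mask off the two
// bits that leaked from the low byte into the high byte.
constexpr std::uint16_t shl2LaneViaWordShift(std::uint16_t lane) {
  return static_cast<std::uint16_t>((lane << 2) & 0xFCFC);
}

// Exhaustively compares both strategies over every 16-bit lane value.
inline bool shiftStrategiesAgree() {
  for (std::uint32_t lane = 0; lane <= 0xFFFF; ++lane)
    if (shl2LaneViaAdds(static_cast<std::uint16_t>(lane)) !=
        shl2LaneViaWordShift(static_cast<std::uint16_t>(lane)))
      return false;
  return true;
}
```

The word-shift variant needs the `0xFCFC` constant, which corresponds to the mask load that the add pair avoids.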
Balázs Benics	803d1d6609	[clang][ssaf] Implement JSON format for CallGraph summary (#189681 ) rdar://170258016	2026-04-02 12:14:46 +01:00
Kseniya Tikhomirova	c4b0f9959a	[libsycl] Add device image registration & compatibility check (#187528 ) This is part of the SYCL support upstreaming effort. The relevant RFCs can be found here: https://discourse.llvm.org/t/rfc-add-full-support-for-the-sycl-programming-model/74080 https://discourse.llvm.org/t/rfc-sycl-runtime-upstreaming/74479 --------- Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>	2026-04-02 13:06:44 +02:00
Nikita Popov	72e8c9b78f	[CodeGen] Move llc-start-stop.ll test to X86 (#190151 ) The pass pipeline differs across targets, so make this test use one specific pipeline, instead of trying to cater to cross-target differences. Those differences are not relevant to the intent of the test.	2026-04-02 13:04:29 +02:00
Alexandros Lamprineas	64b728128d	[BOLT][AArch64] Add minimal support for liveness analysis. (#183298 ) In this patch I am adding the missing target hooks required for the liveness analysis to run on AArch64. These are - getFlagsReg() - getRegsUsedAsParams() - getDefaultLiveOut() - getGPRegs() - isCleanRegXOR() I am also introducing the following API in LivenessAnalysis - BitVector getLiveIn/Out(const MCInst &) - MCPhysReg scavengeRegFromState(BitVector &) My intention is to allow the LongJmp pass scavenge usable registers when injecting code.	2026-04-02 11:59:59 +01:00
Arseniy Zaostrovnykh	3468ee025e	fixup! [publish-sphinx-docs] Forward CTU-import failure conditions Fix the Sphinx build failure https://lab.llvm.org/buildbot/#/builders/45/builds/24533 introduced in e3cfcf48d0c966f163c9807839a900936af0e759 (PR #189064) by using a dedicated group for CTU remarks.	2026-04-02 10:52:46 +00:00
Michael Kruse	bed2761bc0	[Polly] Print params with stmt tracing (#189362 ) It was helpful for #189350.	2026-04-02 10:51:39 +00:00
Alexey Bataev	94ec7ffa46	[SLP] Do not skip tiny trees with gathered loads to vectorize The isTreeTinyAndNotFullyVectorizable check for 2-node trees (insertelement root + gather child) was too aggressive: it rejected trees even when LoadEntriesToVectorize was non-empty, preventing gathered loads from being vectorized into masked loads/strided loads, etc. Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/190040	2026-04-02 06:47:53 -04:00
Hristo Hristov	5613f77293	[libc++][ranges] Updated `ranges::to` `[[nodiscard]]` tests (#173574 ) - Updated and moved `ranges::to` `[[nodiscard]]` tests to the new conventional location. - Removed libcxx/test/libcxx/diagnostics/ranges.nodiscard.verify.cpp, which is now obsolete in favor of more specific test files. # References: - https://wg21.link/range.utility.conv - https://wg21.link/range.utility.conv.to - https://wg21.link/range.utility.conv.adaptors Towards #172124	2026-04-02 13:46:06 +03:00
Naveen Seth Hanig	0ef3906aea	[clang][modules-driver] Avoid copy of -cc1 command & const correctness (NFC) (#190142 )	2026-04-02 10:44:50 +00:00
Valeriy Savchenko	f51e343ed5	[AArch64] Select REV16 for zext(bswap(i16)) (#189576 ) Extend the existing any_extend(bswap i16) -> rev16 combine to also handle zero_extend. REV16 preserves a zero upper half, so for i16 loads this saves one instruction: ldrh+rev+lsr#16 -> ldrh+rev16.	2026-04-02 11:32:39 +01:00
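The claim in the entry above, that REV16 preserves a zero upper half, can be illustrated with a bit-level model. This is a sketch on plain integers, not the AArch64 selection code; `rev32`, `rev16`, and `bswap16` are hypothetical helpers that mimic the 32-bit forms of the instructions.

```cpp
#include <cassert>
#include <cstdint>

// Model of REV on a 32-bit register: reverse all four bytes.
constexpr std::uint32_t rev32(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Model of REV16 on a 32-bit register: swap the bytes inside each halfword.
constexpr std::uint32_t rev16(std::uint32_t v) {
  return ((v & 0x00FF00FFu) << 8) | ((v & 0xFF00FF00u) >> 8);
}

// Reference semantics of bswap on an i16.
constexpr std::uint16_t bswap16(std::uint16_t v) {
  return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// For every zero-extended 16-bit value, both instruction sequences must
// produce zext(bswap16(x)).
inline bool rev16MatchesZextBswap() {
  for (std::uint32_t x = 0; x <= 0xFFFF; ++x) {
    std::uint32_t expected = bswap16(static_cast<std::uint16_t>(x));
    if (rev16(x) != expected)           // ldrh + rev16
      return false;
    if ((rev32(x) >> 16) != expected)   // ldrh + rev + lsr #16
      return false;
  }
  return true;
}
```

Because the upper halfword of a zero-extended load is zero, swapping its bytes leaves it zero, so the single `rev16` yields the same register value as `rev` followed by the shift.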
Michael Kruse	afb80bddf1	[Runtimes] Introduce variables containing resource dir paths (#177953 ) Introduce common infrastructure for runtimes that determines compiler resource path locations. The variables introduced are: * RUNTIMES_OUTPUT_RESOURCE_DIR * RUNTIMES_INSTALL_RESOURCE_PATH These contain the location of the compiler resource path (typically `lib/clang/<version>`) in the build tree and the install tree (the latter relative to CMAKE_INSTALL_PREFIX). Additionally, define * RUNTIMES_OUTPUT_RESOURCE_LIB_DIR * RUNTIMES_INSTALL_RESOURCE_LIB_PATH for the location of clang/flang version-locked libraries (typically `lib${LLVM_LIBDIR_SUFFIX}/<target-triple>`, but also depends on `APPLE` and `LLVM_ENABLE_PER_TARGET_RUNTIME_DIR`). This code is moved from flang-rt, which initially becomes its only user. Refactored out of #171610 as requested [here](https://github.com/llvm/llvm-project/pull/171610#discussion_r2687382481). Extracted `get_runtimes_target_libdir_common` from compiler-rt as requested [here](https://github.com/llvm/llvm-project/pull/171610#discussion_r2689565634). Added TODO comments to all runtimes as requested [here](https://github.com/llvm/llvm-project/pull/171610#issuecomment-3789598635).	2026-04-02 10:32:14 +00:00
Henrich Lauko	57ee29a2a1	[CIR] Implement isMemcpyEquivalentSpecialMember for trivial copy/move ctors (#186700 ) Implements isMemcpyEquivalentSpecialMember in CIR codegen so that trivial copy/move constructors and defaulted union copy/move ops emit a cir.copy directly instead of making a real constructor call. The logic is shared with OG codegen by moving the implementation into ASTContext, where it also gains the pointer field protection (PFP) check that was previously missing in CIR.	2026-04-02 12:31:53 +02:00
Nerixyz	91b90652bb	Reland "[CodeView] Generate `S_DEFRANGE_REGISTER_REL_INDIR`" (#189401 ) Initially added in #187709. It was reverted in #188833, because [llvm-clang-x86_64-sie-win](https://lab.llvm.org/buildbot/#/builders/46/builds/32873) was failing in `cross-project-tests/debuginfo-tests/dexter-tests/nrvo.cpp`. The test passed for me locally. After checking on another machine, I found that `S_DEFRANGE_REGISTER_REL_INDIR` is only supported by dbgeng/WinDbg from Windows 10.0 Build 19041 (released 2020) onwards. SDKs before this will fail to read the value. That buildbot is on Windows 10.0 Build 17763. I'm not sure if we should make the generation of that record conditional. Debuggers that can't read the record will skip it. They'll still see that there's some local variable, but won't be able to display the value. As far as I know, users of older Windows 10 builds should be able to install a newer Windows SDK and use the WinDbg from that version. But I haven't tested that.	2026-04-02 12:15:11 +02:00
David Spickett	c329cc59d9	[lldb][test][NFC] Move register command tests (#190144 ) For whatever reason we ended up with register/register but the first register just had the second register folder in it. Move the files up one level so we have register/<test files>.	2026-04-02 11:13:44 +01:00
Ricardo Jesus	9ff2ef9711	[AArch64][SVE] Define pseudos for arithmetic immediate instructions. (#188579 ) This patch uses DestructiveBinaryShImmUnpred (which was previously unused as far as I could tell) to define pseudos for arithmetic immediate instructions such as ADD (immediate), which allows using MOVPRFX with these instructions.	2026-04-02 11:07:46 +01:00
Jiachen Yuan	d0bf354828	[ADT] Reinstate "Refactor Bitset to Be More Constexpr-Usable" (#189497 ) Reland of #172062 (a71b1d2), which was reverted in b0234d1. This patch makes essential Bitset member functions constexpr (`set()`, `any()`, `none()`, `count()`, `operator==`, `!=`, `<`, `~`) and adds a new `all()` method. It also introduces a `maskLastWord()` invariant to ensure unused high bits in the last word are always zero, which is required for correctness of `operator~`, `set()`, `all()`, and comparisons on non-word-aligned sizes (e.g., `Bitset<33>`). Changes from the original reverted PR: - Replaced `llvm::any_of` with an inline loop to avoid depending on constexpr `any_of`/`none_of` from `STLExtras` (#172536), which was also reverted due to a GCC 15.2.1 bootstrap miscompile. - The patch is now fully self-contained with no prerequisite changes. Motivation: This is a prerequisite for making `LaneBitmask` a wrapper around `Bitset`, enabling scalable lane bitmasks beyond 64 bits (https://discourse.llvm.org/t/rfc-out-of-lanebitmask-bits-again/88613).	2026-04-02 11:50:10 +02:00
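The last-word invariant described in the entry above can be sketched with a toy class. `TinyBitset` is a hypothetical stand-in written for illustration, not `llvm::Bitset`; it assumes 64-bit words and only covers the operations needed to show why the unused high bits must stay zero.

```cpp
#include <cassert>
#include <cstdint>

template <unsigned N> class TinyBitset {
  static_assert(N > 0, "sketch assumes at least one bit");
  static constexpr unsigned WordBits = 64;
  static constexpr unsigned NumWords = (N + WordBits - 1) / WordBits;
  std::uint64_t Words[NumWords] = {};

  // Mask selecting the valid bits of the last word.
  static constexpr std::uint64_t lastWordMask() {
    return N % WordBits == 0 ? ~std::uint64_t(0)
                             : (std::uint64_t(1) << (N % WordBits)) - 1;
  }
  // Re-establish the invariant: bits at index >= N are always zero.
  constexpr void maskLastWord() { Words[NumWords - 1] &= lastWordMask(); }

public:
  constexpr TinyBitset() = default;

  constexpr TinyBitset &set() {
    for (unsigned I = 0; I < NumWords; ++I)
      Words[I] = ~std::uint64_t(0);
    maskLastWord(); // without this, count() would see the padding bits
    return *this;
  }
  constexpr unsigned count() const {
    unsigned C = 0;
    for (unsigned I = 0; I < NumWords; ++I)
      for (std::uint64_t W = Words[I]; W != 0; W &= W - 1)
        ++C; // clears the lowest set bit each iteration
    return C;
  }
  constexpr bool none() const { return count() == 0; }
  // Only correct because padding bits are guaranteed to be zero.
  constexpr bool all() const { return count() == N; }

  constexpr TinyBitset operator~() const {
    TinyBitset R;
    for (unsigned I = 0; I < NumWords; ++I)
      R.Words[I] = ~Words[I];
    R.maskLastWord(); // inversion would otherwise set the padding bits
    return R;
  }
  constexpr bool operator==(const TinyBitset &O) const {
    for (unsigned I = 0; I < NumWords; ++I)
      if (Words[I] != O.Words[I])
        return false;
    return true;
  }
};

// Usable in constant expressions, including on a non-word-aligned size.
static_assert((~TinyBitset<33>()).count() == 33, "padding bits must be masked");
```

Dropping either `maskLastWord()` call would make `~TinyBitset<33>()` report 64 set bits and compare unequal to a fully set bitset, which is the class of bug the invariant guards against.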
Simi Pallipurath	dc9be4ee30	[LLD][ELF] Skip non-inputsections to avoid invalid cast in Arm BE8 handling (#188154 ) This patch fixes https://github.com/llvm/llvm-project/issues/187033 In BE8 mode, instruction bytes are reversed for sections containing code. This logic currently assumes that arm mapping symbols (e.g. $a, $t, $d) are always associated with InputSections. However, mapping symbols can also be defined in other section types such as mergeable sections (SHF_MERGE). These are not represented as InputSection, and attempting to cast them using cast_if_present<InputSection> results in an assertion failure.	2026-04-02 10:16:54 +01:00
Alexandros Lamprineas	4c9a739c5e	[BOLT][AArch64] Strip unneeded labels from FEAT_CMPBR tests. (#189931 ) Eliminates the temporary labels so that BOLT does not recognize them as secondary entry points.	2026-04-02 10:16:41 +01:00
Ramkumar Ramachandra	d835dd2b43	[LV] Strip createStepForVF (NFC) (#185668 ) The mul -> shl simplification is already done in VPlan.	2026-04-02 10:04:37 +01:00
Julian Oppermann	018e048daf	[MLIR][Linalg] Generic to category specialization for unary elementwise ops (#187217 ) Handle specialization of `linalg.generic` ops representing a unary elementwise computation to the `linalg.elementwise` category op. This implements a previously absent path in the linalg morphism.	2026-04-02 10:50:21 +02:00
Elvis Wang	81691d23cd	[RISCV][TTI] Update cost and prevent exceed m8 for vector.extract.last.active (#188160 ) This patch contains two parts. 1. Update the costs to reflect the codegen changes. This is not fully accurate, since the step vector can use a smaller type if there is a vscale_range attribute, but that information is not available in the type-based query in TTI. 2. Return an invalid cost for a vector.extract.last.active that needs a vector split for the step vector. Currently this is not handled correctly and hits an assertion. To avoid blocking the FindLast reduction in LV (https://github.com/llvm/llvm-project/pull/184931), we should land this first and then fix the SelectionDAG lowering for vector.extract.last.active.	2026-04-02 16:49:05 +08:00
Sander de Smalen	703d43ca3b	[CostModel] Move default expand cost for partial reductions to BasicTTIImpl (#189905 ) This is a follow-up of the suggestion left here: https://github.com/llvm/llvm-project/pull/181707#discussion_r2995733831 The override functions in AMDGPU/ARM/SystemZ/X86 are required to avoid enabling partial reductions where they were previously disabled (I've added this for all targets that implement getArithmeticReductionCost).	2026-04-02 09:42:53 +01:00
David Spickett	5f6835daf4	[lldb][AArch64][Linux] Qualify uses of user_sve_header (#190130 ) Fixes #165413. Where a build failure was reported: ``` /b/s/w/ir/x/w/llvm-llvm-project/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp:1182:9: error: unknown type name 'user_sve_header'; did you mean 'sve::user_sve_header'? 1182 \| user_sve_header *header = \| ^~~~~~~~~~~~~~~ \| sve::user_sve_header ``` To fix this, add sve:: as we do for all other uses of this. This is LLDB's copy of a structure that Linux also defines. I think the build worked on some machines because that version ended up being included, but with a more isolated build, it may not. We have our own definition of it so we can be sure what we're using in case Linux extends it later.	2026-04-02 08:29:34 +00:00
wanglei	76fc936175	[Clang][LoongArch] Align LSX/LASX built-in signatures with intrinsic types to avoid lax conversions (#189900 ) Update the built-in signatures in BuiltinsLoongArchLSX.def and BuiltinsLoongArchLASX.def to precisely match the vector types used in the corresponding intrinsic headers (lsxintrin.h and lasxintrin.h). This alignment ensures that these intrinsics can be compiled successfully even when -flax-vector-conversions=none is specified, since the built-in arguments no longer rely on implicit vector type conversions. Added new test cases to verify the macro-defined LSX/LASX intrinsic interfaces under -flax-vector-conversions=none. Fixes #189898	2026-04-02 16:11:22 +08:00
Arseniy Zaostrovnykh	e3cfcf48d0	[clang][analyzer] Forward CTU-import failure conditions Forward all CTU-import failures as diagnostics (remarks, warnings, errors), except for `index_error_code::missing_definition` which has the potential of generating too many diagnostics. -- CPP-7804	2026-04-02 07:59:52 +00:00
Gabriel Baraldi	5e0a06b34d	Move ExpandMemCmp and MergeIcmp to the middle end (#77370 ) Moving these into the middle-end pipeline will allow for additional optimization of the expansion result, such as CSE of redundant loads (c.f. https://godbolt.org/z/bEna4Md9r). For now, we conservatively place the passes at the end of the middle-end pipeline, so we mostly don't benefit from additional optimizations yet. The pipeline position will be moved in a future change. This builds on work done by legrosbuffle in https://reviews.llvm.org/D60318. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 09:57:00 +02:00
Zorojuro	a599a06e7c	[libc] Indentation consistency in CMake (#190120 ) This PR just fixes the indentation/style for the whole CMake file for consistency. No other changes. c698f55b0245ffbaae55c7f854fadba33df16e9d	2026-04-02 08:51:52 +01:00
Weibo He	7ccd1cb9a4	Reland "[CoroSplit] Erase trivially dead allocas after spilling (#189295 )" (#190124 ) The original PR contained a use-after-delete issue, which has been resolved in #189521. Reland #189295, which is reverted in #189311	2026-04-02 07:45:13 +00:00
Nikita Popov	1662c200a5	[Passes][LoopRotate] Move minsize handling fully into pass (#189956 ) Make this dependent only on the minsize attribute and drop the pipeline handling. Rename the enable-loop-header-duplication option to enable-loop-header-duplication-at-minsize to clarify that it controls header duplication at minsize only (in other cases it is enabled by default, independently of this option).	2026-04-02 09:32:56 +02:00
Nikita Popov	40e7fa632d	[Passes][FuncSpec] Move optsize/minsize handling into pass (#189952 ) Instead of using the Os/Oz level during pass pipeline construction, query the optsize/minsize attribute on the function to determine whether specialization is allowed to take place. This ensures consistent behavior for per-function attributes. It's worth noting that FuncSpec already checks for minsize, but at the call-site level.	2026-04-02 09:32:39 +02:00
Hans Wennborg	3b81be803f	WholeProgramDevirt: Import/export the CVP byte directly in the summary (#188979 ) rather than using absolute symbol constants on ELF/x86. This leads to better codegen as the absolute symbol constants were not resolved until link time (see bug for example). Fixes #188470	2026-04-02 09:28:32 +02:00
David Rivera	e3cbd9984a	[CIR][AMDGPU] Lower Language specific address spaces and implement AMDGPU target (#179084 )	2026-04-02 03:00:14 -04:00
Fangrui Song	6f9646a598	[ELF] Parallelize --gc-sections mark phase (#189321 ) Add `markParallel` using level-synchronized `parallelFor`. Each BFS level is processed in parallel; newly discovered sections are collected in per-thread queues and merged for the next level. The parallel path is used when `!TrackWhyLive && partitions.size()==1`. `parallelFor` naturally degrades to serial when `--threads=1`. Uses depth-limited inline recursion (depth<3) and optimistic load-then-exchange dedup for best performance. Linking a Release+Asserts clang (--gc-sections, --time-trace) on an old x86-64: 8 threads: markLive 315ms -> 82ms (-234ms). Total 1562ms -> 1350ms (1.16x). 16 threads: markLive 199ms -> 50ms (-149ms). Total 1017ms -> 862ms (1.18x). and on Apple M4: markLive 61ms -> 13ms. Total 317.3ms -> 272.7ms (1.16x).	2026-04-02 06:42:00 +00:00
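The level-synchronized marking described in the entry above can be sketched outside LLD. This is a minimal illustration under stated assumptions, not the `markParallel` implementation: it uses `std::thread` instead of LLD's `parallelFor`, represents sections as indices into an adjacency list, and omits the depth-limited inline recursion. `Graph`, `tryMark`, and `markLevelSynchronized` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

using Graph = std::vector<std::vector<std::size_t>>;

// Optimistic load-then-exchange dedup: a cheap load filters out nodes that
// are already marked, and the exchange decides which thread claims the rest.
inline bool tryMark(std::vector<std::atomic<bool>> &Live, std::size_t N) {
  if (Live[N].load(std::memory_order_relaxed))
    return false;
  return !Live[N].exchange(true, std::memory_order_relaxed);
}

// Marks everything reachable from Roots, one BFS level at a time.
inline std::vector<bool> markLevelSynchronized(const Graph &G,
                                               const std::vector<std::size_t> &Roots,
                                               unsigned NumThreads) {
  if (NumThreads == 0)
    NumThreads = 1;
  std::vector<std::atomic<bool>> Live(G.size());
  for (auto &L : Live)
    L.store(false, std::memory_order_relaxed);

  std::vector<std::size_t> Frontier;
  for (std::size_t R : Roots)
    if (tryMark(Live, R))
      Frontier.push_back(R);

  while (!Frontier.empty()) {
    // One queue per thread, so discoveries need no locking.
    std::vector<std::vector<std::size_t>> Next(NumThreads);
    auto Work = [&](unsigned T) {
      for (std::size_t I = T; I < Frontier.size(); I += NumThreads)
        for (std::size_t Succ : G[Frontier[I]])
          if (tryMark(Live, Succ))
            Next[T].push_back(Succ);
    };
    if (NumThreads == 1) {
      Work(0); // degrade to a serial pass
    } else {
      std::vector<std::thread> Pool;
      for (unsigned T = 0; T < NumThreads; ++T)
        Pool.emplace_back(Work, T);
      for (std::thread &Th : Pool)
        Th.join(); // barrier: the level is complete before merging
    }
    // Merge the per-thread queues into the next level's frontier.
    Frontier.clear();
    for (const auto &Q : Next)
      Frontier.insert(Frontier.end(), Q.begin(), Q.end());
  }

  std::vector<bool> Result(G.size());
  for (std::size_t I = 0; I < G.size(); ++I)
    Result[I] = Live[I].load(std::memory_order_relaxed);
  return Result;
}
```

Spawning threads per level keeps the sketch short; a real implementation would reuse a thread pool, since thread creation would otherwise dominate on shallow levels.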


575313 Commits