llvm-project

Author	SHA1	Message	Date
Michael Kruse	afb80bddf1	[Runtimes] Introduce variables containing resource dir paths (#177953 ) Introduce common infrastructure for runtimes that determines compiler resource path locations. The variables introduced are: * RUNTIMES_OUTPUT_RESOURCE_DIR * RUNTIMES_INSTALL_RESOURCE_PATH which contain the location of the compiler resource path (typically `lib/clang/<version>`) in the build tree and the install tree (the latter relative to CMAKE_INSTALL_PREFIX). Additionally, define * RUNTIMES_OUTPUT_RESOURCE_LIB_DIR * RUNTIMES_INSTALL_RESOURCE_LIB_PATH for the location of clang/flang version-locked libraries (typically `lib${LLVM_LIBDIR_SUFFIX}/<target-triple>`, but also depends on `APPLE` and `LLVM_ENABLE_PER_TARGET_RUNTIME_DIR`). This code is moved from flang-rt and initially becomes its only user. Refactored out of #171610 as requested [here](https://github.com/llvm/llvm-project/pull/171610#discussion_r2687382481). Extracted `get_runtimes_target_libdir_common` from compiler-rt as requested [here](https://github.com/llvm/llvm-project/pull/171610#discussion_r2689565634). Added TODO comments to all runtimes as requested [here](https://github.com/llvm/llvm-project/pull/171610#issuecomment-3789598635).	2026-04-02 10:32:14 +00:00
Henrich Lauko	57ee29a2a1	[CIR] Implement isMemcpyEquivalentSpecialMember for trivial copy/move ctors (#186700 ) Implements isMemcpyEquivalentSpecialMember in CIR codegen so that trivial copy/move constructors and defaulted union copy/move ops emit a cir.copy directly instead of making a real constructor call. The logic is shared with OG codegen by moving the implementation into ASTContext, where it also gains the pointer field protection (PFP) check that was previously missing in CIR.	2026-04-02 12:31:53 +02:00
Nerixyz	91b90652bb	Reland "[CodeView] Generate `S_DEFRANGE_REGISTER_REL_INDIR`" (#189401 ) Initially added in #187709. It was reverted in #188833, because [llvm-clang-x86_64-sie-win](https://lab.llvm.org/buildbot/#/builders/46/builds/32873) was failing in `cross-project-tests/debuginfo-tests/dexter-tests/nrvo.cpp`. The test passed for me locally. After checking on another machine, I found that `S_DEFRANGE_REGISTER_REL_INDIR` is only supported by dbgeng/WinDbg from Windows 10.0 Build 19041 (released 2020) onwards. SDKs before this will fail to read the value. That buildbot is on Windows 10.0 Build 17763. I'm not sure if we should make the generation of that record conditional. Debuggers that can't read the record will skip it. They'll still see that there's some local variable, but won't be able to display the value. As far as I know, users of older Windows 10 builds should be able to install a newer Windows SDK and use the WinDbg from that version. But I haven't tested that.	2026-04-02 12:15:11 +02:00
David Spickett	c329cc59d9	[lldb][test][NFC] Move register command tests (#190144 ) For whatever reason we ended up with register/register but the first register just had the second register folder in it. Move the files up one level so we have register/<test files>.	2026-04-02 11:13:44 +01:00
Ricardo Jesus	9ff2ef9711	[AArch64][SVE] Define pseudos for arithmetic immediate instructions. (#188579 ) This patch uses DestructiveBinaryShImmUnpred (which was previously unused as far as I could tell) to define pseudos for arithmetic immediate instructions such as ADD (immediate), which allows using MOVPRFX with these instructions.	2026-04-02 11:07:46 +01:00
Jiachen Yuan	d0bf354828	[ADT] Reinstate "Refactor Bitset to Be More Constexpr-Usable" (#189497 ) Reland of #172062 (a71b1d2), which was reverted in b0234d1. This patch makes essential Bitset member functions constexpr (`set()`, `any()`, `none()`, `count()`, `operator==`, `!=`, `<`, `~`) and adds a new `all()` method. It also introduces a `maskLastWord()` invariant to ensure unused high bits in the last word are always zero, which is required for correctness of `operator~`, `set()`, `all()`, and comparisons on non-word-aligned sizes (e.g., `Bitset<33>`). Changes from the original reverted PR: - Replaced `llvm::any_of` with an inline loop to avoid depending on constexpr `any_of`/`none_of` from `STLExtras` (#172536), which was also reverted due to a GCC 15.2.1 bootstrap miscompile. - The patch is now fully self-contained with no prerequisite changes. Motivation: This is a prerequisite for making `LaneBitmask` a wrapper around `Bitset`, enabling scalable lane bitmasks beyond 64 bits (https://discourse.llvm.org/t/rfc-out-of-lanebitmask-bits-again/88613).	2026-04-02 11:50:10 +02:00
Simi Pallipurath	dc9be4ee30	[LLD][ELF] Skip non-inputsections to avoid invalid cast in Arm BE8 handling (#188154 ) This patch fixes https://github.com/llvm/llvm-project/issues/187033 In BE8 mode, instruction bytes are reversed for sections containing code. This logic currently assumes that arm mapping symbols (e.g. $a, $t, $d) are always associated with InputSections. However, mapping symbols can also be defined in other section types such as mergeable sections (SHF_MERGE). These are not represented as InputSection, and attempting to cast them using cast_if_present<InputSection> results in an assertion failure.	2026-04-02 10:16:54 +01:00
Alexandros Lamprineas	4c9a739c5e	[BOLT][AArch64] Strip unneeded labels from FEAT_CMPBR tests. (#189931 ) Eliminates the temporary labels so that BOLT does not recognize them as secondary entry points.	2026-04-02 10:16:41 +01:00
Ramkumar Ramachandra	d835dd2b43	[LV] Strip createStepForVF (NFC) (#185668 ) The mul -> shl simplification is already done in VPlan.	2026-04-02 10:04:37 +01:00
Julian Oppermann	018e048daf	[MLIR][Linalg] Generic to category specialization for unary elementwise ops (#187217 ) Handle specialization of `linalg.generic` ops representing a unary elementwise computation to the `linalg.elementwise` category op. This implements a previously absent path in the linalg morphism.	2026-04-02 10:50:21 +02:00
Elvis Wang	81691d23cd	[RISCV][TTI] Update cost and prevent exceeding m8 for vector.extract.last.active (#188160 ) This patch contains two parts. 1. Update costs to reflect the codegen changes. This is not that accurate since the step vector can use a smaller type if there is a vscale_range attribute, but we cannot get that in the type-based query in TTI. 2. Return an invalid cost for a vector.extract.last.active that needs vector splitting for the step vector, since this is currently not handled correctly and will hit an assertion. To avoid blocking the FindLast reduction in LV (https://github.com/llvm/llvm-project/pull/184931), we should land this first and fix the SelectionDAG lowering for vector.extract.last.active.	2026-04-02 16:49:05 +08:00
Sander de Smalen	703d43ca3b	[CostModel] Move default expand cost for partial reductions to BasicTTIImpl (#189905 ) This is a follow-up of the suggestion left here: https://github.com/llvm/llvm-project/pull/181707#discussion_r2995733831 The override functions in AMDGPU/ARM/SystemZ/X86 are required to avoid enabling partial reductions where they were previously disabled (I've added this for all targets that implement getArithmeticReductionCost).	2026-04-02 09:42:53 +01:00
David Spickett	5f6835daf4	[lldb][AArch64][Linux] Qualify uses of user_sve_header (#190130 ) Fixes #165413. Where a build failure was reported: ``` /b/s/w/ir/x/w/llvm-llvm-project/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp:1182:9: error: unknown type name 'user_sve_header'; did you mean 'sve::user_sve_header'? 1182 \| user_sve_header *header = \| ^~~~~~~~~~~~~~~ \| sve::user_sve_header ``` To fix this, add sve:: as we do for all other uses of this. This is LLDB's copy of a structure that Linux also defines. I think the build worked on some machines because that version ended up being included, but with a more isolated build, it may not. We have our own definition of it so we can be sure what we're using in case Linux extends it later.	2026-04-02 08:29:34 +00:00
wanglei	76fc936175	[Clang][LoongArch] Align LSX/LASX built-in signatures with intrinsic types to avoid lax conversions (#189900 ) Update the built-in signatures in BuiltinsLoongArchLSX.def and BuiltinsLoongArchLASX.def to precisely match the vector types used in the corresponding intrinsic headers (lsxintrin.h and lasxintrin.h). This alignment ensures that these intrinsics can be compiled successfully even when -flax-vector-conversions=none is specified, since the built-in arguments no longer rely on implicit vector type conversions. Added new test cases to verify the macro-defined LSX/LASX intrinsic interfaces under -flax-vector-conversions=none. Fixes #189898	2026-04-02 16:11:22 +08:00
Arseniy Zaostrovnykh	e3cfcf48d0	[clang][analyzer] Forward CTU-import failure conditions Forward all CTU-import failures as diagnostics (remarks, warnings, errors), except for `index_error_code::missing_definition` which has the potential of generating too many diagnostics. -- CPP-7804	2026-04-02 07:59:52 +00:00
Gabriel Baraldi	5e0a06b34d	Move ExpandMemCmp and MergeIcmp to the middle end (#77370 ) Moving these into the middle-end pipeline will allow for additional optimization of the expansion result, such as CSE of redundant loads (c.f. https://godbolt.org/z/bEna4Md9r). For now, we conservatively place the passes at the end of the middle-end pipeline, so we mostly don't benefit from additional optimizations yet. The pipeline position will be moved in a future change. This builds on work done by legrosbuffle in https://reviews.llvm.org/D60318. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 09:57:00 +02:00
Zorojuro	a599a06e7c	[libc] Indentation consistency in CMake (#190120 ) This PR just fixes the indentation/style for the whole CMake file for consistency. No other changes. c698f55b0245ffbaae55c7f854fadba33df16e9d	2026-04-02 08:51:52 +01:00
Weibo He	7ccd1cb9a4	Reland "[CoroSplit] Erase trivially dead allocas after spilling (#189295 )" (#190124 ) The original PR contained a use-after-delete issue, which has been resolved in #189521. This relands #189295, which was reverted in #189311.	2026-04-02 07:45:13 +00:00
Nikita Popov	1662c200a5	[Passes][LoopRotate] Move minsize handling fully into pass (#189956 ) Make this dependent only on the minsize attribute and drop the pipeline handling. Rename the enable-loop-header-duplication option to enable-loop-header-duplication-at-minsize to clarify that it controls header duplication at minsize only (in other cases it is enabled by default, independently of this option).	2026-04-02 09:32:56 +02:00
Nikita Popov	40e7fa632d	[Passes][FuncSpec] Move optsize/minsize handling into pass (#189952 ) Instead of using the Os/Oz level during pass pipeline construction, query the optsize/minsize attribute on the function to determine whether specialization is allowed to take place. This ensures consistent behavior for per-function attributes. It's worth noting that FuncSpec already checks for minsize, but at the call-site level.	2026-04-02 09:32:39 +02:00
Hans Wennborg	3b81be803f	WholeProgramDevirt: Import/export the CVP byte directly in the summary (#188979 ) rather than using absolute symbol constants on ELF/x86. This leads to better codegen as the absolute symbol constants were not resolved until link time (see bug for example). Fixes #188470	2026-04-02 09:28:32 +02:00
David Rivera	e3cbd9984a	[CIR][AMDGPU] Lower Language specific address spaces and implement AMDGPU target (#179084 )	2026-04-02 03:00:14 -04:00
Fangrui Song	6f9646a598	[ELF] Parallelize --gc-sections mark phase (#189321 ) Add `markParallel` using level-synchronized `parallelFor`. Each BFS level is processed in parallel; newly discovered sections are collected in per-thread queues and merged for the next level. The parallel path is used when `!TrackWhyLive && partitions.size()==1`. `parallelFor` naturally degrades to serial when `--threads=1`. Uses depth-limited inline recursion (depth<3) and optimistic load-then-exchange dedup for best performance. Linking a Release+Asserts clang (--gc-sections, --time-trace) on an old x86-64: 8 threads: markLive 315ms -> 82ms (-234ms). Total 1562ms -> 1350ms (1.16x). 16 threads: markLive 199ms -> 50ms (-149ms). Total 1017ms -> 862ms (1.18x). and on Apple M4: markLive 61ms -> 13ms. Total 317.3ms -> 272.7ms (1.16x).	2026-04-02 06:42:00 +00:00
David Green	083f9c158a	[AArch64][GISel] Widen non-power2 element sizes for ctlz. (#189371 ) This addresses an illegal mutation kind, where gisel would hit an assert. It expands vector elements for non-power2 elements or elements less than i8 to a power of 2. A fix to handle vector types correctly was needed in LegalizerHandler. Fixes #185411	2026-04-02 07:27:12 +01:00
Fangrui Song	6a87416162	[ELF] Move Symbol::used to atomic flags field (#190117 ) Move the `used` bitfield into the existing `std::atomic<uint16_t> flags`, making it safe for concurrent access from parallel GC mark (#189321).	2026-04-01 23:21:13 -07:00
Paul Kirth	802d4631e0	[clang-doc] Update lookup routines for consistency (#190043 ) When filtering is enabled, it's possible an Info doesn't have a Parent USR. Use `find()` to safely handle that case. Additionally, I noticed the comparison code for the index poorly reimplemented the existing comparison from StringRef. We can just use the one from ADT.	2026-04-01 23:17:42 -07:00
Craig Topper	68cbcf7ec2	[RISCV] Check EnsureWholeVectorRegisterMoveValidVTYPE in RISCVInsertVSETVLI::transferBefore. (#190022 ) Fixes #189786	2026-04-01 23:14:38 -07:00
Fangrui Song	2118499a89	[ELF] Decouple SharedFile::isNeeded from GC mark. NFC (#190112 ) ... out of the per-relocation resolveReloc and into a post-GC scan of global symbols. This decouples the --as-needed logic from the mark algorithm, simplifying the imminent parallel GC mark.	2026-04-01 22:42:51 -07:00
Luke Lau	2a7ca3a3fa	[RISCV] Remove codegen for vp_ctlz, vp_cttz, vp_ctpop (#189904 ) Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This splits off 3 intrinsics from #179622. Note that vp.cttz is the elementwise version, not vp.cttz.elts.	2026-04-02 05:26:41 +00:00
Fangrui Song	0bde74ab04	[ELF] Pass SectionPiece by reference in getSectionPiece. NFC (#190110 ) The generated assembly looks more optimized. In addition, this avoids widened load, which would cause a TSan-detected data race with parallel --gc-sections (#189321).	2026-04-01 22:07:42 -07:00
Lang Hames	3346a76d32	[JITLink] Remove unnecessary SymbolStringPtr copy. (#190101 ) This was probably intended to be a `const SymbolStringPtr&` originally, but if we were going to copy it anyway it's better to just take the argument by value and std::move it.	2026-04-02 15:53:42 +11:00
zGoldthorpe	9a354fc5a1	[SelectionDAG] Use `KnownBits` to determine if an operand may be NaN. (#188606 ) Given a bitcast into a fp type, use the known bits of the operand to infer whether the resulting value can never be NaN.	2026-04-01 22:47:01 -06:00
Chaitanya	dbc206f35d	[CIR][CIRGen] Support for section attribute on cir.global (#188200 ) Upstreaming clangIR PR: https://github.com/llvm/clangir/pull/422 This PR implements support for `__attribute__((section("name")))` on global variables in ClangIR, matching OGCG behavior.	2026-04-02 09:58:17 +05:30
Diego Novillo	06aae40c6d	[HLSL][SPIRV] Restore support for -g to generate NSDI (#190007 ) The original attempt (#187051) produced a regression for `intel-sycl-gpu` because `SPIRVEmitNonSemanticDI` will now self-activate whenever `llvm.dbg.cu` is present. This removed the need for the explicit `--spv-emit-nonsemantic-debug-info` flag. The pass is now entered unconditionally for all SPIR-V targets, but `NonSemantic.Shader.DebugInfo.100` requires the `SPV_KHR_non_semantic_info`. Targets like `spirv64-intel` do not enable that extension by default. When `checkSatisfiable()` ran on those targets, it issued a fatal error rather than silently skipping. Adds an early-out from `emitGlobalDI()`: if `SPV_KHR_non_semantic_info` is not available for the current target, the pass returns without emitting anything.	2026-04-01 21:00:36 -07:00
Sudharsan Veeravalli	18a065763d	[RISCV] Move unpaired instruction back in RISCVLoadStoreOptimizer (#189912 ) There are cases when the `Xqcilsm` vendor extension is enabled that we are unable to pair non-adjacent load/store instructions. The `RISCVLoadStoreOptimizer` moves the instruction adjacent to the other before attempting to pair them but does not move them back when it fails. This can sometimes prevent the generation of the `Xqcilsm` load/store multiple instructions. This patch ensures that we move the unpaired instruction back to its original location.	2026-04-02 09:18:58 +05:30
wangjue	8c2feea2f7	[BOLT] Delete unnecessary instructions (#189297 )	2026-04-02 06:48:38 +03:00
yebinchon	495e1a4257	[mlir] added a check in the walk to prevent catching a cos in a nested region (#190064 ) The walk in SincosFusion may detect a cos within a nested region of the sin block. This triggers an assertion in `isBeforeInBlock` later on. Added a check within the walk so it filters operations in nested regions, which are not in the same block and should not be fused anyway. --------- Co-authored-by: Yebin Chon <ychon@nvidia.com>	2026-04-01 20:10:56 -07:00
lntue	d52daeac79	[libc] Fix the remaining long double issue in shared_math_test.cpp. (#190098 )	2026-04-01 22:47:29 -04:00
Simon Pilgrim	c8c7186b46	[X86] LowerRotate - expand vXi8 non-uniform variable rotates using uniform constant rotates (#189986 ) We expand vXi8 non-uniform variable rotates as a sequence of uniform constant rotates along with a SELECT depending on whether the original rotate amount needs it. This patch removes the premature expansion of uniform constant rotates to OR(SHL,SRL) sequences, allowing GFNI targets to use single VGF2P8AFFINEQB calls.	2026-04-02 02:30:59 +00:00
Fangrui Song	8daaa26efd	[Support] Support nested parallel TaskGroup via work-stealing (#189293 ) Nested TaskGroups run serially to prevent deadlock, as documented by https://reviews.llvm.org/D61115 and refined by https://reviews.llvm.org/D148984 to use threadIndex. Enable nested parallelism by having worker threads actively execute tasks from the work queue while waiting (work-stealing), instead of just blocking. Root-level TaskGroups (main thread) keep the efficient blocking Latch::sync(), so there is no overhead for the common non-nested case. In lld, https://reviews.llvm.org/D131247 worked around the limitation by passing a single root TaskGroup into OutputSection::writeTo and spawning 4MB-chunked tasks into it. However, SyntheticSection::writeTo calls with internal parallelism (e.g. GdbIndexSection, MergeNoTailSection) still ran serially on worker threads. With this change, their internal parallelFor/parallelForEach calls parallelize automatically via helpSync work-stealing. The increased parallelism can reorder error messages from parallel phases (e.g. relocation processing during section writes), so one lld test is updated to use --threads=1 for deterministic output.	2026-04-01 19:20:16 -07:00
Anshul Nigham	dee982d6c8	[NewPM] Adds a port for AArch64PostCoalescerPass (#189520 ) Adds a standard porting for AArch64PostCoalescer to NewPM.	2026-04-01 19:18:18 -07:00
Anshul Nigham	e27e7e4339	[NFC][AArch64] Remove PreLegalizerCombiner pass dependency on TargetPassConfig (#190073 ) This will enable NewPM porting. Replaced with the definition in [AArch64PassConfig::getCSEConfig](`1d549d9a77/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp (L614)`)	2026-04-01 19:09:37 -07:00
Chuanqi Xu	c97e08e331	[C++20] [Modules] Add VisiblePromoted module ownership kind (#189903 ) This patch adds a new ModuleOwnershipKind::VisiblePromoted to handle declarations that are not visible to the current TU but are promoted to be visible to avoid re-parsing. Originally we set the visibility to visible directly in such cases. But https://github.com/llvm/llvm-project/issues/188853 shows such decls may be excluded later if we import #include and then import. So we have to introduce a new visibility to express that the visibility of the decl is intentionally promoted. Closes https://github.com/llvm/llvm-project/issues/188853	2026-04-02 10:01:32 +08:00
lntue	096f9d0aa8	[libc] Initial support so that libc-shared-tests can be built with ppc64le (#188882 )	2026-04-01 20:55:44 -04:00
Zhaoxuan Jiang	fd609e5d33	[lld] Glob-based BP compression sort groups (#185661 ) Add --bp-compression-sort-section=<glob>[=<layout_priority>[=<match_priority>]] to let users split input sections into multiple compression groups, run balanced partitioning independently per group, and leave out sections that are poor candidates for BP. This replaces the old coarse --bp-compression-sort with a more explicit, user-controlled one. In ELF, the glob matches input section names (.text.unlikely.cold1). In Mach-O, it matches the concatenated segment+section name (__TEXT__text). layout_priority controls group placement in the final layout. match_priority resolves conflicts when multiple globs match the same section: explicit priority beats positional matching, and among positional specs the last match wins. A CRTP hook getCompressionSubgroupKey() allows backends to further subdivide glob groups into independent BP instances. This allows Mach-O backend to separate cold functions via N_COLD_FUNC in the future. The deprecated --bp-compression-sort option keeps its existing function/data behavior by assigning sections to fixed legacy groups.	2026-04-01 17:53:08 -07:00
Jim Lin	3d7eedce56	[RISCV] Fix stackmap shadow trimming NOP size for compressed targets (#189774 ) The shadow trimming loop in LowerSTACKMAP hardcoded a 4-byte decrement per instruction, but when Zca is enabled NOPs are 2 bytes. Use NOPBytes instead of the hardcoded 4 so the shadow is correctly trimmed on compressed targets. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-02 08:21:33 +08:00
Jim Lin	b9e01c26f0	[RISCV] Relax VL constraint in convertSameMaskVMergeToVMv (#189797 ) When converting a PseudoVMERGE_VVM to PseudoVMV_V_V, we previously required MIVL <= TrueVL to avoid losing False elements in the tail. Relax this constraint when the vmerge's False operand equals its Passthru operand and the True instruction's tail policy is TU (tail undisturbed). In this case, True's tail lanes preserve its passthru value (which equals False and Passthru), so the conversion is safe even when MIVL > TrueVL. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 08:12:48 +08:00
Christopher Ferris	7c260d3966	[scudo] Fix reallocate for MTE. (#190086 ) For MTE, we can't use the whole size or we might trigger a segfault. Therefore, use the exact size when MTE is enabled or the exact usable size parameter is true. Also, optimize out the call to getUsableSize and use a simpler calculation.	2026-04-01 16:44:31 -07:00
Demetrius Kanios	29391328ab	[WebAssembly][GlobalISel] CallLowering `lowerFormalArguments` (#180263 ) Implements `WebAssemblyCallLowering::lowerFormalArguments` Split from #157161	2026-04-01 16:12:38 -07:00
Zorojuro	52fb23eef8	[libc][math] Remove static from log1pf implementation (#190042 ) Reflecting changes according to `823e3e0017`	2026-04-01 19:01:44 -04:00
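
The `maskLastWord()` invariant described in the Bitset reland (d0bf354828) can be illustrated with a minimal sketch. `TinyBitset` below is a hypothetical stand-in written for this note, not `llvm::Bitset`; the word size, member names, and layout are assumptions. It only shows why clearing the unused high bits of the last word keeps `operator~`, `set()`, `all()`, and `==` correct for sizes such as 33 bits.

```cpp
#include <array>
#include <cassert> // assert() for optional runtime checks
#include <cstddef>
#include <cstdint>

// Hypothetical illustration type; NOT the real llvm::Bitset.
template <std::size_t NumBits> class TinyBitset {
  static_assert(NumBits > 0, "empty bitsets are not handled in this sketch");
  using Word = std::uint64_t;
  static constexpr std::size_t WordBits = 64;
  static constexpr std::size_t NumWords = (NumBits + WordBits - 1) / WordBits;
  static constexpr std::size_t RemBits = NumBits % WordBits;
  std::array<Word, NumWords> Words{};

  // Invariant: bits at positions >= NumBits in the last word are always zero.
  constexpr void maskLastWord() {
    if constexpr (RemBits != 0)
      Words[NumWords - 1] &= (Word(1) << RemBits) - 1;
  }

public:
  constexpr TinyBitset() = default;

  // Set every bit, then restore the invariant.
  constexpr TinyBitset &set() {
    for (std::size_t I = 0; I != NumWords; ++I)
      Words[I] = ~Word(0);
    maskLastWord();
    return *this;
  }

  // Set a single bit; the caller must pass I < NumBits.
  constexpr TinyBitset &set(std::size_t I) {
    Words[I / WordBits] |= Word(1) << (I % WordBits);
    return *this;
  }

  constexpr bool any() const {
    for (std::size_t I = 0; I != NumWords; ++I)
      if (Words[I] != 0)
        return true;
    return false;
  }
  constexpr bool none() const { return !any(); }

  // Population count; relies on the invariant to ignore unused bits.
  constexpr std::size_t count() const {
    std::size_t N = 0;
    for (std::size_t I = 0; I != NumWords; ++I)
      for (Word W = Words[I]; W != 0; W &= W - 1)
        ++N;
    return N;
  }
  constexpr bool all() const { return count() == NumBits; }

  // Flipping every word would set the unused bits, so mask afterwards.
  constexpr TinyBitset operator~() const {
    TinyBitset R;
    for (std::size_t I = 0; I != NumWords; ++I)
      R.Words[I] = ~Words[I];
    R.maskLastWord();
    return R;
  }

  constexpr bool operator==(const TinyBitset &RHS) const {
    for (std::size_t I = 0; I != NumWords; ++I)
      if (Words[I] != RHS.Words[I])
        return false;
    return true;
  }
};

// Compile-time checks on a non-word-aligned size.
static_assert(TinyBitset<33>().set().count() == 33, "set() must not leak bits");
static_assert((~TinyBitset<33>()).all(), "~empty is full");
static_assert((~TinyBitset<33>().set()).none(), "~full is empty");
static_assert(~TinyBitset<33>().set() == TinyBitset<33>(), "== sees no stray bits");
```

Without the masking step, `~TinyBitset<33>()` would carry 31 stray one-bits in its last word, so `count()` would report 64 and the equality check against an empty set would fail after a second negation.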

575286 Commits
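
For the SelectionDAG change listed above (9a354fc5a1), the idea of inferring "never NaN" from known bits can be sketched for IEEE-754 binary32. This is an illustration under stated assumptions, not the logic of the patch: `KnownBits32` is a hypothetical stand-in for `llvm::KnownBits`, and `isKnownNeverNaNAsFloat` is a name invented here. The sketch relies only on the encoding rule that a NaN has an all-ones exponent and a nonzero mantissa.

```cpp
#include <cassert> // assert() for optional runtime checks
#include <cstdint>

// Hypothetical stand-in for llvm::KnownBits on a 32-bit integer.
// Zero and One are assumed to be disjoint.
struct KnownBits32 {
  std::uint32_t Zero; // bits known to be 0
  std::uint32_t One;  // bits known to be 1
};

constexpr std::uint32_t ExpMask = 0x7F800000u;  // binary32 exponent field
constexpr std::uint32_t MantMask = 0x007FFFFFu; // binary32 mantissa field

// Returns true if no integer consistent with Known is a NaN when its bits
// are reinterpreted as a binary32 float.
constexpr bool isKnownNeverNaNAsFloat(KnownBits32 Known) {
  // A NaN needs an all-ones exponent, so one known-zero exponent bit
  // rules it out.
  if ((Known.Zero & ExpMask) != 0)
    return true;
  // A NaN needs a nonzero mantissa; if every mantissa bit is known zero,
  // an all-ones exponent can only encode an infinity.
  if ((Known.Zero & MantMask) == MantMask)
    return true;
  return false;
}

static_assert(isKnownNeverNaNAsFloat({0x00800000u, 0u}), "exponent bit is 0");
static_assert(isKnownNeverNaNAsFloat({MantMask, 0u}), "mantissa is all 0");
static_assert(!isKnownNeverNaNAsFloat({0u, 0u}), "nothing known");
static_assert(!isKnownNeverNaNAsFloat({0x80000000u, 0u}), "sign bit is irrelevant");
```

A typical case where the first condition holds is an integer that has been masked or shifted so that at least one bit of the exponent field is provably clear before the bitcast.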