llvm-project

Author	SHA1	Message	Date
Arthur Eubanks	82505fbfc8	[Inliner] Put inline history into IR as !inline_history metadata (#190700 ) (Reland of #190092 with verifier change to look through GlobalAliases) So that it's preserved across all inline invocations rather than just one inliner pass run. This prevents cases where devirtualization in the simplification pipeline uncovers inlining opportunities that should be discarded due to inline history, but we dropped the inline history between inliner pass runs, causing code size to blow up, sometimes exponentially. For compile time reasons, we want to limit this to only call sites that have the potential to inline through SCCs, potentially with the help of devirtualization. This means that the callee is in a non-trivial (Ref)SCC, or the call site was previously an indirect call, which can potentially be devirtualized to call any function. The CGSCCUpdater::InlinedInternalEdges logic still seems to be relevant even with this change, as monster_scc.ll blows up if I remove that code. http://llvm-compile-time-tracker.com/compare.php?from=e830d88e8ae5f44a97cc76136a0a4e83aa9157c0&to=ed535e732fc41b79ab8efda2417886cbd0812f7f&stat=instructions:u Fixes #186926.	2026-04-06 17:31:43 -07:00
Lucas Ramirez	94875aea7e	[CodeGen] Fix multiple connected component issue in rematerializer (#186674 ) This fixes a rematerializer issue wherein re-creating the interval of a non-rematerializable super-register defined over multiple MIs, some of which define entirely dead sub-registers, could cause a crash when changing the order of sub-definitions (for example during scheduling) because the re-created interval could end up with multiple connected components, which is illegal. The solution is to split separate components of the interval in such cases. The added unit test crashes without that added behavior.	2026-04-06 23:26:16 +00:00
Andrew	ee51de9836	[llvm-cov] add ability to show non executed test vectors for mc/dc coverage (#187517 ) - Added `-show-mcdc-non-executed-vectors` option - Non-executed test vectors are now tracked - When the option is present, they are written to the UI	2026-04-06 15:59:14 -05:00
Alexis Engelke	89665812f5	[Analysis][NFC] Use block numbers in BlockFrequencyInfo (#190669 ) Block pointers are only stored while constructing the analysis, so the value handle to catch erased blocks is no longer needed when using stable block numbers.	2026-04-06 20:47:34 +00:00
Congzhe	fbe6d79465	[LoopFusion] Fix out-of-date LoopInfo being used during fusion (#189452 ) This is a fix for [187902](https://github.com/llvm/llvm-project/issues/187902), where `LoopInfo` is not in a valid state at the beginning of `ScalarEvolution::createSCEVIter`. The bug occurs because `mergeLatch()` is called at a point where the control flow and dominator trees have been updated but `LoopInfo` has not yet completed its update. `mergeLatch()` calls into `ScalarEvolution`, which uses `LoopInfo`, and an out-of-date `LoopInfo` would result in a crash or unpredictable results. This patch moves `mergeLatch()` to the place where `LoopInfo` has completed its update and hence is in a valid state.	2026-04-06 16:35:28 -04:00
Steven Wu	1a0ca1019d	[CAS] Harden validate() against on-disk corruption (#190634 ) Fixes found by fuzzer: OnDiskTrieRawHashMap: - Bounds-check data slot offsets in TrieVerifier::visitSlot() before calling getRecord(), preventing asData() assertion on out-of-bounds trie entries. - Validate subtrie headers (NumBits, bounds) before constructing SubtrieHandle, preventing SEGV in getSlots() from corrupt NumBits. - Validate arena bump pointer alignment, catching misaligned BumpPtr that would crash store() with an alignment assertion. - Fix comma operator bug in getOrCreateRoot() where the compare_exchange_strong result was discarded, causing asSubtrie() assertion when RootTrieOffset was corrupted to zero. OnDiskGraphDB: - Reject invalid (zero) ref offsets in validate callback, preventing asData() assertion when corrupt data pool refs are resolved via recoverFromFileOffset(). - Validate DataRecordHandle layout flags before calling getTotalSize(), preventing llvm_unreachable on corrupt NumRefsFlags/DataSizeFlags. - Validate data pool bump pointer alignment, catching misaligned BumpPtr that would crash store() in DataRecordHandle::constructImpl(). - Check data record refs offset alignment before calling getRefs(), preventing PointerUnion assertion from misaligned refs pointer. MappedFileRegionArena: - Convert assertions in initializeHeader() to errors so corrupted arena headers return an error on CAS open instead of crashing. Assisted-By: Claude	2026-04-06 13:33:22 -07:00
Arthur Eubanks	70d3dcaa64	Revert "[Inliner] Put inline history into IR as !inline_history metadata" (#190666 ) Reverts llvm/llvm-project#190092 Crashes reported in https://github.com/llvm/llvm-project/pull/190092#issuecomment-4194546908	2026-04-06 20:31:54 +00:00
Lucas Ramirez	5e1162eebc	[CodeGen] Move rollback capabilities outside of the rematerializer (#184341 ) The rematerializer implements support for rolling back rematerializations by modifying MIs that should normally be deleted in an attempt to make them "transparent" to other analyses. This involves: 1. setting their opcode to DBG_VALUE and 2. setting their read register operands to the sentinel register. This approach has several drawbacks. 1. It forces the rematerializer to support tracking these "dead MIs" (even if support is optional, these data-structures have to exist). 2. It is not actually clear whether this mechanism will interact well with all other analyses. This is an issue since the intent of the rematerializer is to be usable in as many contexts as possible. 3. In practice, it has shown itself to be relatively error-prone. This commit removes rollback support from the rematerializer and moves those capabilities to a rematerializer listener that can be instantiated on-demand and implements the same functionality on top of standard rematerializer operations. The rematerializer now actually deletes MIs that are no longer useful after rematerializations, and has support for re-creating them on-demand without requiring additional tracking on its part.	2026-04-06 19:23:19 +00:00
Daniel Thornburgh	fecf609998	Reland "[LTO][LLD] Prevent invalid LTO libfunc transforms (#164916 )" (#190642 ) This reverts commit 1ec7e86b3a779df2a0af3f37e58c8f5b3a398d7f after issue #190072 was fixed.	2026-04-06 19:20:45 +00:00
Alexis Engelke	a105f27f61	[Scheduler][NFC] Don't use set to track visited nodes (#190480 ) The visited set can grow rather large and we can use an unused field in SDNode to store the same information without the use of a hash set. This improves compile times: stage2-O3 -0.14%.	2026-04-06 18:37:26 +00:00
vporpo	8d442bc5b5	[SandboxVec][LoadStoreVec] Add support for constants (#189769 ) Up until now the pass would only vectorize load-store pairs. This patch implements vectorization of constant-store pairs.	2026-04-06 11:25:20 -07:00
Tomer Shafir	37801e9e99	[MCA] Enhance debug prints of processor resources (#190132 ) Previously, `computeProcResourceMasks()` would print resource masks in debug mode from multiple call sites, creating noise in the debug output. This patch aims to fix this and also print more info about the resources. It splits the output into 2 types of debug prints for resources: 1. No simulation - mask only 2. Simulation - mask + other info For 2, it shares printing in a single place, the `ResourceManager` constructor, which should cover all the other simulation cases indirectly: 1. `llvm/lib/MCA/HardwareUnits/ResourceManager` - covered 2. `llvm/lib/MCA/InstrBuilder.c` - should be covered indirectly - only used by `llvm-mca` before simulation that constructs a `ResourceManager` 3. `llvm/tools/llvm-mca/Views/SummaryView.cpp` - after simulation that constructs a `ResourceManager` 4. `llvm/tools/llvm-mca/Views/BottleneckAnalysis.cpp` - after simulation that constructs a `ResourceManager` It also adds `BufferSize` to the output, which should be useful to debug scheduling model + MCA integration. For 1, it inlines mask-only printing into 2 other callers: 1. `llvm/include/llvm/MCA/Stages/InstructionTables.h` 2. `llvm/tools/llvm-exegesis/lib/SchedClassResolution.cpp` as they only use the masks there. I think this is a reasonable duplication across distinguishably different users/tools. Now every pair of callers, even across groups (1 and 2), effectively print in a mutually exclusive way. The patch adds debug tests for the 3 new callers, in the corresponding root test directories, to drive further location of logically target-independent tests that just require some target at the root. I think this convention is more discoverable, and is pretty widely used in the project.	2026-04-06 20:27:18 +03:00
Arthur Eubanks	72d4ce9889	[Inliner] Put inline history into IR as !inline_history metadata (#190092 ) So that it's preserved across all inline invocations rather than just one inliner pass run. This prevents cases where devirtualization in the simplification pipeline uncovers inlining opportunities that should be discarded due to inline history, but we dropped the inline history between inliner pass runs, causing code size to blow up, sometimes exponentially. For compile time reasons, we want to limit this to only call sites that have the potential to inline through SCCs, potentially with the help of devirtualization. This means that the callee is in a non-trivial (Ref)SCC, or the call site was previously an indirect call, which can potentially be devirtualized to call any function. The CGSCCUpdater::InlinedInternalEdges logic still seems to be relevant even with this change, as monster_scc.ll blows up if I remove that code. http://llvm-compile-time-tracker.com/compare.php?from=e830d88e8ae5f44a97cc76136a0a4e83aa9157c0&to=ed535e732fc41b79ab8efda2417886cbd0812f7f&stat=instructions:u Fixes #186926.	2026-04-06 10:24:41 -07:00
Ryotaro Kasuga	34a16392fa	[DA] Use SmallVector instead of raw new/delete (NFC) (#190586 ) Some functions used `new`/`delete` to allocate/free arrays. To avoid memory leaks, it would be better to avoid using raw pointers. This patch replaces the use of them with `SmallVector`.	2026-04-06 15:54:34 +00:00
Max Graey	c4281fd5af	[Support][ValueTracking] Improve KnownFPClass for fadd. Handle infinity signs (#190559 ) Improve KnownFPClass reasoning for fadd: - Refine NaN handling for infinities by checking opposite-sign cases: - `-inf` + `+inf` --> `nan` - `+inf` + `-inf` --> `nan` - `+inf` + `+inf` --> `+inf` - `-inf` + `-inf` --> `-inf` - Introduce `cannotBeOrderedLessEqZero` as a counterpart to `cannotBeOrderedGreaterEqZero`.	2026-04-06 16:23:20 +02:00
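The sign cases listed in the commit above follow from IEEE-754 addition and can be checked with a small standalone sketch. The helper names `infAddIsNaN` and `addInfinities` are hypothetical and are not part of LLVM's `KnownFPClass` API; the sketch only illustrates the rule that adding two infinities yields NaN exactly when their signs differ.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not LLVM API): predicts whether adding two infinities
// produces NaN, using only the sign of each operand.
inline bool infAddIsNaN(bool lhsNegative, bool rhsNegative) {
  return lhsNegative != rhsNegative;
}

// Performs the addition on real IEEE-754 doubles so the prediction above can
// be cross-checked against the hardware result.
inline double addInfinities(bool lhsNegative, bool rhsNegative) {
  const double inf = std::numeric_limits<double>::infinity();
  return (lhsNegative ? -inf : inf) + (rhsNegative ? -inf : inf);
}
```

Same-sign operands keep their sign in the result, which is why an analysis that knows the signs of both infinities can rule out NaN for those cases. The sketch assumes the compiler is not run with fast-math style flags.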
Florian Hahn	ff4c6fe24e	[SCEV] Move NoWrapFlags definition outside SCEV scope, use for SCEVUse. (#190199 ) The patch moves the NoWrapFlags definition out of SCEV's scope so the flags can be re-used for SCEVUse. SCEVUse gets an additional getNoWrapFlags helper that returns the union of the expression's SCEV flags and the use-specific flags. SCEVExpander has been updated to use this new helper. In order to avoid other changes, the original names are exposed via constexpr in SCEV. Not sure if there's a nicer way. One alternative would be to define the enum in a struct, and have SCEV inherit from it. The patch also clarifies that the SCEVUse flags encode NUW/NSW, and hides getInt, setInt, etc. to avoid potential misuse. PR: https://github.com/llvm/llvm-project/pull/190199	2026-04-04 15:03:36 +00:00
Jinsong Ji	ee405335f0	DiagnosticInfo: Fix stack-use-after-scope in DiagnosticInfoStackSize (#190442 ) The string literal "stack frame size" passed to the base class constructor created a temporary Twine that was destroyed after the base constructor completed, leaving a dangling reference. Fix by storing the Twine as a member variable in the derived class, ensuring it lives as long as the diagnostic object itself. Fixes ASAN stack-use-after-scope error in Clang :: Misc/backend-stack-frame-diagnostics-fallback.cpp LLVM :: CodeGen/X86/2007-04-24-Huge-Stack.ll LLVM :: CodeGen/X86/huge-stack-offset.ll LLVM :: CodeGen/X86/huge-stack-offset2.ll LLVM :: CodeGen/X86/huge-stack.ll LLVM :: CodeGen/X86/large-displacements.ll LLVM :: CodeGen/X86/stack-clash-extra-huge.ll LLVM :: CodeGen/X86/warn-stack.ll LLVM :: CodeGen/X86/win64-stackprobe-overflow.ll	2026-04-04 14:52:54 +00:00
Alan Li	5e0efc0f1d	Reland "[GlobalISel][LLT] Introduce FPInfo for LLT (Enable bfloat, ppc128float and others in GlobalISel) (#155107 )" (#188502 ) This is a reland of https://github.com/llvm/llvm-project/pull/155107 along with a fix for old gcc builds. The original patch was reverted in https://github.com/llvm/llvm-project/pull/188344 due to compilation failures described in https://github.com/llvm/llvm-project/pull/155107#issuecomment-4121292756 The fix for old gcc builds is to remove the `constexpr` modifiers that the original patch added in 0721d8e7768c011b8cf2d4d223ca6eca3392b1f9	2026-04-04 05:57:13 -07:00
Zachary Yedidia	e6e388cff0	[LFI][MC] Call setLFIRewriter during LFIMCStreamer initialization (#188625 ) Calls `Streamer.setLFIRewriter` during generic LFIMCStreamer initialization rather than requiring it to be done during backend-specific initialization. This better follows the existing conventions in `create*` functions in `TargetRegistry.h`. Also re-adds the call to initSections for LFI in `llvm-mc.cpp` (necessary in order to emit the ABI Note section), along with a test to make sure ABI note emission with the rewriter is working.	2026-04-03 22:02:30 -07:00
Joseph Huber	8a8434f22a	[OpenMP] Move alloc / free shared from TLI to alloc tags (#190365 ) Summary: Allocation kinds were added after these were introduced. We only needed the TLI to identify these in the attributor so we can now just use attributes. Update the usage in OpenMP and drop the TLI interface. Fixes: https://github.com/llvm/llvm-project/issues/190072	2026-04-03 15:15:48 -05:00
Alexis Engelke	4a7213867a	[Analysis] No block map in MemoryDependenceAnalysis (#190367 ) Avoid expensive hash map of block to value by using a vector. To avoid allocating and clearing the entire vector per query, cache the allocation and use an epoch to identify stale values from previous queries.	2026-04-03 22:13:11 +02:00
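The epoch idea described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `MemoryDependenceAnalysis`: the class name `EpochVectorMap`, the `int` payload, and the method names are all hypothetical. Each slot remembers the epoch in which it was written, so starting a new query only bumps a counter instead of clearing the whole vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-query "block number -> value" map.
class EpochVectorMap {
  struct Slot {
    std::uint32_t Epoch = 0; // epoch in which Value was last written
    int Value = 0;
  };
  std::vector<Slot> Slots;        // allocation is reused across queries
  std::uint32_t CurrentEpoch = 1; // 0 is reserved for "never written"

public:
  explicit EpochVectorMap(std::size_t NumBlocks) : Slots(NumBlocks) {}

  // Starts a new query. Entries from earlier queries become stale without
  // touching the vector, except in the rare case where the counter wraps.
  void beginQuery() {
    if (++CurrentEpoch == 0) {
      for (Slot &S : Slots)
        S.Epoch = 0;
      CurrentEpoch = 1;
    }
  }

  void set(std::size_t Block, int Value) {
    assert(Block < Slots.size() && "block number out of range");
    Slots[Block].Epoch = CurrentEpoch;
    Slots[Block].Value = Value;
  }

  // Returns true and fills Out only if the slot was written in this query.
  bool lookup(std::size_t Block, int &Out) const {
    assert(Block < Slots.size() && "block number out of range");
    if (Slots[Block].Epoch != CurrentEpoch)
      return false;
    Out = Slots[Block].Value;
    return true;
  }
};
```

The trade-off is one extra integer per slot in exchange for constant-time reset between queries, which matters when the number of blocks is large and most queries touch only a few of them.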
vporpo	94545a7c63	[SandboxVec][Legality][NFC] Outline differentBlock() and areUnique() (#190024 ) And reuse them in LoadStoreVec.	2026-04-03 12:14:55 -07:00
Wei Wang	f33e9faa5d	[SampleProfile] Fix FuncMappings key mismatch for renamed functions in stale profile matching (#187899 ) Fix a bug where `distributeIRToProfileLocationMap` fails to find location mappings from IR to profile for renamed functions because `FuncMappings` is indexed by the IR function name while `distributeIRToProfileLocationMap` looks up by the profile function name. Fixed by making `FuncMappings` use the profile function name as its key.	2026-04-03 11:38:51 -07:00
Simon Pilgrim	6832709dc0	[DAG] SDPatternMatch - rename m_Opc -> m_SpecificOpc (#190215 ) Match the naming convention of the other m_Specific* matchers, and free up the m_Opc() matcher for future use in #84940 to allow us to capture the opcode of an unknown binop. Moving to m_SpecificOpc does mess up the formatting in a few places; I've tried to refactor to use the m_Value(SDValue, ....) matcher where I can to recover some whitespace.	2026-04-03 18:03:00 +00:00
Valeriy Savchenko	853ea940ae	[InstCombine][NFC] Expose isKnownExactCastIntToFP as a public method (#190327 )	2026-04-03 18:15:49 +01:00
Osman Yasar	150042141c	[GlobalISel] Add `sub(-1, x) -> (xor x, -1)` from SelectionDAG (#181014 ) This PR adds the pattern `// (sub -1, x) -> (xor x, -1)` to GlobalISel from SelectionDAG. Original SelectionDAG rewrite: `5b4811eddb/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (L4305)` --------- Co-authored-by: Jay Foad <jay.foad@gmail.com>	2026-04-03 17:53:30 +01:00
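The rewrite named in the commit above rests on a two's-complement identity: subtracting `x` from the all-ones value gives the same bits as XOR-ing `x` with all-ones (that is, bitwise NOT). A small sketch on 32-bit unsigned values, with hypothetical helper names that are unrelated to the GlobalISel combiner code:

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: sub(-1, x), with -1 written as all-ones.
inline std::uint32_t subFromAllOnes(std::uint32_t X) {
  return UINT32_MAX - X;
}

// Right-hand side of the fold: xor(x, -1).
inline std::uint32_t xorWithAllOnes(std::uint32_t X) {
  return X ^ UINT32_MAX;
}

// True when both forms agree for the given input.
inline bool foldHolds(std::uint32_t X) {
  return subFromAllOnes(X) == xorWithAllOnes(X);
}
```

Subtracting from all-ones never borrows, so each result bit is simply the inverse of the corresponding input bit, which is exactly what the XOR computes.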
Yuta Saito	fd65b3ef77	[GlobalISel] Fix UMR in `SwiftErrorValueTracking` (#190273 ) Fix issue reported on https://github.com/llvm/llvm-project/pull/188296#issuecomment-4179103756 `SwiftErrorValueTracking` holds per-function state used by `IRTranslator`. On targets where `TargetLowering::supportSwiftError()` is false, (e.g. wasm) `SwiftErrorValueTracking::setFunction()` exits early. Historically, that early return happened before clearing per-function containers, and pointer members (including `SwiftErrorArg`) had no in-class initialization. The bad case is a function with a swifterror argument on such a target: `IRTranslator` uses `SwiftError.getFunctionArg()` without checking `supportSwiftError()` and this could read an uninitialized `SwiftErrorArg` value. (SelectionDAG gates the `getFunctionArg` usages behind `supportSwiftError()`, so it's specific to GlobalISel) 29391328ab66 added [a first test case](llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/args-swiftcc.ll) that satisfies: - the target is `supportSwiftError` = false - use swiftcc - use GlobalISel and it made the issue observable with sanitizer builds. This commit fixes the per-function container reinitialization and defensively adds explicit pointer member initializations.	2026-04-03 14:33:35 +01:00
theRonShark	00aede8f19	Revert "[Clang][OpenMP] Implement Loop splitting `#pragma omp split` directive " (#190335 ) Reverts llvm/llvm-project#183261 15 new lit tests failing in openmp	2026-04-03 12:27:07 +00:00
Matt Arsenault	273e8d85fe	DiagnosticInfo: Fix missing LLVM_LIFETIME_BOUND on Twine arguments (#190331 ) Fix use after free errors in DiagnosticInfoResourceLimit uses.	2026-04-03 11:08:00 +00:00
Ryotaro Kasuga	9e516f5c58	[MachinePipeliner] Remove isLoopCarriedDep and use DDG (#174394 ) This patch completely removes `isLoopCarriedDep`, which was used previously to identify loop-carried dependencies in the DAG. Now that we have the DDG representation, this special handling is no longer necessary. Simply replacing its usage with the DDG causes several tests to fail, since cycle detection takes some of the validation-only edges in the DDG into account. To address this, this patch introduces extra edges in the DDG, which are used only for cycle detection and not for other parts of the pass (e.g., scheduling). The extra edges are determined to preserve the existing behavior of the pass as closely as possible, which makes the predicates for adding them somewhat complex. Split off from #135148, and the final patch in the series for #135148	2026-04-03 10:36:34 +00:00
Amit Tiwari	1972cf64fd	[Clang][OpenMP] Implement Loop splitting `#pragma omp split` directive (#183261 ) OpenMP 6.0 Loop-splitting directive `#pragma omp split` construct with `counts` clause	2026-04-03 10:42:31 +05:30
Matt Arsenault	a68ae7b0cc	DiagnosticInfo: Use Twine for resource name (#190228 ) Allow more flexibility in phrasing of the overallocated resource.	2026-04-02 21:41:38 +02:00
Rahul Joshi	43a8b7de88	[NFC][LLVM] Rename several `ArgsTys` arguments to `OverloadTys`. (#190210 ) Rename several arguments to intrinsic related functions from `ArgsTys` to `OverloadTys` to better reflect their meaning. The only variables left with name `ArgTys` now actually mean function argument types. Also remove an incorrect comment in Intrinsics.td. Dependent types do allow forward references starting with `7957fc6547`	2026-04-02 12:36:28 -07:00
Prabhu Rajasekaran	e97a42d5f9	[Driver] UEFI -mno-incremental-linker-compatible (#188800 ) The `-mno-incremental-linker-compatible` switch translates to Brepro linker flag and must be passed on to the underlying linker to match the behavior of the Windows triples that produce PE COFF.	2026-04-02 19:11:28 +00:00
Steven Perron	6331bfa41a	[HLSL] Add GetDimensions to Texture2D. (#189991 ) This commit adds the GetDimensions methods to Texture2D. For DXIL, it requires intrinsics that are not yet available. They are added, but not implemented. Assisted-by: Gemini Co-authored-by: Helena Kotas <hekotas@microsoft.com>	2026-04-02 18:26:02 +00:00
Steven Perron	5124dd2536	[SPIRV] Add get dimension intrinsics. (#189746 ) Add the intrinsics in the wg-hlsl proposal [[0033] - GetDimensions mapping to built-ins functions and LLVM intrinsics](https://github.com/llvm/wg-hlsl/blob/main/proposals/0033-resources-get-dimensions.md#lowering-to-spir-v) to the SPIR-V backend. This enabled us to implement the GetDimensions methods in textures in Clang. Assisted-by: Gemini	2026-04-02 13:14:34 -04:00
Florian Hahn	97dbf38c9c	[SCEVExpander] Add SCEVUseVisitor and use it in SCEVExpander (NFC) (#188863 ) Add SCEVUseVisitor, a new visitor class where all visit methods receive a SCEVUse instead of a const SCEV*. Use it for SCEVExpander, so it can use use-specific flags in the future. PR: https://github.com/llvm/llvm-project/pull/188863	2026-04-02 15:08:01 +00:00
Ryotaro Kasuga	682a217d74	[DA] Extract the logic shared by the Exact SIV/RDIV test (#189951 ) The Exact SIV test and the Exact RDIV test behave almost identically, except that the Exact SIV test also explores the directions in the final step. This patch consolidates the two duplicate implementations into a single function that can be used by both tests. While this change slightly affects things like debug output and metrics, it is not intended to alter the actual test results.	2026-04-02 13:30:42 +00:00
Rahul Joshi	99786f20ee	[LLVM][Intrinsics] Refactor `IITDescriptor` (#190011 ) The main change is to eliminate the use of "Argument" terminology when dealing with overloaded types since overloaded types can be either argument or return values, and some additional renaming for clarity. 1. Rename `Tys` argument to various intrinsic APIs to `OverloadTys` to better reflect its meaning. 2. Rename `IITDescriptorKind::Argument` to `IITDescriptorKind::Overloaded` to better convey that it's an overloaded type. Removed "Argument" suffix for other kinds for dependent types. 3. Rename `ArgKind` to `AnyKind`, `getArgumentNumber` to `getOverloadIndex`, `getArgumentKind` to `getOverloadKind`, `getRefArgNumber` to `getRefOverloadIndex`, and `IIT_ARG` to `IIT_ANY`. 4. Rename `IIT_ANYPTR` (used to represent a pointer qualified with address space) to `IIT_PTR_AS` to clearly distinguish it from `llvm_anyptr_ty` 5. Change the packing of [ref overload index & overload index] for `VecOfAnyPtrsToElt` to pack the overload index into the lower bits, so we can use the `getOverloadIndex` function to get the overload index.	2026-04-02 06:19:01 -07:00
Steven Perron	905f23c9f8	[HLSL] Add CalculateLevelOfDetail methods to Texture2D (#188574 ) This adds the CalculateLevelOfDetail and CalculateLevelOfDetailUnclamped methods to Texture2D using the established pattern used for other methods. Assisted-by: Gemini	2026-04-02 08:58:11 -04:00
Nerixyz	91b90652bb	Reland "[CodeView] Generate `S_DEFRANGE_REGISTER_REL_INDIR`" (#189401 ) Initially added in #187709. It was reverted in #188833, because [llvm-clang-x86_64-sie-win](https://lab.llvm.org/buildbot/#/builders/46/builds/32873) was failing in `cross-project-tests/debuginfo-tests/dexter-tests/nrvo.cpp`. The test passed for me locally. After checking on another machine, I found that `S_DEFRANGE_REGISTER_REL_INDIR` is only supported by dbgeng/WinDbg from Windows 10.0 Build 19041 (released 2020) onwards. SDKs before this will fail to read the value. That buildbot is on Windows 10.0 Build 17763. I'm not sure if we should make the generation of that record conditional. Debuggers that can't read the record will skip it. They'll still see that there's some local variable, but won't be able to display the value. As far as I know, users of older Windows 10 builds should be able to install a newer Windows SDK and use the WinDbg from that version. But I haven't tested that.	2026-04-02 12:15:11 +02:00
Jiachen Yuan	d0bf354828	[ADT] Reinstate "Refactor Bitset to Be More Constexpr-Usable" (#189497 ) Reland of #172062 (a71b1d2), which was reverted in b0234d1. This patch makes essential Bitset member functions constexpr (`set()`, `any()`, `none()`, `count()`, `operator==`, `!=`, `<`, `\~`) and adds a new `all()` method. It also introduces a `maskLastWord()` invariant to ensure unused high bits in the last word are always zero, which is required for correctness of `operator~`, `set()`, `all()`, and comparisons on non-word-aligned sizes (e.g., `Bitset<33>`). Changes from the original reverted PR: - Replaced `llvm::any_of` with an inline loop to avoid depending on constexpr `any_of`/`none_of` from `STLExtras` (#172536), which was also reverted due to a GCC 15.2.1 bootstrap miscompile. - The patch is now fully self-contained with no prerequisite changes. Motivation: This is a prerequisite for making `LaneBitmask` a wrapper around `Bitset`, enabling scalable lane bitmasks beyond 64 bits (https://discourse.llvm.org/t/rfc-out-of-lanebitmask-bits-again/88613).	2026-04-02 11:50:10 +02:00
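The `maskLastWord()` invariant mentioned in the commit above can be illustrated with a reduced sketch. `TinyBitset` below is a hypothetical class written for this example and is not `llvm::Bitset`; it assumes 64-bit words and C++14 or later. It shows why the unused high bits of the last word must stay zero for `operator~`, `set()`, `all()`, and `operator==` to behave correctly on sizes such as 33 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reduced bitset, for illustration only.
template <std::size_t NumBits> class TinyBitset {
  static_assert(NumBits > 0, "TinyBitset needs at least one bit");
  static constexpr std::size_t WordBits = 64;
  static constexpr std::size_t NumWords = (NumBits + WordBits - 1) / WordBits;
  std::uint64_t Words[NumWords] = {};

  // Invariant: bits at positions >= NumBits in the last word are always zero.
  constexpr void maskLastWord() {
    constexpr std::size_t Rem = NumBits % WordBits;
    if (Rem != 0)
      Words[NumWords - 1] &= (std::uint64_t(1) << Rem) - 1;
  }

public:
  constexpr TinyBitset() = default;

  // Sets every valid bit, then restores the invariant.
  constexpr TinyBitset &set() {
    for (std::size_t I = 0; I != NumWords; ++I)
      Words[I] = ~std::uint64_t(0);
    maskLastWord();
    return *this;
  }

  constexpr bool none() const {
    for (std::size_t I = 0; I != NumWords; ++I)
      if (Words[I] != 0)
        return false;
    return true;
  }

  // Flipping every word would set the padding bits, so mask afterwards.
  constexpr TinyBitset operator~() const {
    TinyBitset Result;
    for (std::size_t I = 0; I != NumWords; ++I)
      Result.Words[I] = ~Words[I];
    Result.maskLastWord();
    return Result;
  }

  // Correct only because operator~ keeps the padding bits at zero.
  constexpr bool all() const { return (~*this).none(); }

  constexpr bool operator==(const TinyBitset &Other) const {
    for (std::size_t I = 0; I != NumWords; ++I)
      if (Words[I] != Other.Words[I])
        return false;
    return true;
  }
};
```

Without the masking step, `~TinyBitset<33>()` would carry 31 stray one-bits in its last word, so it would compare unequal to a fully set bitset even though all 33 valid bits match.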
Sander de Smalen	703d43ca3b	[CostModel] Move default expand cost for partial reductions to BasicTTIImpl (#189905 ) This is a follow-up of the suggestion left here: https://github.com/llvm/llvm-project/pull/181707#discussion_r2995733831 The override functions in AMDGPU/ARM/SystemZ/X86 are required to avoid enabling partial reductions where they were previously disabled (I've added this for all targets that implement getArithmeticReductionCost).	2026-04-02 09:42:53 +01:00
Gabriel Baraldi	5e0a06b34d	Move ExpandMemCmp and MergeIcmp to the middle end (#77370 ) Moving these into the middle-end pipeline will allow for additional optimization of the expansion result, such as CSE of redundant loads (c.f. https://godbolt.org/z/bEna4Md9r). For now, we conservatively place the passes at the end of the middle-end pipeline, so we mostly don't benefit from additional optimizations yet. The pipeline position will be moved in a future change. This builds on work done by legrosbuffle in https://reviews.llvm.org/D60318. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 09:57:00 +02:00
Lang Hames	3346a76d32	[JITLink] Remove unnecessary SymbolStringPtr copy. (#190101 ) This was probably intended to be a `const SymbolStringPtr&` originally, but if we were going to copy it anyway it's better to just take the argument by value and std::move it.	2026-04-02 15:53:42 +11:00
Fangrui Song	8daaa26efd	[Support] Support nested parallel TaskGroup via work-stealing (#189293 ) Nested TaskGroups run serially to prevent deadlock, as documented by https://reviews.llvm.org/D61115 and refined by https://reviews.llvm.org/D148984 to use threadIndex. Enable nested parallelism by having worker threads actively execute tasks from the work queue while waiting (work-stealing), instead of just blocking. Root-level TaskGroups (main thread) keep the efficient blocking Latch::sync(), so there is no overhead for the common non-nested case. In lld, https://reviews.llvm.org/D131247 worked around the limitation by passing a single root TaskGroup into OutputSection::writeTo and spawning 4MB-chunked tasks into it. However, SyntheticSection::writeTo calls with internal parallelism (e.g. GdbIndexSection, MergeNoTailSection) still ran serially on worker threads. With this change, their internal parallelFor/parallelForEach calls parallelize automatically via helpSync work-stealing. The increased parallelism can reorder error messages from parallel phases (e.g. relocation processing during section writes), so one lld test is updated to use --threads=1 for deterministic output.	2026-04-01 19:20:16 -07:00
Mirko Brkušanin	5d9eb0c76a	[AMDGPU] Define new targets gfx1171 and gfx1172 (#187735 )	2026-04-01 18:16:11 +02:00
Lucas Ramirez	54914a4287	[CodeGen] Allow rematerializer to rematerialize at the end of a block (#184339 ) This makes the rematerializer able to rematerialize MIs at the end of a basic block. We achieve this by tracking the parent basic block of every region inside the rematerializer and adding an explicit target region to some of the class's methods. The latter removes the requirement that we track the MI of every region (`Rematerializer::MIRegion`) after the analysis phase; the class member is therefore deleted. This new ability will be used shortly to improve the design of the rollback mechanism.	2026-04-01 16:58:44 +02:00
Kai Nacke	9b00518419	[MC] Introduce new base class for MCAsmStreamer (#187083 ) The class MCAsmBaseStreamer serves as the common base class for streamers which emit assembly output. It has the same role as MCObjectStreamer has for streamers which emit object files.	2026-04-01 10:19:08 -04:00
Steven Perron	9dc8f465a4	[SPIRV] Implement the int_spv_resource_calculate_lod* IntrinsicsSPIRV (#188337 ) Implements intrinsics used to get the level-of-detail given a texture, sampler, and a coordinate. It will be used to implement the corresponding HLSL methods. Assisted-by: Gemini	2026-04-01 14:03:42 +00:00

62628 Commits