llvm-project

Author	SHA1	Message	Date
Trung Nguyen	b6e7c475cb	[CodeGen] Ignore `ANNOTATION_LABEL` in scheduler (#190499 ) This fixes a crash in `clang` for `armv7` targets when optimizations are enabled. Fixes #190497	2026-04-06 14:16:01 +02:00
Kartik Ohlan	7c60d08056	[DAG] computeKnownFPClass - add ISD::SPLAT_VECTOR handling (#189780 ) Fixes #189481 Implement ISD::SPLAT_VECTOR in SelectionDAG::computeKnownFPClass to correctly propagate floating-point properties from scalar operands to vectors. Added AArch64 and RISC-V test coverage	2026-04-04 14:54:12 +00:00
Alan Li	5e0efc0f1d	Reland "[GlobalISel][LLT] Introduce FPInfo for LLT (Enable bfloat, ppc128float and others in GlobalISel) (#155107 )" (#188502 ) This is a reland of https://github.com/llvm/llvm-project/pull/155107 along with a fix for old gcc builds. This patch is reverted in https://github.com/llvm/llvm-project/pull/188344 due to compilation failures described in https://github.com/llvm/llvm-project/pull/155107#issuecomment-4121292756 The fix to old gcc builds is to remove `constexpr` modifiers in the original patch in 0721d8e7768c011b8cf2d4d223ca6eca3392b1f9	2026-04-04 05:57:13 -07:00
Craig Topper	c7824ac669	[TargetLowering] Remove stale comment. NFC (#190275 ) Missed removing in #188653	2026-04-03 14:26:09 -07:00
Simon Pilgrim	6832709dc0	[DAG] SDPatternMatch - rename m_Opc -> m_SpecificOpc (#190215 ) Matches the naming convention for other m_Specific* matchers, and frees up the m_Opc() matcher for future use in #84940 to allow us to capture the opcode of an unknown binop. Moving to m_SpecificOpc does mess up the formatting in a few places; I've tried to refactor to use the m_Value(SDValue, ....) matcher where I can to recover some whitespace	2026-04-03 18:03:00 +00:00
Craig Topper	5d08beaec8	[TargetLowering] Remove NeedToApplyOffset from prepareSREMEqFold. NFC (#190256 ) For a given element, I believe A is only 0 when the divisor is INT_MIN. The only way for NeedToApplyOffset to be false after processing all elements, is for all divisors to be INT_MIN. If all divisors are INT_MIN, then all divisors are a power of 2 and we wouldn't do the transform.	2026-04-03 07:32:13 -07:00
Yuta Saito	fd65b3ef77	[GlobalISel] Fix UMR in `SwiftErrorValueTracking` (#190273 ) Fix issue reported on https://github.com/llvm/llvm-project/pull/188296#issuecomment-4179103756 `SwiftErrorValueTracking` holds per-function state used by `IRTranslator`. On targets where `TargetLowering::supportSwiftError()` is false (e.g. wasm), `SwiftErrorValueTracking::setFunction()` exits early. Historically, that early return happened before clearing per-function containers, and pointer members (including `SwiftErrorArg`) had no in-class initialization. The bad case is a function with a swifterror argument on such a target: `IRTranslator` uses `SwiftError.getFunctionArg()` without checking `supportSwiftError()` and this could read an uninitialized `SwiftErrorArg` value. (SelectionDAG gates the `getFunctionArg` usages behind `supportSwiftError()`, so it's specific to GlobalISel) 29391328ab66 added [a first test case](llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/args-swiftcc.ll) that satisfies: - the target is `supportSwiftError` = false - use swiftcc - use GlobalISel and it made the issue observable with sanitizer builds. This commit fixes the per-function container reinitialization and defensively adds explicit pointer member initializations.	2026-04-03 14:33:35 +01:00
Simon Pilgrim	5674755cb6	[DAG] visitMUL - cleanup pattern matchers to use m_Shl and (commutative) m_Mul directly (#190339 ) Based on feedback on #190215	2026-04-03 13:21:51 +00:00
Simon Pilgrim	15ed4f6c49	[DAG] isKnownToBeAPowerOfTwo - add missing DemandedElts handling to ISD::TRUNCATE and hidden m_Neg pattern (#190190 ) Use MaskedVectorIsZero to match X & -X pattern when only DemandedElts match the negation pattern Fixes #181654 (properly)	2026-04-03 12:03:33 +00:00
Ryotaro Kasuga	9e516f5c58	[MachinePipeliner] Remove isLoopCarriedDep and use DDG (#174394 ) This patch completely removes `isLoopCarriedDep`, which was used previously to identify loop-carried dependencies in the DAG. Now that we have the DDG representation, this special handling is no longer necessary. Simply replacing its usage with the DDG causes several tests to fail, since cycle detection takes some of the validation-only edges in the DDG into account. To address this, this patch introduces extra edges in the DDG, which are used only for cycle detection and not for other parts of the pass (e.g., scheduling). The extra edges are determined to preserve the existing behavior of the pass as closely as possible, which makes the predicates for adding them somewhat complex. Split off from #135148, and the final patch in the series for #135148	2026-04-03 10:36:34 +00:00
Craig Topper	e2e5db8401	[TargetLowering] Speculative fix for a non-determinism issue between different compilers. (#190219 ) The evaluation order of function arguments is unspecified by the C++ standard. We had two getNode calls as function arguments which causes the nodes to be created in a different order depending on the compiler used. This patch moves them to their own variables to ensure they are called in the same order on all compilers. Possible fix for #190148.	2026-04-02 12:12:28 -07:00
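The ordering issue described in the commit above can be illustrated with a minimal sketch. `NodeLog`, `combine`, `buildUnordered`, and `buildOrdered` are hypothetical names invented for this example and are not LLVM APIs; `NodeLog::make` only stands in for a node-creating call whose side effects depend on call order.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a node factory: it records the order in which
// nodes are created, so the creation order is observable.
struct NodeLog {
  std::vector<std::string> Created;
  int make(const std::string &Name) {
    Created.push_back(Name);
    return static_cast<int>(Created.size());
  }
};

int combine(int LHS, int RHS) { return LHS * 10 + RHS; }

// Order-dependent form: C++ leaves the evaluation order of the two
// arguments unspecified, so either make() call may run first.
int buildUnordered(NodeLog &Log) {
  return combine(Log.make("lhs"), Log.make("rhs"));
}

// Deterministic form: each call is sequenced by its own statement, so
// "lhs" is always created before "rhs" on every compiler.
int buildOrdered(NodeLog &Log) {
  int LHS = Log.make("lhs");
  int RHS = Log.make("rhs");
  return combine(LHS, RHS);
}
```

With `buildOrdered`, the log always contains `lhs` followed by `rhs`; with `buildUnordered`, the log order (and therefore the returned value) may differ between compilers.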
Craig Topper	24146ce5cf	[TargetLowering] Remove INT_MIN special case from prepareSREMEqFold. (#188653 ) If the divisor is INT_MIN, we can still treat it like any other power of 2. We'll fold i32 (seteq (srem X, INT_MIN)) to (setule (rotr (add (mul X, 1), INT_MIN), 31), 1). Alive2 says this is correct https://alive2.llvm.org/ce/z/vjzqKk. The multiply is a NOP, the add toggles the sign bit. The rotate puts the lowest 31 bits into the upper 31 bits. The sign bit is now in the LSB. The compare checks if the upper 31 bits are 0. srem X, INT_MIN has a remainder of 0 if X is 0 or INT_MIN, which is equivalent to checking if the upper 31 bits are 0 after the rotate. I don't think we need to add any constant for power of 2 but toggling the sign bit like we do now doesn't hurt.	2026-04-02 09:45:47 -07:00
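The fold described in the commit above can be checked on plain 32-bit integers. The sketch below is a scalar model only; `sremIntMinIsZero` and `foldedCheck` are names made up for this example, and the no-op multiply by 1 is omitted.

```cpp
#include <cstdint>
#include <limits>

// Reference semantics: is the remainder of (srem X, INT_MIN) zero?
bool sremIntMinIsZero(int32_t X) {
  return X % std::numeric_limits<int32_t>::min() == 0;
}

// Folded form: (setule (rotr (add X, INT_MIN), 31), 1).
bool foldedCheck(int32_t X) {
  // Adding INT_MIN toggles the sign bit (unsigned arithmetic wraps).
  uint32_t V = static_cast<uint32_t>(X) + 0x80000000u;
  // Rotate right by 31: the low 31 bits move to the top, and the
  // (toggled) sign bit lands in the least significant bit.
  uint32_t R = (V >> 31) | (V << 1);
  // The upper 31 bits are zero exactly when R is 0 or 1.
  return R <= 1u;
}
```

Both functions return true only for `X == 0` and `X == INT_MIN`, which matches the reasoning in the commit message.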
zGoldthorpe	e9a62c7698	[DAG] `computeKnownFPClass`: handle `ISD::FABS` (#190069 ) Use `KnownFPClass::fabs` to handle `ISD::FABS`. This case will help with updating #188356 to use `computeKnownFPClass`.	2026-04-02 14:48:54 +00:00
dibrinsofor	eaa3ef9ddc	[DAG] Propagate OrZero and DemandedElts for min/max in isKnownToBeAPowerOfTwo (#182369 ) Fixes #181643 For queries like `isKnownToBeAPowerOfTwo(V, OrZero=true)`, if an operand is known to be "pow2-or-zero" but not strictly non-zero power-of-two, the min/max case currently returns false even when the result remains pow2-or-zero. For instance: - `A = select cond, 4, 0` (A is pow2-or-zero) - `R = umin(A, 16)` `R` is always in `{0, 4}` and querying `isKnownToBeAPowerOfTwo(R, OrZero=true)` should be true. Added unit tests for the baseline and failing cases; `OrZero` and `DemandedElts` are now propagated correctly	2026-04-02 12:50:11 +01:00
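The `select`/`umin` example from the commit above can be replayed on concrete values. This is a scalar illustration only; `isPow2OrZero` and `uminOfSelect` are hypothetical helpers, not the SelectionDAG query itself.

```cpp
#include <algorithm>
#include <cstdint>

// True for 0 and for any value with exactly one bit set.
bool isPow2OrZero(uint32_t V) { return (V & (V - 1)) == 0; }

// Models: A = select Cond, 4, 0;  R = umin(A, 16).
uint32_t uminOfSelect(bool Cond) {
  uint32_t A = Cond ? 4u : 0u;       // A is pow2-or-zero
  return std::min<uint32_t>(A, 16u); // the minimum of two pow2-or-zero values
}
```

For both values of `Cond` the result is in `{0, 4}`, so a pow2-or-zero query on `R` can be answered positively even though `R` is not known to be non-zero.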
Nerixyz	91b90652bb	Reland "[CodeView] Generate `S_DEFRANGE_REGISTER_REL_INDIR`" (#189401 ) Initially added in #187709. It was reverted in #188833, because [llvm-clang-x86_64-sie-win](https://lab.llvm.org/buildbot/#/builders/46/builds/32873) was failing in `cross-project-tests/debuginfo-tests/dexter-tests/nrvo.cpp`. The test passed for me locally. After checking on another machine, I found that `S_DEFRANGE_REGISTER_REL_INDIR` is only supported by dbgeng/WinDbg from Windows 10.0 Build 19041 (released 2020) onwards. SDKs before this will fail to read the value. That buildbot is on Windows 10.0 Build 17763. I'm not sure if we should make the generation of that record conditional. Debuggers that can't read the record will skip it. They'll still see that there's some local variable, but won't be able to display the value. As far as I know, users of older Windows 10 builds should be able to install a newer Windows SDK and use the WinDbg from that version. But I haven't tested that.	2026-04-02 12:15:11 +02:00
Gabriel Baraldi	5e0a06b34d	Move ExpandMemCmp and MergeIcmp to the middle end (#77370 ) Moving these into the middle-end pipeline will allow for additional optimization of the expansion result, such as CSE of redundant loads (c.f. https://godbolt.org/z/bEna4Md9r). For now, we conservatively place the passes at the end of the middle-end pipeline, so we mostly don't benefit from additional optimizations yet. The pipeline position will be moved in a future change. This builds on work done by legrosbuffle in https://reviews.llvm.org/D60318. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 09:57:00 +02:00
David Green	083f9c158a	[AArch64][GISel] Widen non-power2 element sizes for ctlz. (#189371 ) This addresses an illegal mutation kind, where gisel would hit an assert. It expands vector elements for non-power2 elements or elements less than i8 to a power of 2. A fix to handle vector types correctly was needed in LegalizerHandler. Fixes #185411	2026-04-02 07:27:12 +01:00
zGoldthorpe	9a354fc5a1	[SelectionDAG] Use `KnownBits` to determine if an operand may be NaN. (#188606 ) Given a bitcast into a fp type, use the known bits of the operand to infer whether the resulting value can never be NaN.	2026-04-01 22:47:01 -06:00
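One plausible way known bits can rule out NaN for a value bitcast to a 32-bit float is sketched below. This is an assumption-laden illustration and not the logic from the patch: `KnownBits32` is a hypothetical stand-in for `llvm::KnownBits`, and `bitcastNeverNaN` is a name invented here. It relies only on the IEEE-754 binary32 rule that a NaN has an all-ones exponent and a non-zero mantissa.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for known-bits information on an i32.
struct KnownBits32 {
  uint32_t Zero; // bits known to be 0
  uint32_t One;  // bits known to be 1
};

// Returns true if a value with these known bits can never be a NaN when
// reinterpreted as an IEEE-754 binary32.
bool bitcastNeverNaN(KnownBits32 Known) {
  const uint32_t ExpMask = 0x7F800000u;  // 8 exponent bits
  const uint32_t MantMask = 0x007FFFFFu; // 23 mantissa bits
  // If any exponent bit is known zero, the exponent cannot be all ones.
  bool ExpNotAllOnes = (Known.Zero & ExpMask) != 0;
  // If every mantissa bit is known zero, the value is at worst an infinity.
  bool MantAllZero = (Known.Zero & MantMask) == MantMask;
  return ExpNotAllOnes || MantAllZero;
}
```

With no known bits at all the function conservatively returns false, since the value could still be a NaN.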
zGoldthorpe	d6d0876d1a	[NFC][SelectionDAG] Refactor out common default `DemandedElts` calculation (#190031 ) Deduplicating the repeated pattern ```cpp APInt DemandedElts = VT.isFixedLengthVector() ? APInt::getAllOnes(VT.getVectorNumElements()) : APInt(1, 1); ``` in SelectionDAG.	2026-04-01 14:40:48 -06:00
zGoldthorpe	24b6ee90c1	[SelectionDAG] Assert on non-FP operand to `computeKnownFPClass` (#189752 ) Assert correct usage of `computeKnownFPClass` or users (i.e., `isKnownNeverNaN`).	2026-04-01 17:41:33 +00:00
zGoldthorpe	d7e129dffb	[SelectionDAGBuilder] Only check VPCmp for NaNs in fp comparisons (#189749 ) `getFCmpCodeWithoutNaN` should only be used for FP comparisons (which is also the only context in which `isKnownNeverNaN` makes sense).	2026-04-01 17:00:55 +00:00
LU-JOHN	c245d764b8	[CodeGen] Do not remove IMPLICIT_DEF unless all uses have undef flag added (#188133 ) Do not remove IMPLICIT_DEF of a physreg unless all uses have an undef flag added. Previously, only the first use instruction had undef flags added. This will cause a failure in machine instruction verification. Multi-instruction uses tested in AMDGPU/multi-use-implicit-def.mir and X86/multi-use-implicit-def.mir. --------- Signed-off-by: John Lu <John.Lu@amd.com>	2026-04-01 10:11:42 -05:00
Lucas Ramirez	54914a4287	[CodeGen] Allow rematerializer to rematerialize at the end of a block (#184339 ) This makes the rematerializer able to rematerialize MIs at the end of a basic block. We achieve this by tracking the parent basic block of every region inside the rematerializer and adding an explicit target region to some of the class's methods. The latter removes the requirement that we track the MI of every region (`Rematerializer::MIRegion`) after the analysis phase; the class member is therefore deleted. This new ability will be used shortly to improve the design of the rollback mechanism.	2026-04-01 16:58:44 +02:00
Pankaj Dwivedi	86c3abe85e	[NFC] Rename InstructionUniformity to ValueUniformity (#189935 )	2026-04-01 19:28:33 +05:30
DaKnig	d6b8163f3f	Retry "[SDAG] (abs (add nsw a, -b)) -> (abds a, b) (#175801 )" (#186659 ) A better version of #175801 . see that for more info. Fixes #185467 The original patch was checking the correctness of the transformation based on the original Op1 , which was then negated (in the case of IsAdd). This patch fixes that issue by inverting the sign bit in that case. Also pushed a slight nfc there to simplify the code and remove some duplication. alive2 proofs: abds: https://alive2.llvm.org/ce/z/oJQPss abdu: https://alive2.llvm.org/ce/z/HfPF5q Note that the regression test is not (wrongly) affected anymore by the patch (as it did before)	2026-04-01 13:37:29 +00:00
Gergo Stomfai	15d48c5bbe	[X86][DAG] remove LowerFCanonicalize (#188127 ) Remove LowerFCanonicalize. Added fallback for cases when the scalar type also has its Custom lowering to avoid regressions on AMDGPU and SystemZ. Fixes #143862	2026-04-01 13:34:05 +00:00
Simon Pilgrim	9a33125e42	[DAG] Add basic ISD::IS_FPCLASS constant/identity folds (#189944 ) Attempts to match middle-end implementation in InstructionSimplify/foldIntrinsicIsFPClass Fixes #189919	2026-04-01 13:06:27 +00:00
Lucas Ramirez	8a06085c61	[CodeGen] Add listener support to the rematerializer (NFC) (#184338 ) This change adds support for adding listeners to the target-independent rematerializer; listeners can catch certain rematerialization-related events to implement some additional functionality on top of what the rematerializer already performs. This is NFC and has no user at the moment, but the plan is to have listeners start being responsible for secondary/optional functionalities that are at the moment integrated with the rematerializer itself. Two examples of that are: 1. rollback support (currently optional), and 2. region tracking (currently mandatory, but not fundamentally necessary to the rematerializer).	2026-04-01 13:35:37 +02:00
Luke Lau	effcd181e5	[RISCV] Remove codegen for VP float rounding intrinsics (#189896 ) Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This splits off seven intrinsics from #179622. We now generate vfcvt.rtz for llvm.vp.roundtozero. It looks like we should have been using the codegen for llvm.trunc for it, but we somehow missed that.	2026-04-01 11:04:53 +00:00
Luke Lau	1d549d9a77	[RISCV] Remove codegen for vp_lrint, vp_llrint (#189714 ) Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This splits off two intrinsics from #179622. We need to use the other intrinsic constructor in ExpandVectorPredication.cpp because llrint has multiple overloaded types	2026-04-01 06:46:38 +00:00
Henry Jiang	bf50489eeb	[Psuedoprobe][MachO] Enable pseudo probes emission for MachO (#185758 ) Enable pseudo probes emission for MachO. Due to the 16 character limit of MachO segment and section, the file sections will be `__PSEUDO_PROBE,__probes` and `__PSEUDO_PROBE,__probe_descs`.	2026-03-31 16:27:58 -07:00
Craig Topper	b7dc4ff0ab	[TargetLowering] Replace always true if with an assert. NFC (#189750 ) We already returned for UADDSAT/USUBSAT leaving SADDSAT/SSUBSAT as the only opcodes that can get here.	2026-03-31 15:21:04 -07:00
Yonah Goldberg	bf76fa7582	[AtomicExpandPass][NFC] Refactor processAtomicInstr to be more readable (#186547 ) While working on https://discourse.llvm.org/t/rfc-add-elementwise-modifier-to-atomicrmw/90134/5 I found this `processAtomicInstr` to be a little hard to read, with casing on the instruction type all over the place. I think it reads nicer to just case on the instruction type once.	2026-03-31 12:22:03 -07:00
Laxman Sole	da173bfbf5	[NVPTX] Do not emit .debug_pubnames and .debug_pubtypes for NVPTX backend (#187328 ) This change adds a mechanism to stop emitting `.debug_pubnames`, `.debug_pubtypes` sections for a particular target. This is particularly useful for cases where IR is generated by frontends that do not explicitly disable these sections (as `Clang` does for `NVPTX`), but still use `llc` for code generation. Currently, only `NVPTX` uses this to disable these sections.	2026-03-31 12:13:39 -07:00
Medha Tiwari	9c64cb6dca	Fix emulated TLS alignment for large variables (#171037 ) Fix emulated TLS alignment for larger variables (>= 32 bytes) to use preferred alignment. Fixes #167219	2026-03-31 09:37:06 -07:00
Luke Lau	e891812cac	[RISCV] Remove codegen for vp_minimum, vp_maximum (#189550 ) Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This splits off two intrinsics from #179622.	2026-03-31 15:12:18 +00:00
natanelh-mobileye	46dd9d6f52	[SDAG][abd] Combine abd of small types (#181538 ) It is beneficial to combine abd of illegal, small types (types that get promoted to wider scalar size).	2026-03-31 13:40:51 +00:00
Jay Foad	fbfb83978c	[MachineVerifier] Disallow subregister defs in SSA form (#189403 )	2026-03-31 09:50:08 +01:00
Luke Lau	598f3535fa	[SelectionDAG] Expand CTTZ_ELTS[_ZERO_POISON] and handle legalization (#188691 ) This is a second attempt at "[SelectionDAG] Expand CTTZ_ELTS[_ZERO_POISON] and handle splitting" (#188220) That PR had to be reverted in 7d39664a6ae8daaf186b65578492244d96a50bf2 because we had crashes on AMDGPU since we didn't have scalarization support, and other crashes on PowerPC because we didn't handle the case when a vector needed widened. Tests for these are added in AMDGPU/cttz-elts.ll, RISCV/rvv/cttz-elts-scalarize.ll and PowerPC/cttz-elts.ll. The former crash has been fixed by adding DAGTypeLegalizer::ScalarizeVecOp_CTTZ_ELTS. The second crash has been fixed by reworking TargetLowering::expandCttzElts. The expansion for CTTZ_ELTS is nearly identical to VECTOR_FIND_LAST_ACTIVE, except it uses a reverse step vector and subtracts the result from VF. The easiest way to fix these crashes without introducing regressions is to reuse the VECTOR_FIND_LAST_ACTIVE expansion which already handles the case where the vector needs widened. This means that the node now needs to take in a boolean vector argument and uses VSELECT instead of an AND to zero out inactive lanes, so the op promotion code has also been shared.	2026-03-31 07:25:57 +00:00
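The expansion strategy described in the commit above (a reverse step vector, zeroing of inactive lanes, and a subtraction from VF) can be modeled with a scalar loop. This is a sketch under assumptions: `cttzElts` is a hypothetical function, and the use of an unsigned max reduction over the selected lanes is an assumption about how the shared expansion combines them.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Scalar model: returns the index of the first active lane, or VF if no
// lane is active.
std::size_t cttzElts(const std::vector<bool> &Mask) {
  const std::size_t VF = Mask.size();
  std::size_t Max = 0;
  for (std::size_t I = 0; I < VF; ++I) {
    std::size_t ReverseStep = VF - I;             // VF, VF-1, ..., 1
    std::size_t Lane = Mask[I] ? ReverseStep : 0; // select: zero inactive lanes
    Max = std::max(Max, Lane);                    // assumed unsigned max reduction
  }
  // The first active lane has the largest reverse step, VF - index.
  return VF - Max;
}
```

Because the reverse step decreases with the lane index, the maximum over active lanes always comes from the first active lane, and an all-inactive mask yields `VF`.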
Mingjie Xu	227edfb2f4	[CodeGenPrepare][NFC] Reland: Update the dominator tree instead of rebuilding it (#179040 ) The original differential revision is https://reviews.llvm.org/D153638 Reverted in `f5b5a30858` because of causing a clang crash. This patch relands it with the crash fixed. Call `DTU->flush()` in each iteration of `while (MadeChange)` loop, flush all awaiting BasicBlocks deletion, and prevent iterator invalidation.	2026-03-31 09:01:11 +08:00
Demetrius Kanios	96bd7b6e15	[CodeGen] Add additional params to `TargetLoweringBase::getTruncStoreAction` (#187422 ) The truncating store analogue of #181104. Adds `Alignment` and `AddrSpace` parameters to `TargetLoweringBase::getTruncStoreAction` and dependents, and introduces a `getCustomTruncStoreAction` hook for targets to customize legalization behavior using this new information. This change is fully backwards compatible from the target's point of view, with `setTruncStoreAction` having identical functionality. The change is purely additive.	2026-03-30 16:52:45 -07:00
Simon Pilgrim	d74f098a30	[DAG] isKnownNeverNaN - fallback to computeKnownFPClass check (#189476 ) Remove ConstantFPSDNode handling from isKnownNeverNaN and fallback to using computeKnownFPClass if there are no opcode matches in isKnownNeverNaN The test check changes are due to isKnownNeverNaN not handling UNDEF/POISON but computeKnownFPClass does (POISON in particular now returns isKnownNeverNaN == true, preventing a ISD::FCANONICALIZE call in expandFMINNUM_FMAXNUM).	2026-03-30 21:49:15 +00:00
Bill Wendling	9d3079a7a9	[NFC][CodeGen] Prepare for expansion of InlineAsmPrepare (#189469 ) Move some functions around so that the CallBrInst processing is contained. The 'static' functions don't need to be declared at the top; just place them before the calls. Fix the naming to use lower-case for the first letter of function names.	2026-03-30 20:54:00 +00:00
Alexey Merzlyakov	06725d7ef5	[GISel] Keep non-negative info in SUB(CTLZ) (#189314 ) Implement non-negative value tracking for SUB-CTLZ chains in GlobalISel, matching the behavior previously added to SelectionDAG. Additionally, refactor the SelectionDAG implementation from the previous patch to improve performance and code density. Related to https://github.com/llvm/llvm-project/issues/136516 and https://github.com/llvm/llvm-project/pull/186338#discussion_r2980420174	2026-03-30 22:10:47 +02:00
Aiden Grossman	9331b5bb77	[DAG] Fix -Wunused-variable A recently introduced local is only used in an assertion which means we get -Wunused-variable in release+noasserts builds. Mark it [[maybe_unused]] rather than inlining the definition given there are multiple uses within the assert.	2026-03-30 17:51:42 +00:00
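The pattern from the commit above looks roughly like the following sketch. `sumChecked` and the bounds in the assertion are invented for illustration; only the use of `[[maybe_unused]]` on a local that is referenced solely inside `assert` reflects the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

int sumChecked(const std::vector<int> &Values) {
  // N is referenced twice, but only inside assert(). With NDEBUG defined
  // the assert expands to nothing, so without the attribute the compiler
  // would report N as an unused variable.
  [[maybe_unused]] const std::size_t N = Values.size();
  assert(N > 0 && N < 1024 && "unexpected element count");

  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}
```

Keeping the named local avoids repeating `Values.size()` for each use inside the assertion, which is the trade-off the commit message mentions.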
Alexis Engelke	bbef10d9f1	[CodeGen][NFC] Compute MaximumLegalStoreInBits just once (#189355 ) Instead of iterating over all value types per basic block, pre-compute the TLI-specific value once when constructing the TLI.	2026-03-30 18:44:18 +02:00
Anshul Nigham	7feb816ed0	[NFC] Removes unused Combiner dependency on TargetPassConfig (#188365 ) This enables NewPM ports since it removes multiple pass dependencies on `TargetPassConfig` which we don't want to port to the NewPM. It looks like no derived classes of Combiner actually use this pointer, and it is also unused in the Combiner class.	2026-03-30 08:58:22 -07:00
Xinlong Chen	aa22fca59a	[DAG] Add initial version of SelectionDAG::computeKnownFPClass (#188790 ) This patch adds an initial skeleton for `SelectionDAG::computeKnownFPClass`. The initial version includes: - DemandedElts wrapper and max depth early-out - `ConstantFPSDNode` and `BUILD_VECTOR` handling - `TargetLowering::computeKnownFPClassForTargetNode` virtual hook for backend extensions Initial test coverage for constant scalars, BUILD_VECTOR, and max depth early-out is added in `AArch64SelectionDAGTest.cpp`. closes #175571	2026-03-30 14:08:44 +00:00
Simon Pilgrim	7382a993b4	[DAG] SimplifyDemandedBits - limit BITCAST -> FGETSIGN fold to custom/legal scalar SimplifyDemandedBits cases (#189363 ) All of the non-i32 zero_extend codepath is unaffected by this Pulled out of the discussion on #189129	2026-03-30 14:02:05 +00:00
Jim Lin	2b41985405	[DAG] Fix incorrect ForSigned handling in computeConstantRange calls (#188889 ) Fix two places where ForSigned was incorrectly passed to computeConstantRange, causing wrong signed/unsigned range computation. In computeConstantRangeIncludingKnownBits (DemandedElts overload), the call omitted ForSigned, so Depth (unsigned) was implicitly converted to bool for the ForSigned parameter. Introduced in a6a66a4e6915. In visitIMINMAX, the call always passed ForSigned=false, even when folding SMAX/SMIN which query signed bounds from the resulting range.	2026-03-30 10:30:19 +00:00
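The first bug in the commit above is an instance of a general C++ pitfall: an `unsigned` argument silently converting to a `bool` parameter. The sketch below reproduces the shape of that mistake with a hypothetical signature; `describeRange`, `buggyCall`, and `fixedCall` are made up for this example and do not mirror the real `computeConstantRange` parameters.

```cpp
#include <string>

// Hypothetical callee: a bool flag followed by a defaulted depth.
std::string describeRange(bool ForSigned, unsigned Depth = 0) {
  return std::string(ForSigned ? "signed" : "unsigned") + "@depth" +
         std::to_string(Depth);
}

// Buggy caller: ForSigned is omitted, so Depth binds to the bool parameter
// (any non-zero depth becomes true) and the real depth is lost.
std::string buggyCall([[maybe_unused]] bool ForSigned, unsigned Depth) {
  return describeRange(Depth);
}

// Fixed caller: both arguments are passed explicitly.
std::string fixedCall(bool ForSigned, unsigned Depth) {
  return describeRange(ForSigned, Depth);
}
```

Here `buggyCall(false, 3)` compiles without error yet yields `"signed@depth0"`, while `fixedCall(false, 3)` yields `"unsigned@depth3"`.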


39449 Commits