llvm-project

Author	SHA1	Message	Date
Florian Hahn	f8734a5e10	[SCEV] Introduce SCEVUse, use it instead of const SCEV * (NFCI). (#91961 ) This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to https://github.com/llvm/llvm-project/pull/90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=ee34eb6edccdebc2a752ffecdde5faae6b0d5593&to=5a7727d7819414d2acbc5b6ab740f0fc2363e842&stat=instructions%3Au	2026-03-13 16:23:06 +00:00
Alexis Engelke	94da4039cb	[Analysis][NFC] Drop use of BranchInst (#186374 ) Largely straight-forward replacement.	2026-03-13 13:42:19 +00:00
Florian Hahn	e8908215de	[LSR] Support SCEVPtrToAddr in SCEVDbgValueBuilder. Allow SCEVPtrToAddr as cast in assertion in SCEVDbgValueBuilder. SCEVPtrToAddr is handled similarly to SCEVPtrToInt. Fixes a crash with debug info after bd40d1de9c9ee, which started to generate ptrtoaddr instead of ptrtoint expressions.	2026-02-07 14:02:45 +00:00
Austin Jiang	e6cdfb75ac	Fix typos and spelling errors across codebase (#156270 ) Corrected various spelling mistakes such as 'occurred', 'receiver', 'initialized', 'length', and others in comments, variable names, function names, and documentation throughout the project. These changes improve code readability and maintain consistency in naming and documentation. Co-authored-by: Louis Dionne <ldionne.2@gmail.com>	2026-01-13 11:52:46 -05:00
Rahul Joshi	7d96b39c4f	[NFC][LLVM] Adopt ListSeparator/interleaved in more places (#172909 ) Adopt `ListSeparator` and `interleaved` in various places instead of manual code to print separators between loop iterations.	2026-01-12 12:18:01 -08:00
Nikita Popov	8fd85ba9e6	[LLVM] Temporarily allow implicit truncation in some places Split out from https://github.com/llvm/llvm-project/pull/171456. This explicitly allows implicit truncation in a number of places, prior to switching the default. This limits the scope of the initial change.	2026-01-05 09:52:57 +01:00
Ramkumar Ramachandra	85fafd5db0	[SCEVExp] Get DL from SE, strip constructor arg (NFC) (#171823 )	2025-12-11 14:26:47 +00:00
Nikita Popov	6960b633ee	[LSR] Use getSigned() for negated immediate	2025-12-09 16:19:36 +01:00
John Brawn	ccd4e7b1ed	[LSR] Make OptimizeLoopTermCond able to handle some non-cmp conditions (#165590 ) Currently OptimizeLoopTermCond can only convert a cmp instruction to using a postincrement induction variable, which means it can't handle predicated loops where the termination condition comes from get_active_lane_mask. Relax this restriction so that we can handle any kind of instruction, though only if it's the instruction immediately before the branch (except for possibly an extractelement).	2025-12-03 15:28:46 +00:00
John Brawn	2ad71745cd	[LSR] Insert the transformed IV increment in the user block (#169515 ) Currently we try to hoist the transformed IV increment instruction to the header block to help with generation of postincrement instructions, but this only works if the user instruction is also in the header. We should instead be trying to insert it in the same block as the user.	2025-12-02 17:15:00 +00:00
John Brawn	53e7443e0c	[LSR] Don't count conditional loads/store as enabling pre/post-index (#159573 ) When a load/store is conditionally executed in a loop it isn't a candidate for pre/post-index addressing, as the increment of the address would only happen on those loop iterations where the load/store is executed. Detect this and only discount the AddRec cost when the load/store is unconditional.	2025-10-30 13:53:15 +00:00
John Brawn	8fab81121e	[LSR] Add an addressing mode that considers all addressing modes (#158110 ) The way that loop strength reduction works is that the target has to upfront decide whether it wants its addressing to be preindex, postindex, or neither. This choice affects: * Which potential solutions we generate * Whether we consider a pre/post index load/store as costing an AddRec or not. None of these choices are a good fit for either AArch64 or ARM, where both preindex and postindex addressing are typically free: * If we pick None then we count pre/post index addressing as costing one addrec more than is correct so we don't pick them when we should. * If we pick PreIndexed or PostIndexed then we get the correct cost for that addressing type, but still get it wrong for the other and also exclude potential solutions using offset addressing that could have less cost. This patch adds an "all" addressing mode that causes all potential solutions to be generated and counts both pre and postindex as having AddRecCost of zero. Unfortunately this reveals problems elsewhere in how we calculate the cost of things that need to be fixed before we can make use of it.	2025-09-16 11:46:54 +01:00
Kazu Hirata	8b8b0f197f	[Scalar] Remove an unnecessary cast (NFC) (#150474 ) getOperand() already returns Value *.	2025-07-24 15:50:00 -07:00
Nikita Popov	5f531827a4	[LSR] Do not consider uses in lifetime intrinsics (#149492 ) We should ignore uses of pointers in lifetime intrinsics, as these are not actually materialized in the final code, so don't affect register pressure or anything else LSR needs to model. Handling these only results in peculiar rewrites where additional intermediate GEPs are introduced.	2025-07-18 16:13:00 +02:00
Jeremy Morse	c9d8b68676	[DebugInfo] Suppress lots of users of DbgValueInst (#149476 ) This is another prune of dead code -- we never generate debug intrinsics nowadays, therefore there's no need for these codepaths to run. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2025-07-18 11:31:52 +01:00
Jeremy Morse	57a5f9c47e	[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383 ) There are no longer debug-info instructions, thus we don't need this skipping. Hooray!	2025-07-15 15:34:10 +01:00
John Brawn	f8c2c4f161	[LSR] Account for hardware loop instructions (#147958 ) A hardware loop instruction combines a subtract, compare with zero, and branch. We currently account for the compare and branch being combined into one in Cost::RateFormula, as part of more general handling for compare-branch-zero, but don't account for the subtract, leading to suboptimal decisions in some cases. Fix this in Cost::RateRegister by noticing when we have such a subtract and discounting the AddRecCost in such a case.	2025-07-14 16:48:54 +01:00
Shan Huang	089106fdfb	[DebugInfo][LoopStrengthReduce] Salvage the debug value of the dead cmp instruction (#147241 ) Fix #147238	2025-07-14 09:45:37 +08:00
Ramkumar Ramachandra	b7059ebafe	[LSR] Strip dead code (NFC) (#146109 ) Nested AddRec is already rejected by the handling in pushSCEV().	2025-07-03 13:37:08 +01:00
Ramkumar Ramachandra	04cd0f2702	[LSR] Clean up code using SCEVPatternMatch (NFC) (#145556 )	2025-06-28 11:41:53 +01:00
Jeremy Morse	9eb0020555	[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389 ) Seeing how we can't generate any debug intrinsics any more: delete a variety of codepaths where they're handled. For the most part these are plain deletions, in others I've tweaked comments to remain coherent, or added a type to (what was) type-generic-lambdas. This isn't all the DbgInfoIntrinsic call sites but it's most of the simple scenarios. Co-authored-by: Nikita Popov <github@npopov.com>	2025-06-17 15:55:14 +01:00
John Brawn	a54712c8ec	[LSR] Make canHoistIVInc allow non-integer types (#143707 ) canHoistIVInc was made to only allow integer types to avoid a crash in isIndexedLoadLegal/isIndexedStoreLegal due to them failing an assertion in getValueType (or rather in MVT::getVT which gets called from that) when passed a struct type. Adjusting these functions to pass AllowUnknown=true to getValueType means we don't get an assertion failure (MVT::Other is returned which TLI->isIndexedLoadLegal should then return false for), meaning we can remove this check for integer type.	2025-06-16 15:23:40 +01:00
Kazu Hirata	f3867f900f	[llvm] Use *Map::try_emplace (NFC) (#143321 ) - try_emplace(Key) is shorter than insert(std::make_pair(Key, 0)). - try_emplace performs value initialization without value parameters. - We overwrite values on successful insertion anyway.	2025-06-08 16:18:46 -07:00
Kazu Hirata	89308de4b0	[llvm] Value-initialize values with *Map::try_emplace (NFC) (#141522 ) try_emplace value-initializes values, so we do not need to pass nullptr to try_emplace when the value types are raw pointers or std::unique_ptr<T>.	2025-05-26 15:13:02 -07:00
Florian Hahn	bc0c4db5d9	[SCEV] Add dedicated AffineAddRec matcher + loop matchers (NFC). (#141141 ) Add dedicated m_scev_AffineAddRec matcher with complementing m_Loop() and m_SpecificLoop matchers. PR: https://github.com/llvm/llvm-project/pull/141141	2025-05-25 08:40:31 +01:00
Kazu Hirata	0ef8ef66cc	[Transforms] Remove unused includes (NFC) (#141357 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-05-24 09:37:43 -07:00
Kazu Hirata	fe6290ef5b	[llvm] Use *Map::try_emplace (NFC) (#140843 ) try_emplace can default-construct values, so we do not need to do so on our own. Plus, try_emplace(Key) is much shorter than insert(std::make_pair(Key, Value()).	2025-05-21 01:11:01 -07:00
Ramkumar Ramachandra	61d3ad963c	[SCEVPatternMatch] Introduce m_scev_AffineAddRec (#140377 ) Introduce m_scev_AffineAddRec to match affine AddRecs, a class_match for SCEVConstant, and demonstrate their utility in LSR and SCEV. While at it, rename m_Specific to m_scev_Specific for clarity.	2025-05-19 12:02:07 +01:00
Jon Chesterfield	9c60431b67	[NFC] Add a specialization of DenseMapInfo for SmallVector (#140380 ) Equivalent to the three existing uses I found which were all pointers. Implementing the general pattern so SmallVector<int> etc will work as well. Added to the SmallVector.h header as opposed to DenseMapInfo.h following the StringRef.h and SmallBitVector.h prior art. Noticed while writing an unrelated patch which currently wants a map from small vectors to other things and cleaner to generalise than add another specialisation to said patch.	2025-05-17 19:13:30 +01:00
Sergei Barannikov	cedeef6707	[LSR] Replace casts with an equivalent std::as_const (NFC) (#138980 ) The casts / `std::as_const` are used here to select `const` overload of `begin()`/`end()` so that the type of the returned iterator matches the type of `J`, which is `const_iterator`.	2025-05-08 13:36:37 +03:00
David Green	98b6f8dc69	[CostModel] Remove optional from InstructionCost::getValue() (#135596 ) InstructionCost is already an optional value, containing an Invalid state that can be checked with isValid(). There is little point in returning another optional from getValue(). Most uses do not make use of it being a std::optional, dereferencing the value directly (either isValid has been checked previously or the Cost is assumed to be valid). The one case that does in AMDGPU used value_or which has been replaced by a isValid() check.	2025-04-23 07:46:27 +01:00
Kazu Hirata	b01e25deba	[llvm] Call hash_combine_range with ranges (NFC) (#136511 )	2025-04-20 16:36:03 -07:00
Kazu Hirata	0dcc201ac4	[Transforms] Use *Set::insert_range (NFC) (#132056 ) DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces: Dest.insert(Src.begin(), Src.end()); with: Dest.insert_range(Src); This patch does not touch custom begin like succ_begin for now.	2025-03-19 15:35:01 -07:00
Jeremy Morse	34b139594a	[NFC][DebugInfo] Switch more call-sites to using iterator-insertion (#124283 ) To finalise the "RemoveDIs" work removing debug intrinsics, we're updating call sites that insert instructions to use iterators instead. This set of changes are those where it's not immediately obvious that just calling getIterator to fetch an iterator is correct, and one or two places where more than one line needs to change. Overall the same rule holds though: iterators generated for the start of a block such as getFirstNonPHIIt need to be passed into insert/move methods without being unwrapped/rewrapped, everything else can use getIterator.	2025-01-27 16:44:14 +00:00
Jeremy Morse	e14962a39c	[NFC][DebugInfo] Use iterators for instruction insertion in more places (#124291 ) As part of the "RemoveDIs" work to eliminate debug intrinsics, we're replacing methods that use Instruction*'s as positions with iterators. This patch changes some more complex call-sites, those crossing file boundaries and where I've had to perform some minor rewrites.	2025-01-27 15:25:17 +00:00
Jeremy Morse	8e70273509	[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583 ) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety however, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).	2025-01-24 10:53:11 +00:00
Piotr Fusik	1a44a53cd5	[LSR][NFC] Use range-based `for` (#113889 )	2024-11-05 07:11:23 +01:00
Kazu Hirata	94f9cbbe49	[Scalar] Remove unused includes (NFC) (#114645 ) Identified with misc-include-cleaner.	2024-11-02 08:32:26 -07:00
Youngsuk Kim	caa32e6d6f	[llvm][LSR] Fix where invariant on ScaledReg & Scale is violated (#112576 ) Comments attached to the `ScaledReg` field of `struct Formula` explain that `ScaledReg` must be non-null when `Scale` is non-zero. This fixes up a code path where this invariant is violated. Also, add an assert to ensure this invariant holds true. Without this patch, the compiler aborts on the attached test case. Fixes #76504	2024-10-17 10:47:44 -04:00
Orlando Cazalet-Hyams	7506872afc	[DebugInfo][LSR] Fix assertion failure salvaging IV with offset > 64 bits wide (#110979 ) Fixes #110494	2024-10-03 11:47:08 +01:00
Mehdi Amini	6c7a3f80e7	Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if instead of #ifdef (#110938 ) This macro is always defined: either 0 or 1. The correct pattern is to use #if. Re-apply #110185 with more fixes for debug build with the ABI breaking checks disabled.	2024-10-03 01:24:14 +02:00
Sergey Kachkov	1f2a634c44	Reland "[LSR] Do not create duplicated PHI nodes while preserving LCSSA form" (#107380 ) Motivating example: https://godbolt.org/z/eb97zrxhx Here we have 2 induction variables in the loop: one is corresponding to i variable (add rdx, 4), the other - to res (add rax, 2). The second induction variable can be removed by rewriteLoopExitValues() method (final value of res at loop exit is unroll_iter * -2); however, this doesn't happen because we have duplicated LCSSA phi nodes at loop exit: ``` ; Preheader: for.body.preheader.new: ; preds = %for.body.preheader %unroll_iter = and i64 %N, -4 br label %for.body ; Loop: for.body: ; preds = %for.body, %for.body.preheader.new %lsr.iv = phi i64 [ %lsr.iv.next, %for.body ], [ 0, %for.body.preheader.new ] %i.07 = phi i64 [ 0, %for.body.preheader.new ], [ %inc.3, %for.body ] %inc.3 = add nuw i64 %i.07, 4 %lsr.iv.next = add nsw i64 %lsr.iv, -2 %niter.ncmp.3.not = icmp eq i64 %unroll_iter, %inc.3 br i1 %niter.ncmp.3.not, label %for.end.loopexit.unr-lcssa.loopexit, label %for.body, !llvm.loop !7 ; Exit blocks for.end.loopexit.unr-lcssa.loopexit: ; preds = %for.body %inc.3.lcssa = phi i64 [ %inc.3, %for.body ] %lsr.iv.next.lcssa11 = phi i64 [ %lsr.iv.next, %for.body ] %lsr.iv.next.lcssa = phi i64 [ %lsr.iv.next, %for.body ] br label %for.end.loopexit.unr-lcssa ``` rewriteLoopExitValues requires %lsr.iv.next value to have only 2 uses: one in LCSSA phi node, the other - in induction phi node. Here we have 3 uses of this value because of duplicated lcssa nodes, so the transform doesn't apply and leads to an extra add operation inside the loop. The proposed solution is to accumulate inserted instructions that will require LCSSA form update into SetVector and then call formLCSSAForInstructions for this SetVector once, so the same instructions don't process twice. Reland fixes the issue with preserve-lcssa.ll test: it fails in the situation when x86_64-unknown-linux-gnu target is unavailable in opt. The changes are moved into separate duplicated-phis.ll test with explicit x86 target requirement to fix bots which are not building this target.	2024-09-09 16:14:51 +03:00
dyung	2bf551e600	Revert "[LSR] Do not create duplicated PHI nodes while preserving LCSSA form" (#107666 ) Reverts llvm/llvm-project#107380 Change is causing the test preserve-lcssa.ll to fail on at least 2 build bots: - https://lab.llvm.org/buildbot/#/builders/190/builds/5231 - https://lab.llvm.org/buildbot/#/builders/161/builds/1855	2024-09-06 19:54:26 -07:00
Sergey Kachkov	2cb4d1b1bd	[LSR] Do not create duplicated PHI nodes while preserving LCSSA form (#107380 ) Motivating example: https://godbolt.org/z/eb97zrxhx Here we have 2 induction variables in the loop: one is corresponding to i variable (add rdx, 4), the other - to res (add rax, 2). The second induction variable can be removed by rewriteLoopExitValues() method (final value of res at loop exit is unroll_iter * -2); however, this doesn't happen because we have duplicated LCSSA phi nodes at loop exit: ``` ; Preheader: for.body.preheader.new: ; preds = %for.body.preheader %unroll_iter = and i64 %N, -4 br label %for.body ; Loop: for.body: ; preds = %for.body, %for.body.preheader.new %lsr.iv = phi i64 [ %lsr.iv.next, %for.body ], [ 0, %for.body.preheader.new ] %i.07 = phi i64 [ 0, %for.body.preheader.new ], [ %inc.3, %for.body ] %inc.3 = add nuw i64 %i.07, 4 %lsr.iv.next = add nsw i64 %lsr.iv, -2 %niter.ncmp.3.not = icmp eq i64 %unroll_iter, %inc.3 br i1 %niter.ncmp.3.not, label %for.end.loopexit.unr-lcssa.loopexit, label %for.body, !llvm.loop !7 ; Exit blocks for.end.loopexit.unr-lcssa.loopexit: ; preds = %for.body %inc.3.lcssa = phi i64 [ %inc.3, %for.body ] %lsr.iv.next.lcssa11 = phi i64 [ %lsr.iv.next, %for.body ] %lsr.iv.next.lcssa = phi i64 [ %lsr.iv.next, %for.body ] br label %for.end.loopexit.unr-lcssa ``` rewriteLoopExitValues requires %lsr.iv.next value to have only 2 uses: one in LCSSA phi node, the other - in induction phi node. Here we have 3 uses of this value because of duplicated lcssa nodes, so the transform doesn't apply and leads to an extra add operation inside the loop. The proposed solution is to accumulate inserted instructions that will require LCSSA form update into SetVector and then call formLCSSAForInstructions for this SetVector once, so the same instructions don't process twice.	2024-09-06 18:39:47 +03:00
Nikita Popov	7660981402	[LSR] Use computeConstantDifference() This API is faster than getMinusSCEV() and a SCEVConstant cast.	2024-08-28 12:20:59 +02:00
Philip Reames	27a62ec72a	[LSR] Split the -lsr-term-fold transformation into its own pass (#104234 ) This transformation doesn't actually use any of the internal state of LSR and recomputes all information from SCEV. Splitting it out makes it easier to test. Note that long term I would like to write a version of this transform which is integrated with LSR's solver, but if that happens, we'll just delete the extra pass. Integration wise, I switched from using TTI to using a pass configuration variable. This seems slightly more idiomatic, and means we don't run the extra logic on any target other than RISCV.	2024-08-17 18:34:23 -07:00
Benjamin Maxwell	7fad04e94b	[LSR] Fix matching vscale immediates (#100080 ) Somewhat confusingly a `SCEVMulExpr` is a `SCEVNAryExpr`, so can have > 2 operands. Previously, the vscale immediate matching did not check the number of operands of the `SCEVMulExpr`, so would ignore any operands after the first two. This led to incorrect codegen (and results) for ArmSME in IREE (https://github.com/iree-org/iree), which sometimes addresses things that are a `vscale * vscale` multiple away. The test added with this change shows an example reduced from IREE. The second write should be offset from the first `16 * vscale * vscale` (* 4 bytes), however, previously LSR dropped the second vscale and instead offset the write by `#4, mul vl`, which is an offset of `16 * vscale` (* 4 bytes).	2024-07-24 10:06:34 +01:00
Shan Huang	d83d09facd	[DebugInfo][LoopStrengthReduce] Fix missing debug location updates (#97519 ) Fix #97510 . Note that, for the new phi instruction `NewPH`, which replaces the old phi `PH` and the cast `ShadowUse`, I choose to propagate the debug location of `PH` to it, because the cast is eliminated according to the optimization semantics.	2024-07-15 09:44:18 +08:00
Kazu Hirata	2f55e55101	[Transforms] Use range-based for loops (NFC) (#98725 )	2024-07-14 13:44:50 -07:00
Graham Hunter	4311b14e9c	[LSR] Recognize vscale-relative immediates (#88124 ) Extends LoopStrengthReduce to recognize immediates multiplied by vscale, and query the current target for whether they are legal offsets for memory operations or adds.	2024-07-01 09:23:31 +01:00
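
Several of the NFC commits listed above (#143321, #141522, #140843) replace `insert(std::make_pair(Key, 0))` with `try_emplace(Key)`. The sketch below illustrates why the two forms are interchangeable. It is only an approximation of those changes: `std::unordered_map` stands in for LLVM's `DenseMap`, and the helpers `countOld` and `countNew` are hypothetical names that do not appear in the LLVM tree.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Assumption: std::unordered_map is used here as a stand-in for llvm::DenseMap.
using CountMap = std::unordered_map<std::string, int>;

// Hypothetical helper showing the pre-cleanup form: the zero has to be
// spelled out when building the pair to insert.
int countOld(CountMap &Counts, const std::string &Key) {
  auto Result = Counts.insert(std::make_pair(Key, 0));
  // Result.first points at the new or already existing entry.
  return ++Result.first->second;
}

// Hypothetical helper showing the post-cleanup form: try_emplace(Key)
// value-initializes the mapped int to 0 when the key is absent and leaves
// an existing entry untouched.
int countNew(CountMap &Counts, const std::string &Key) {
  auto Result = Counts.try_emplace(Key);
  return ++Result.first->second;
}
```

Calling either helper repeatedly with the same key yields 1, 2, 3, and so on, so the shorter `try_emplace` form preserves behavior in this sketch.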


1022 Commits