llvm-project

Author	SHA1	Message	Date
Ryotaro Kasuga	2330fd2f73	[LoopPeel] Add new option to peeling loops to convert PHI into IV (#121104 ) LoopPeel currently considers PHI nodes that become loop invariants through peeling. However, in some cases, peeling transforms PHI nodes into induction variables (IVs), potentially enabling further optimizations such as loop vectorization. For example: ```c // TSVC s292 int im = N-1; for (int i=0; i<N; i++) { a[i] = b[i] + b[im]; im = i; } ``` In this case, peeling one iteration converts `im` into an IV, allowing it to be handled by the loop vectorizer. This patch adds a new feature to peel loops when to convert PHIs into IVs. At the moment this feature is disabled by default. Enabling it allows to vectorize the above example. I have measured on neoverse-v2 and observed a speedup of more than 60% (options: `-O3 -ffast-math -mcpu=neoverse-v2 -mllvm -enable-peeling-for-iv`). This PR is taken over from #94900 Related #81851	2025-08-20 13:44:56 +00:00
Orlando Cazalet-Hyams	6c9352530a	[RemoveDIs][NFC] Clean up BasicBlockUtils now intrinsics are gone (#154326 ) A couple of minor readability changes now that we're not supporting both intrinsics and records.	2025-08-20 10:03:44 +01:00
Stephen Tozer	5cedb01487	[Debugify] Fix compile error in tracking coverage build Forward-fixes a compile error in bc216b057d (#150212) in specific build configurations, due to a missing const_cast.	2025-08-19 11:18:42 +01:00
David Green	a7df02f83c	[InstCombine] Make strlen optimization more resilient to different gep types. (#153623 ) This makes the optimization in optimizeStringLength for strlen(gep @glob, %x) -> sub endof@glob, %x a little more resilient, and maybe a bit more correct for geps with non-array types.	2025-08-19 10:37:17 +01:00
Andreas Jonson	1b60236200	[SimplifyCFG] Avoid redundant calls in gather. (NFC) (#154133 ) Split out from https://github.com/llvm/llvm-project/pull/154007 as it showed compile time improvements NFC as there needs to be at least two icmps that is part of the chain.	2025-08-18 18:45:52 +02:00
Kazu Hirata	07eb7b7692	[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068 ) This patch replaces SmallSet<T , N> with SmallPtrSet<T , N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType, N> : public SmallPtrSet<PointeeType, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.	2025-08-18 07:01:29 -07:00
Arne Stenkrona	ea2f5395b1	[SimplifyCFG] Avoid threading for loop headers (#151142 ) Updates SimplifyCFG to avoid jump threading through loop headers if -keep-loops is requested. Canonical loop form requires a loop header that dominates all blocks in the loop. If we thread through a header, we risk breaking its domination of the loop. This change avoids this issue by conservatively avoiding threading through headers entirely. Fixes: https://github.com/llvm/llvm-project/issues/151144	2025-08-18 09:46:55 +00:00
Kazu Hirata	cbf5af9668	[llvm] Remove unused includes (NFC) (#154051 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-08-17 23:46:35 -07:00
Andreas Jonson	5ae8a9b8ce	[SimplifyCfg] Handle trunc nuw i1 condition in Equality comparison. (#153051 ) proof: https://alive2.llvm.org/ce/z/WVt4-F	2025-08-17 09:53:40 +02:00
Matt Arsenault	3e5d8a1439	Reapply "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864 ) This reverts commit 334e9bf2dd01fbbfe785624c0de477b725cde6f2. Check if llvm-nm exists before building the benchmark.	2025-08-16 09:53:50 +09:00
gulfemsavrun	334e9bf2dd	Revert "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864 ) …210)" This reverts commit 9a14b1d254a43dc0d4445c3ffa3d393bca007ba3. Revert "RuntimeLibcalls: Return StringRef for libcall names (#153209)" This reverts commit cb1228fbd535b8f9fe78505a15292b0ba23b17de. Revert "TableGen: Emit statically generated hash table for runtime libcalls (#150192)" This reverts commit 769a9058c8d04fc920994f6a5bbb03c8a4fbcd05. Reverted three changes because of a CMake error while building llvm-nm as reported in the following PR: https://github.com/llvm/llvm-project/pull/150192#issuecomment-3192223073	2025-08-15 13:32:27 -07:00
zGoldthorpe	a8d25683ee	[PatternMatch] Allow `m_ConstantInt` to match integer splats (#153692 ) When matching integers, `m_ConstantInt` is a convenient alternative to `m_APInt` for matching unsigned 64-bit integers, allowing one to simplify ```cpp const APInt *IntC; if (match(V, m_APInt(IntC))) { if (IntC->ule(UINT64_MAX)) { uint64_t Int = IntC->getZExtValue(); // ... } } ``` to ```cpp uint64_t Int; if (match(V, m_ConstantInt(Int))) { // ... } ``` However, this simplification is only true if `V` is a scalar type. Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt` does not. This patch ensures that the matching behaviour of `m_ConstantInt` parallels that of `m_APInt`, and also incorporates it in some obvious places.	2025-08-15 10:43:54 -06:00
Stephen Tozer	bc216b057d	[Debugify] Improve reduction of debugify coverage build output (#150212 ) In current DebugLoc coverage builds, the output for any reasonably large build can become very large if any missing DebugLocs are present; this happens because single errors in LLVM may result in many errors being reported in the output report. The main cause of this is that the empty locations attached to instructions may be propagated to other instructions in later passes, which will each be reported as new errors. This patch prevents this by adding an "unknown" annotation to instructions after reporting them once, ensuring that any other DebugLocs copied or derived from the original empty location will not be marked as new errors. As a separate but related change, this patch updates the report generation script to deduplicate results using the recorded stacktrace if they are available, instead of the pass+instruction combination. This reduces the size of the reduction, but makes the reduction highly reliable, as the stacktrace allows us to very precisely identify when two bugs have originated from the same place.	2025-08-15 14:01:04 +01:00
Matt Arsenault	cb1228fbd5	RuntimeLibcalls: Return StringRef for libcall names (#153209 ) Does not yet fully propagate this down into the TargetLowering uses, many of which are relying on null checks on the returned value.	2025-08-15 09:55:39 +09:00
Orlando Cazalet-Hyams	d13341db26	[RemoveDIs][NFC] Remove getAssignmentMarkers (#153214 ) getAssignmentMarkers was for debug intrinsics. getDVRAssignmentMarkers is used for DbgRecords.	2025-08-13 10:56:19 +01:00
Andreas Jonson	1840106ddf	[SCCP] Add support for trunc nuw range. (#152990 ) proof: https://alive2.llvm.org/ce/z/_7PVxq	2025-08-12 13:48:55 +02:00
Nikita Popov	ab323eb0c6	[SCCP][PredicateInfo] Do not predicate argument of lifetime intrinsic Replacing the argument with a no-op bitcast violates a verifier constraint, even if only temporarily. Any replacement based on it would result in a violation even after the copy has been removed. Fixes https://github.com/llvm/llvm-project/issues/153013.	2025-08-12 12:56:08 +02:00
Sam Tebbs	0bfa1718af	[LV] Create in-loop sub reductions (#147026 ) This PR allows the loop vectorizer to handle in-loop sub reductions by forming a normal in-loop add reduction with a negated input. Stacked PRs: 1. -> https://github.com/llvm/llvm-project/pull/147026 2. https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/147302 4. https://github.com/llvm/llvm-project/pull/147513	2025-08-12 10:22:41 +01:00
Andreas Jonson	330a589450	[PredicateInfo] Handle trunc nuw i1 condition. (#152988 ) proof: https://alive2.llvm.org/ce/z/mxtn4L	2025-08-11 13:00:54 +02:00
hanbeom	a750fcb52b	[GVN] Check IndirectBr in Predecessor Terminators (#151188 ) Critical edges with an IndirectBr terminator cannot be split. Add a check it to prevent assertion failures. Fixes: #150229	2025-08-11 09:25:52 +02:00
Nikita Popov	35bad229c1	[PredicateInfo] Use bitcast instead of ssa.copy (#151174 ) PredicateInfo needs some no-op to which the predicate can be attached. Currently this is an ssa.copy intrinsic. This PR replaces it with a no-op bitcast. Using a bitcast is more efficient because we don't have the overhead of an overloaded intrinsic. It also makes things slightly simpler overall.	2025-08-11 09:25:01 +02:00
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248 ) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Matt Arsenault	1110e2ff9f	InlineFunction: Split inlining into predicate and apply functions (#134213 ) This is to support a new inline function reduction in llvm-reduce, which should pre-filter callsites that are not eligible for inlining. This code was mostly structured as a match and apply, with a few exceptions. The ugliest piece is for propagating and verifying compatible getGC and personalities. Also collection of EHPad and the convergence token to use are now cached in InlineFunctionInfo. I was initially confused by the split between the checks performed here and isInlineViable, so better document how this system is supposed to work. It turns out this split does make sense, in that isInlineViable checks if it's possible based on the callee content and the ultimate inline depended on the callsite context. I think more renames of these functions would help, and isInlineViable should probably move out of InlineCost to be with these transfoms.	2025-08-07 16:13:36 +09:00
Mircea Trofin	f675483905	[profcheck] Annotate `select` instructions (#152171 ) For `select`, we don't have the equivalent of the branch probability analysis to offer defaults, so we make up our own and allow their overriding with flags. Issue #147390	2025-08-06 02:48:50 +02:00
Kazu Hirata	908ef45606	[Utils] Fix a warning This patch fixes: llvm/lib/Transforms/Utils/SplitModuleByCategory.cpp:321:14: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]	2025-08-05 07:24:10 -07:00
Maksim Sabianin	3f59a22711	[offload][SYCL] Add Module splitting by categories. (#131347 ) This patch adds Module splitting by categories. The splitting algorithm is the necessary step in the SYCL compilation pipeline. Also it could be reused for other heterogenous targets. The previous attempt was at #119713. In this patch there is no dependency in `TransformUtils` on "IPO" and on "Printing Passes". In this patch a module splitting is self-contained and it doesn't introduce linking issues.	2025-08-05 14:04:59 +00:00
Kazu Hirata	35dd88918f	[llvm] Use llvm::iterator_range::empty (NFC) (#151905 )	2025-08-04 07:40:46 -07:00
Andreas Jonson	c6fd3d32c3	[SimplifyCfg] Add nneg to zext for switch to table conversion (#147180 )	2025-08-04 16:18:05 +02:00
Nikita Popov	e833bb0991	[Local] Do not pass Root to replaceDominatedUsesWith (NFC) Capture it in the lambdas instead.	2025-08-04 14:22:17 +02:00
Nikita Popov	86727fe9a1	[IR] Allow poison argument to lifetime markers (#151148 ) This slightly relaxes the invariant established in #149310, by also allowing the lifetime argument to be poison. This is to support the typical pattern of RAUWing with poison when removing an instruction. It's worth noting that this does not require any conservative assumptions, lifetimes with poison arguments can simply be skipped. Fixes https://github.com/llvm/llvm-project/issues/151119.	2025-08-04 10:02:04 +02:00
Mircea Trofin	9a60841dc4	[PGO][profcheck] ignore explicitly cold functions (#151778 ) There is a case when branch profile metadata is OK to miss, namely, cold functions. The goal of the RFC (see the referenced issue) is to avoid accidental omission (and, at a later date, corruption) of profile metadata. However, asking cold functions to have all their conditional branches marked with "0" probabilities would be overdoing it. We can just ask cold functions to have an explicit 0 entry count. This patch: - injects an entry count for functions, unless they have one (synthetic or not) - if the entry count is 0, doesn't inject, nor does it verify the rest of the metadata - at verification, if the entry count is missing, it reports an error Issue #147390	2025-08-04 03:53:49 +02:00
Joel E. Denny	37e03b56b8	Revert "[PGO] Add `llvm.loop.estimated_trip_count` metadata" (#151585 ) Reverts llvm/llvm-project#148758 [As requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)	2025-07-31 15:56:31 -04:00
Joel E. Denny	a85c725952	Revert "[Utils] Fix a warning" This reverts commit 3a18fe33f0763cd9276c99c276448412100f6270. So that we can revert PR #148758.	2025-07-31 15:54:01 -04:00
Kazu Hirata	3a18fe33f0	[Utils] Fix a warning This patch fixes: llvm/lib/Transforms/Utils/LoopUtils.cpp:818:28: error: unused function 'operator<<' [-Werror,-Wunused-function]	2025-07-31 11:24:33 -07:00
Joel E. Denny	f7b65011de	[PGO] Add `llvm.loop.estimated_trip_count` metadata (#148758 ) This patch implements the `llvm.loop.estimated_trip_count` metadata discussed in [[RFC] Fix Loop Transformations to Preserve Block Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785). As [suggested in the RFC comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4), it adds the new metadata to all loops at the time of profile ingestion and estimates each trip count from the loop's `branch_weights` metadata. As [suggested in the PR #128785 review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036), it does so via a new `PGOEstimateTripCountsPass` pass, which creates the new metadata for each loop but omits the value if it cannot estimate a trip count due to the loop's form. An important observation not previously discussed is that `PGOEstimateTripCountsPass` often cannot estimate a loop's trip count, but later passes can sometimes transform the loop in a way that makes it possible. Currently, such passes do not necessarily update the metadata, but eventually that should be fixed. Until then, if the new metadata has no value, `llvm::getLoopEstimatedTripCount` disregards it and tries again to estimate the trip count from the loop's current `branch_weights` metadata.	2025-07-31 12:28:25 -04:00
Florian Hahn	99d70e09a9	[SCEV] Allow adds of constants in tryToReuseLCSSAPhi. (#150693 ) Update the logic added in https://github.com/llvm/llvm-project/pull/147824 to also allow adds of constants. There are a number of cases where this can help remove redundant phis and replace some computation with a ptrtoint (which likely is free in the backend). PR: https://github.com/llvm/llvm-project/pull/150693	2025-07-31 16:33:25 +01:00
LU-JOHN	a757f23404	[SimplifyCFG] Extend jump-threading to allow live local defs (#135079 ) Extend jump-threading to allow local defs that are live outside of the threaded block. Allow threading to destinations where the local defs are not live. --------- Signed-off-by: John Lu <John.Lu@amd.com>	2025-07-31 09:44:14 -04:00
Nikita Popov	fa6965f722	[SCCP] Extract PredicateInfo handling into separate method (NFC)	2025-07-29 16:36:33 +02:00
Ellis Hoag	819f020b28	Use F.hasOptSize() instead of checking optsize directly (#147348 )	2025-07-28 08:38:52 -07:00
Florian Hahn	f9f68af4b8	[SCEV] Make sure LCSSA is preserved when re-using phi if needed. If we insert a new add instruction, it may introduce a new use outside the loop that contains the phi node we re-use. Use fixupLCSSAFormFor to fix LCSSA form, if needed. This fixes a crash reported in https://github.com/llvm/llvm-project/pull/147824#issuecomment-3124670997.	2025-07-28 16:24:46 +01:00
Florian Hahn	e21ee41be4	[SCEV] Try to re-use pointer LCSSA phis when expanding SCEVs. (#147824 ) Generalize the code added in https://github.com/llvm/llvm-project/pull/147214 to also support re-using pointer LCSSA phis when expanding SCEVs with AddRecs. A common source of integer AddRecs with pointer bases are runtime checks emitted by LV based on the distance between 2 pointer AddRecs. This improves codegen in some cases when vectorizing and prevents regressions with https://github.com/llvm/llvm-project/pull/142309, which turns some phis into single-entry ones, which SCEV will look through now (and expand the whole AddRec), whereas before it would have to treat the LCSSA phi as SCEVUnknown. Compile-time impact neutral: https://llvm-compile-time-tracker.com/compare.php?from=fd5fc76c91538871771be2c3be2ca3a5f2dcac31&to=ca5fc2b3d8e6efc09f1624a17fdbfbe909f14eb4&stat=instructions:u PR: https://github.com/llvm/llvm-project/pull/147824	2025-07-25 15:29:40 +01:00
Kazu Hirata	3e53d4d386	[llvm] Remove unused includes (NFC) (#150265 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-07-23 15:18:46 -07:00
Mircea Trofin	df2d2d125b	[PGO] Add ProfileInjector and ProfileVerifier passes (#147388 ) Adding 2 passes, one to inject `MD_prof` and one to check its presence. A subsequent patch will add these (similar to debugify) to `opt` (and, eventually, a variant of this, to `llc`) Tracking issue: #147390	2025-07-23 21:34:58 +02:00
Nikita Popov	bdd638a897	[Local] Remove handling for lifetime intrinsic on non-alloca (NFC) After #149310 this is guaranteed to be an alloca.	2025-07-23 14:21:22 +02:00
Nikita Popov	b59aaf7da7	[Sanitizers] Remove handling for lifetimes on non-alloca insts (NFC) (#149994 ) After #149310 the pointer argument of lifetime.start/lifetime.end is guaranteed to be an alloca, so we don't need to go through findAllocaForValue() anymore, and don't have to have special handling for the case where it fails.	2025-07-23 09:48:32 +02:00
Nikita Popov	307256ecbd	[GVNSink] Do not sink lifetimes of different allocas (#149818 ) This was always undesirable, and after #149310 it is illegal and will result in a verifier error. Fix this by moving SimplifyCFG's check for this into canReplaceOperandWithVariable(), so it's shared with GVNSink.	2025-07-22 09:44:03 +02:00
Jeremy Morse	c9ceb9b75f	[DebugInfo] Remove intrinsic-flavours of findDbgUsers (#149816 ) This is one of the final remaining debug-intrinsic specific codepaths out there, and pieces of cross-LLVM infrastructure to do with debug intrinsics.	2025-07-21 17:49:25 +01:00
Yingwei Zheng	9e587ce6f0	[SCCP] Simplify [us]cmp(X, Y) into X - Y (#144717 ) If the difference between [us]cmp's operands is not greater than 1, we can simplify it into `X - Y`. Alive2: https://alive2.llvm.org/ce/z/JS55so llvm-opt-benchmark diff: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2464/files	2025-07-20 15:01:44 +08:00
Prabhu Rajasekaran	921c6dbeca	[llvm] Introduce callee_type metadata Introduce `callee_type` metadata which will be attached to the indirect call instructions. The `callee_type` metadata will be used to generate `.callgraph` section described in this RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html Reviewers: morehouse, petrhosek, nikic, ilovepi Reviewed By: nikic, ilovepi Pull Request: https://github.com/llvm/llvm-project/pull/87573	2025-07-18 14:40:54 -07:00
Florian Hahn	004c67ea25	[LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239 ) Update LV to vectorize maxnum/minnum reductions without fast-math flags, by adding an extra check in the loop if any inputs to maxnum/minnum are NaN, due to maxnum/minnum behavior w.r.t to signaling NaNs. Signed-zeros are already handled consistently by maxnum/minnum. If any input is NaN, exit the vector loop, compute the reduction result up to the vector iteration that contained NaN inputs and * resume in the scalar loop New recurrence kinds are added for reductions using maxnum/minnum without fast-math flags. PR: https://github.com/llvm/llvm-project/pull/148239	2025-07-18 21:58:19 +01:00

1 2 3 4 5 ...

7991 Commits