Simplify initial VPlan construction by not creating a separate
vector.latch block, which isn't needed and will get folded away later.
This has been suggested as independent clean-up multiple times.
Previously the inliner always produced a memcpy with alignment 1 for the source and destination, leading to potentially suboptimal codegen.
Since the Src pointer alignment is only available through the CallBase, it has to be passed to HandleByValArgumentInit. The Dst alignment is already known, so it doesn't have to be passed along.
If there is no specified Src alignment, my changes cause the pointer to have no align data attached instead of align 1 as before (see inline-tail.ll). I believe this is fine, but since I'm a first-time contributor, please confirm.
My changes are already covered by 4 existing regression tests, so I did
not add any additional ones.
The example from #45778 now results in:
```llvm
; opt -S -passes=inline,instcombine,sroa,instcombine test.ll
define dso_local i32 @test(ptr %t) {
entry:
%.sroa.0.0.copyload = load ptr, ptr %t, align 8 ; this used to be align 1 in the original issue
%arrayidx.i = getelementptr inbounds nuw i8, ptr %.sroa.0.0.copyload, i64 24
%0 = load i32, ptr %arrayidx.i, align 4
ret i32 %0
}
```
Fixes #45778.
If we use `CodeExtractor` to extract block1 into a new function,
```
define void @foo() !dbg !2 {
entry:
%1 = alloca i32, i64 1, align 4
%2 = alloca i32, i64 1, align 4
#dbg_declare(ptr %1, !8, !DIExpression(), !1)
br label %block1
block1:
store i32 1, ptr %1, align 4
store i32 2, ptr %2, align 4
#dbg_declare(ptr %2, !10, !DIExpression(), !1)
ret void
}
```
it will look like the extracted function shown below (with some irrelevant details removed).
```
define internal void @extracted(ptr %arg0, ptr %arg1) {
newFuncRoot:
br label %block1
block1:
store i32 1, ptr %arg0, align 4
store i32 2, ptr %arg1, align 4
ret void
}
```
You will notice that it has replaced the uses of values that lived in the parent function (%1 and %2) with the arguments of the new function. But it did not do the same for the `#dbg_declare`, which was simply dropped because its location pointed to a value outside of the new function. Similarly, %arg0 is left without any debug record, although the value it replaced had one and we could materialize one for it based on that.
This is not just a theoretical limitation. `CodeExtractor` is used to create the functions that implement many of the `OpenMP` constructs in `OMPIRBuilder`. As a result of these limitations, debug information is missing from the created functions.
This PR tries to address this problem. It iterates over the inputs of the extracted function and looks at their debug uses. If a debug use is present in the new function, its location is updated to point at the replacement value; otherwise a similar use is materialized in the new function.
Most of these changes are localized in `fixupDebugInfoPostExtraction`. The only other change is propagating the function inputs and their replacement values to it.
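For illustration, a rough sketch of what the extracted function could look like with this change; the exact metadata numbering and record placement are made up:
```llvm
define internal void @extracted(ptr %arg0, ptr %arg1) {
newFuncRoot:
br label %block1
block1:
#dbg_declare(ptr %arg0, !8, !DIExpression(), !1)
store i32 1, ptr %arg0, align 4
store i32 2, ptr %arg1, align 4
#dbg_declare(ptr %arg1, !10, !DIExpression(), !1)
ret void
}
```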
---------
Co-authored-by: Tim Gymnich <tim@gymni.ch>
Co-authored-by: Michael Kruse <llvm-project@meinersbur.de>
Optimize `(or disjoint (zext/sext a), (zext/sext b))` to `(zext/sext (or disjoint a, b))` without losing the disjoint flag.
Confirmed here: https://alive2.llvm.org/ce/z/kQ5fJv.
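A minimal before/after sketch of the zext case (function and value names are illustrative):
```llvm
; Before: the or is performed in the wide type.
define i32 @src(i8 %a, i8 %b) {
  %za = zext i8 %a to i32
  %zb = zext i8 %b to i32
  %or = or disjoint i32 %za, %zb
  ret i32 %or
}

; After: the or is narrowed and the disjoint flag is kept.
define i32 @tgt(i8 %a, i8 %b) {
  %or = or disjoint i8 %a, %b
  %ext = zext i8 %or to i32
  ret i32 %ext
}
```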
The candidate block for sinking must be dominated by the current location. This is true based on how the candidate block was selected, so the runtime check is not necessary and has been changed to an assertion.
---------
Signed-off-by: John Lu <John.Lu@amd.com>
Use replaceSuccessor/replacePredecessor in
insertBlockAfter/insertBlockBefore. This preserves the predecessor
order, which in turn is needed to avoid invalidating existing phi recipes.
At the moment this is NFC, but enables additional uses in the future.
ExtractFromEnd only has 2 uses, extracting the last and penultimate
elements. Replace it with 2 separate opcodes, removing the need to
materialize and handle a constant argument.
PR: https://github.com/llvm/llvm-project/pull/137030
This refactoring makes it easier to add the code that preserves `or disjoint` in https://github.com/llvm/llvm-project/pull/136815.
When folding a logic op of sext/zext casts whose source types differ, both casts must have a single use to avoid creating an extra instruction. If the source types of the casts are the same, only one of the casts needs to have a single use. This PR also adds more tests for the same-source-type case.
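An illustrative same-source-type case (names are made up): here the fold can still narrow the or even though `%za` has a second use, because no cast of a new width needs to be created:
```llvm
define i32 @same_src(i8 %a, i8 %b, ptr %p) {
  %za = zext i8 %a to i32
  %zb = zext i8 %b to i32
  store i32 %za, ptr %p      ; extra use of %za
  %r = or i32 %za, %zb       ; can still become zext (or i8 %a, %b)
  ret i32 %r
}
```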
This reverts commit a9d93ecf1f8d2cfe3f77851e0df179b386cff353.
Reverted because the commit included a config in LLVM headers that is not available outside of the llvm source tree.
This is part of a series of patches that tries to improve DILocation bug detection in Debugify; see the review for more details. This is the patch that adds the main feature: a set of `DebugLoc::get<Kind>` functions that can be used for instructions with intentionally empty DebugLocs to prevent Debugify from treating them as bugs, removing the currently-pervasive false positives and allowing us to use Debugify (in its original DI preservation mode) to reliably detect existing bugs and regressions. This patch does not add uses of these functions, except for one use in Clang before optimizations and one in `Instruction::dropLocation()`, since those are obvious cases that immediately remove a set of false positives.
It is better to preserve the original order of the alternate nodes to avoid inter-lane shuffling; select/insert-subvector patterns provide better performance.
Reviewers: RKSimon, hiraditya
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/136329
An existing transformation replaces invoke instructions with a call to
the invoked function and a branch to the destination; when this happens,
we propagate the invoke's source location to the call but not to the
branch. This patch updates this behaviour to propagate to the branch as
well.
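A sketch of the transformation in IR terms (metadata numbering is illustrative):
```llvm
; Before:
;   invoke void @f() to label %cont unwind label %lpad, !dbg !7
;
; After (the branch now carries the invoke's source location too):
;   call void @f(), !dbg !7
;   br label %cont, !dbg !7
```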
Found using https://github.com/llvm/llvm-project/pull/107279.
When converting a malloc stored to a global into a global, we will
introduce an i1 flag to track whether the global has been initialized.
In the case of atomic loads/stores, this will result in verifier failures, because atomic ops on i1 are illegal. Even if we changed this to i8, I don't think it is a good idea to change atomic types in that way. Instead, bail out of the transform if we encounter any atomic loads/stores of the global.
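For illustration, a reduced form of the kind of input that now makes the transform bail out (names are made up):
```llvm
@g = internal global ptr null

declare ptr @malloc(i64)

define void @init() {
  %p = call ptr @malloc(i64 4)
  ; Turning @g into a plain value plus an i1 "initialized" flag would
  ; require atomic ops on i1, so the atomic store makes us bail out.
  store atomic ptr %p, ptr @g seq_cst, align 8
  ret void
}
```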
Fixes https://github.com/llvm/llvm-project/issues/137152.
Some optimizations in globalopt replace uses of a global value with uses of a generated global bool value; in some cases where this happens, the newly generated instructions did not have the original source location(s) of the instructions they replaced propagated to them. This patch properly preserves those source locations.
Found using https://github.com/llvm/llvm-project/pull/107279.
Add a new helper to manage IR metadata that can be propagated to generated instructions for recipes.
This helps to remove a number of remaining uses of getUnderlyingInstr
during VPlan execution.
PR: https://github.com/llvm/llvm-project/pull/135272
This PR is based on my last PR #132752 (the first commit of this PR), but addresses a different issue.
This commit addresses a limitation in the `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1), as described in issue #132739.
The current implementation of `PointerMayBeCaptured` may miss captures of the underlying `alloca` when analyzing derived pointers, leading to some false negatives (FNs) in TSan, as in the following example:
```cpp
#include <pthread.h>

// 'barrier' and 'barrier_wait' are helpers from the TSan test harness.
void *Thread(void *a) {
  ((int *)a)[1] = 43;
  return 0;
}

int main() {
  int Arr[2] = {41, 42};
  pthread_t t;
  pthread_create(&t, 0, Thread, &Arr[0]);
  // Missed instrumentation here due to the FN of PointerMayBeCaptured.
  Arr[1] = 43;
  barrier_wait(&barrier);
  pthread_join(t, 0);
}
```
Refer to this [godbolt page](https://godbolt.org/z/n67GrxdcE) to see the TSan-instrumented compilation result.
Even when `PointerMayBeCaptured` works correctly, it first has to walk back to the original `alloca` during its analysis, duplicating the work already done by the outer `findAllocaForValue`:
```cpp
const AllocaInst *AI = findAllocaForValue(Addr);
// Instead of Addr, we should check whether its base pointer is captured.
if (AI && !PointerMayBeCaptured(Addr, true)) ...
```
Key change: directly analyze the capture status of the underlying `alloca` instead of the derived pointer, to ensure accurate capture detection:
```cpp
const AllocaInst *AI = findAllocaForValue(Addr);
// Instead of Addr, we should check whether its base pointer is captured.
if (AI && !PointerMayBeCaptured(AI, true)) ...
```
Remove legacy ILV sinkScalarOperands, which is superseded by the
sinkScalarOperands VPlan transforms.
There are a few cases that aren't handled by VPlan's sinkScalarOperands,
because the recipes don't support replicating. Those are pointer
inductions and blends.
We could probably improve this further, by allowing replication for more
recipes, but I don't think the extra complexity is warranted.
Depends on https://github.com/llvm/llvm-project/pull/136021.
PR: https://github.com/llvm/llvm-project/pull/136023
This patch teaches readMemprof to dump the number of frames for each
allocation site match. This information helps us analyze what part of
the call stack in the MemProf profile has matched the IR.
Aside from updating existing test cases, this patch adds one more test
case, memprof-dump-matched-alloc-site.ll, because none of the existing
test cases has a frame count greater than one.
Add incoming exit phi operands during the initial VPlan construction.
This ensures all users are added to the initial VPlan and is also needed
in preparation for retaining exiting edges during initial construction.
PR: https://github.com/llvm/llvm-project/pull/136455
willGenerateVectors switches on the opcodes of a recipe, but Histogram is missing from the switch statement, which could cause a crash in some
cases. The crash was initially observed when developing another patch.
This patch disables the fold for the logical is_finite test (i.e., `and (fcmp ord x, 0), (fcmp u* x, inf) -> fcmp o* x, inf`).
It is still possible to allow this fold in several logical cases (e.g., when `stripSignOnlyFPOps(RHS0)` does not strip any operations). Since this patch has no real-world impact, I decided to disable this fold for all logical cases.
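For reference, a minimal sketch of the logical (select-based) form the fold no longer applies to (constants and names are illustrative):
```llvm
define i1 @logical_is_finite(float %x) {
  %ord = fcmp ord float %x, 0.000000e+00
  %ninf = fcmp une float %x, 0x7FF0000000000000
  ; Logical and written as a select; the single-fcmp rewrite is no
  ; longer performed for this form.
  %r = select i1 %ord, i1 %ninf, i1 false
  ret i1 %r
}
```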
Alive2: https://alive2.llvm.org/ce/z/aH4LC7
Closes https://github.com/llvm/llvm-project/issues/136650.
When ranking operands for an expression tree, the reassociate pass also performs canonicalization, putting constants on the right-hand side. However, such transforms were not registered as modifying the IR, so at the end of the pass, if no other changes had been made, the pass reported that all analyses could be preserved.
With this patch we make sure to set MadeChange to true when modifying
the IR via canonicalizeOperands, ensuring that analyses such as DemandedBits are properly invalidated when instructions are modified.
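As a trivial sketch of the canonicalization itself (illustrative values):
```llvm
; canonicalizeOperands moves the constant to the right-hand side:
;   %a = add i32 1, %x   ==>   %a = add i32 %x, 1
```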
If we know that the initial GEP was inbounds, and we change it to a
sequence of GEPs from the same base pointer where every offset is
non-negative, then the new GEPs are inbounds.
We can also preserve inbounds if the inbounds GEP and the involved additions are NUW.
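An illustrative sketch of the first case (types and offsets are made up):
```llvm
; Original:
;   %p = getelementptr inbounds i8, ptr %base, i64 20
; Rewritten as a chain rooted at the same base where every offset is
; non-negative; each new GEP can keep the inbounds flag:
;   %p0 = getelementptr inbounds i8, ptr %base, i64 16
;   %p  = getelementptr inbounds i8, ptr %p0, i64 4
```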
For SWDEV-516125.