llvm-project

Author	SHA1	Message	Date
Mingjie Xu	227edfb2f4	[CodeGenPrepare][NFC] Reland: Update the dominator tree instead of rebuilding it (#179040 ) The original differential revision is https://reviews.llvm.org/D153638 Reverted in `f5b5a30858` because of causing a clang crash. This patch relands it with the crash fixed. Call `DTU->flush()` in each iteration of `while (MadeChange)` loop, flush all awaiting BasicBlocks deletion, and prevent iterator invalidation.	2026-03-31 09:01:11 +08:00
nataliakokoromyti	fa0071baab	[CodeGenPrepare] Fix infinite loop with same-type bitcasts (#176694 ) OptimizeNoopCopyExpression was sinking same-type bitcasts (e.g. bitcast i32 to i32) which would then be reintroduced by optimizePhiType, causing an infinite loop. Fix by adding a check (PhiTy == ConvertTy) in optimizePhiType to skip the conversion when types are already identical. Fixes #176688.	2026-01-22 09:29:08 +01:00
Aiden Grossman	e2d7cd685d	[IR] Make dead_on_return attribute optionally sized This patch makes the dead_on_return parameter attribute optionally require a number of bytes to be passed in to specify the number of bytes known to be dead upon function return/unwind. This is aimed at enabling annotating the this pointer in C++ destructors with dead_on_return in clang. We need this to handle cases like the following: ``` struct X { int n; ~X() { this[n].n = 0; } }; void f() { X xs[] = {42, -1}; } ``` where we are only certain that sizeof(X) bytes are dead upon return of ~X. Otherwise DSE would be able to eliminate the store in ~X which would not be correct. This patch only does the wiring within IR. Future patches will make clang emit correct sizing information and update DSE to only delete stores to objects marked dead_on_return that are provably in bounds of the number of bytes specified to be dead_on_return. Reviewers: nikic, alinas, antoniofrighetto Pull Request: https://github.com/llvm/llvm-project/pull/171712	2026-01-21 08:22:05 -08:00
David Green	a4975a8089	[CGP][AArch64] Do not sink instructions that might read/write memory. (#176182 ) The test case's call instruction was being sunk past the point where the memory it accessed was valid. Add a check that CGP does not try to sink instructions that might be invalid to move. Fixes #176095	2026-01-18 14:18:25 +08:00
Nikita Popov	7f6afc499f	[CGP] Use getSigned() for scale during address sinking The scale is a signed quantity. This avoids an assertion failure with github.com/llvm/llvm-project/pull/171456.	2026-01-06 17:23:53 +01:00
willmafh	2eb8ee137f	[NFC] Delete unnecessary apostrophe at the end of its (#173974 )	2026-01-04 20:02:40 +08:00
Mingjie Xu	fac9472593	[IR] Reland Optimize PHINode::removeIncomingValue() and PHINode::removeIncomingValueIf() to use the swapping strategy. (#174274 ) Reland #171963, #172639 and #173444, they are reverted in 86b9f90b9574b3a7d15d28a91f6316459dcfa046 because of introducing non-determinism in compiles. The non-determinism has been fixed in 9b8addffa70cee5b2acc5454712d9cf78ce45710.	2026-01-04 09:24:53 +08:00
Walter Lee	86b9f90b95	Revert 159f1c048e08a8780d92858cfc80e723c90235e3 (#173893 ) This causes non-determinism in compiles. From nikic: "FYI the non-determinism is also visible on llvm-opt-benchmark. Maybe repeatedly running test cases from `299446d99f` could reproduce the issue..." Also revert dependent 796fafeff92fe5d2d20594859e92607116e30a16 and e135447bda617125688b71d33480d131d1076a72.	2025-12-29 20:23:13 -05:00
Mingjie Xu	159f1c048e	[IR] Optimize PHINode::removeIncomingValue() by swapping removed incoming value with the last incoming value. (#171963 ) Current implementation uses `std::copy` to shift all incoming values after the removed index. This patch optimizes `PHINode::removeIncomingValue()` by replacing the linear shift of incoming values with a swap-with-last strategy. After this change, the relative order of incoming values after removal is not preserved. This improves compile-time for PHI nodes with many predecessors. Depends: https://github.com/llvm/llvm-project/pull/171955 https://github.com/llvm/llvm-project/pull/171956 https://github.com/llvm/llvm-project/pull/171960 https://github.com/llvm/llvm-project/pull/171962	2025-12-17 19:44:01 +08:00
Yingwei Zheng	59e601a3d5	[CodeGenPrepare] Don't simplify incomplete expression tree in AddrModeCombine (#164628 ) Since new select/phi instructions may construct loops, the expression tree to be simplified may still be incomplete (i.e., it may contain select with dummy values or phi without incoming values). This patch removes the call to simplifyInstruction for now, as it doesn't break existing tests. Original PR: https://reviews.llvm.org/D36073 Fix the crash reported in https://github.com/llvm/llvm-project/pull/163453#issuecomment-3429922732.	2025-10-25 16:47:32 +08:00
paperchalice	3c2dae6919	[test][Transforms] Remove unsafe-fp-math uses part 1 (NFC) (#164742 ) Post cleanup for #164534.	2025-10-23 11:39:11 +08:00
Nikita Popov	573ca36753	[IR] Replace alignment argument with attribute on masked intrinsics (#163802 ) The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter` intrinsics currently accept a separate alignment immarg. Replace this with an `align` attribute on the pointer / vector of pointers argument. This is the standard representation for alignment information on intrinsics, and is already used by all other memory intrinsics. This means the signatures now match llvm.expandload, llvm.vp.load, etc. (Things like llvm.memcpy used to have a separate alignment argument as well, but were already migrated a long time ago.) It's worth noting that the masked.gather and masked.scatter intrinsics previously accepted a zero alignment to indicate the ABI type alignment of the element type. This special case is gone now: If the align attribute is omitted, the implied alignment is 1, as usual. If ABI alignment is desired, it needs to be explicitly emitted (which the IRBuilder API already requires anyway).	2025-10-20 08:50:09 +00:00
Vladimir Radosavljevic	be7f85168d	[CGP] Fix missing sign extension for base offset in optimizeMemoryInst (#161377 ) If we have integers larger than 64-bit we need to explicitly sign extend them, otherwise we will get wrong zero extended values.	2025-10-10 10:52:52 +00:00
Craig Topper	4be1099607	[RISCV] Improve fixed vector handling in isCtpopFast. (#158380 ) Previously we considered fixed vectors fast if Zvbb or Zbb is enabled. Zbb only helps if the vector type will end up being scalarized.	2025-09-16 09:47:09 -07:00
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248 ) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Paul Walker	be4a739a7f	[LLVM][CDP] Move AArch64 test into AArch64 directory.	2025-08-05 11:04:59 +00:00
Paul Walker	94d374ab6c	[LLVM][CGP] Allow finer control for sinking compares. (#151366 ) Compare sinking is selectable based on the result of hasMultipleConditionRegisters. This function is too coarse grained by not taking into account the differences between scalar and vector compares. This PR extends the interface to take an EVT to allow finer control. The new interface is used by AArch64 to disable sinking of scalable vector compares, but with isProfitableToSinkOperands updated to maintain the cases that are specifically tested.	2025-08-05 11:43:41 +01:00
Yingwei Zheng	2d0ca09305	[CodeGenPrepare] Make sure that `AddOffset` is also a loop invariant (#150625 ) Closes https://github.com/llvm/llvm-project/issues/150611.	2025-07-26 00:23:56 +08:00
Philip Reames	48ef55ce3e	[CGP] Update tests to use autogen scripts, and refresh check lines Reducing manual update work required for an upcoming change.	2025-07-03 11:36:33 -07:00
Evgenii Kudriashov	5ffdd9480d	[CodeGenPrepare] Filter out unrecreatable addresses from memory optimization (#143566 ) Follow up on #139303	2025-06-28 23:30:03 +02:00
Philip Reames	0ef27186c9	[tests] Additional coverage for gather/scatter address optimizations	2025-06-26 11:50:57 -07:00
Florian Hahn	dde30a4731	[CGP] Bail out if (Base\|Scaled)Reg does not dominate insert point. (#142949 ) (Base\|Scaled)Reg may not dominate the chosen insert point, if there are multiple uses of the address. Bail out if that's the case, otherwise we will generate invalid IR. In some cases, we could probably adjust the insert point or hoist the (Base\|Scaled)Reg. Fixes https://github.com/llvm/llvm-project/issues/142830. PR: https://github.com/llvm/llvm-project/pull/142949	2025-06-06 12:38:30 +01:00
weiguozhi	59c6d70ed8	[CodeGenPrepare] Make sure instruction get from SunkAddrs is before MemoryInst (#139303 ) Function optimizeBlock may optimize a block multiple times. In the first iteration of the loop, MemoryInst1 may generate a sunk instruction and store it into SunkAddrs. In the second iteration of the loop, MemoryInst2 may use the same address and then it can reuse the sunk instruction stored in SunkAddrs, but MemoryInst2 may be before MemoryInst1 and the corresponding sunk instruction. In order to avoid a use-before-def error, we need to find an appropriate insert position for the sunk instruction. Fixes #138208.	2025-05-15 09:27:25 -07:00
Orlando Cazalet-Hyams	60d0bc1fae	Propagate DebugLocs on phis in BreakCriticalEdges (#133492 ) The pull request discusses whether this change is needed or not. We leant towards "it can't hurt" on the basis that it's at worst slightly unnecessary (but not incorrect). The motivation for the patch came from reviewing code duplication sites to update for Key Instructions, finding this, trying to generate a test case and seeing the DebugLocs aren't propagated.	2025-05-08 13:01:48 +01:00
Sergei Barannikov	5080a0251f	[CodeGenPrepare] Unfold slow ctpop when used in power-of-two test (#102731 ) DAG combiner already does this transformation, but in some cases it does not have a chance because either CodeGenPrepare or SelectionDAGBuilder move icmp to a different basic block. https://alive2.llvm.org/ce/z/ARzh99 Fixes #94829 Pull Request: https://github.com/llvm/llvm-project/pull/102731	2025-04-23 08:54:10 +03:00
Nikita Popov	20507a9e95	[Verifier][CGP] Allow integer argument to dbg_declare (#134803 ) Relaxes the newly added verifier rule to also allow an integer argument to dbg_declare, which is interpreted as a pointer. Adjust CGP to deal with it gracefully. Fixes https://github.com/llvm/llvm-project/issues/134523. Alternative to https://github.com/llvm/llvm-project/pull/134601.	2025-04-10 12:29:56 +02:00
Jeremy Morse	792a6f8119	[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298 ) These date back to when the non-intrinsic format of variable locations was still being tested and was behind a compile-time flag, so not all builds / bots would correctly run them. The solution at the time, to get at least some test coverage, was to have tests opt-in to non-intrinsic debug-info if it was built into LLVM. Nowadays, non-intrinsic format is the default and has been on for more than a year, there's no need for this flag to exist. (I've downgraded the flag from "try" to explicitly requesting non-intrinsic format in some places, so that we can deal with tests that are explicitly about non-intrinsic format in their own commit).	2025-03-14 15:50:49 +00:00
Mingming Liu	5399782508	[IR] Generalize Function's {set,get}SectionPrefix to GlobalObjects, the base class of {Function, GlobalVariable, IFunc} (#125757 ) This is a split of https://github.com/llvm/llvm-project/pull/125756	2025-02-06 14:51:13 -08:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Stephen Tozer	822f74a911	[Clang] Cleanup docs and comments relating to -fextend-variable-liveness (#124767 ) This patch contains a number of changes relating to the above flag; primarily it updates comment references to the old flag names, "-fextend-lifetimes" and "-fextend-this-ptr" to refer to the new names, "-fextend-variable-liveness[={all,this}]". These changes are all NFC. This patch also removes the explicit -fextend-this-ptr-liveness flag alias, and shortens the help-text for the main flag; these are both changes that were meant to be applied in the initial PR (#110000), but due to some user-error on my part they were not included in the merged commit.	2025-01-28 18:25:32 +00:00
David Sherwood	346185c42c	[AArch64] Improve codegen of vectorised early exit loops (#119534 ) Once PR #112138 lands we are able to start vectorising more loops that have uncountable early exits. The typical loop structure looks like this: vector.body: ... %pred = icmp eq <2 x ptr> %wide.load, %broadcast.splat ... %or.reduc = tail call i1 @llvm.vector.reduce.or.v2i1(<2 x i1> %pred) %iv.cmp = icmp eq i64 %index.next, 4 %exit.cond = or i1 %or.reduc, %iv.cmp br i1 %exit.cond, label %middle.split, label %vector.body middle.split: br i1 %or.reduc, label %found, label %notfound found: ret i64 1 notfound: ret i64 0 The problem with this is that %or.reduc is kept live after the loop, and since this is a boolean it typically requires making a copy of the condition code register. For AArch64 this requires an additional cset instruction, which is quite expensive for a typical find loop that only contains 6 or 7 instructions. This patch attempts to improve the codegen by sinking the reduction out of the loop to the location of its user. It's a lot cheaper to keep the predicate alive if the type is legal and has lots of registers for it. There is a potential downside in that a little more work is required after the loop, but I believe this is worth it since we are likely to spend most of our time in the loop.	2025-01-06 13:17:14 +00:00
Yingwei Zheng	6568ceb9fa	[CodeGenPrepare] Drop nsw flags in `optimizeLoadExt` (#118180 ) Alive2: https://alive2.llvm.org/ce/z/pMcD7q Closes https://github.com/llvm/llvm-project/issues/118172.	2024-12-01 11:25:31 +08:00
Lee Wei	1ca64c5fb7	[llvm] Remove `br i1 undef` from some regression tests [NFC] (#115691 ) This PR aims to remove undefined behavior from tests under the directory `llvm/transforms/CodegenPrepare, ConstantHoisting, Coroutines` etc.	2024-11-11 12:56:31 +00:00
Paul Walker	38fffa630e	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548 )	2024-11-06 11:53:33 +00:00
goldsteinn	1e072ae289	[CGP] [CodeGenPrepare] Folding `urem` with loop invariant value plus offset (#104724 ) This extends the existing fold: ``` for(i = Start; i < End; ++i) Rem = (i nuw+- IncrLoopInvariant) u% RemAmtLoopInvariant; ``` -> ``` Rem = (Start nuw+- IncrLoopInvariant) % RemAmtLoopInvariant; for(i = Start; i < End; ++i, ++rem) Rem = rem == RemAmtLoopInvariant ? 0 : Rem; ``` To work with a non-zero `IncrLoopInvariant`. This is a common usage in cases such as: ``` for(i = 0; i < N; ++i) if (((i + 1) % X) == 0) do_something_occasionally_but_not_first_iter(); ``` Alive2 w/ i4/unrolled 6x (needs to be run locally due to timeout): https://alive2.llvm.org/ce/z/6tgyN3 Exhaust proof over all uint8_t combinations in C++: https://godbolt.org/z/WYa561388	2024-10-31 09:14:33 -05:00
Antonio Frighetto	d79c4c1119	[CGP] Regenerate `revert-constant-ptr-propagation-on-calls.ll` test (NFC) Multiple buildbots were previously failing.	2024-09-02 09:55:43 +02:00
Antonio Frighetto	e4e0dfb0c2	[CGP] Undo constant propagation of pointers across calls It may be profitable to revert SCCP propagation of C++ static values, if such constants are pointers, in order to avoid redundant pointer computation, since the method returning the constant is non-removable.	2024-09-02 09:33:23 +02:00
Antonio Frighetto	ed6d9f6d2a	[CGP] Introduce test for PR102926 (NFC)	2024-09-02 09:33:23 +02:00
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149 ) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Simon Pilgrim	f673882323	[X86] Allow speculative BSR/BSF instructions on targets with CMOV (#102885 ) Currently targets without LZCNT/TZCNT won't speculate with BSR/BSF instructions in case they have a zero value input, meaning we always insert a test+branch for the zero-input case. This patch proposes we allow speculation if the target has CMOV, and perform a branchless select instead to handle the zero input case. This will predominately help x86-64 targets where we haven't set any particular cpu target. We already always perform BSR/BSF instructions if we were lowering a CTLZ/CTTZ_ZERO_UNDEF instruction.	2024-08-22 11:11:00 +01:00
Sjoerd Meijer	70e8c982d0	[AArch64] Bail out for scalable vecs in areExtractShuffleVectors (#105484 ) The added test triggers the following assert in `areExtractShuffleVectors` that is called from `shouldSinkOperands`: Assertion `(!isScalable() \|\| isZero()) && "Request for a fixed element count on a scalable object"' failed. I don't think scalable types can be extract shuffles, so bail early if this is the case.	2024-08-21 15:27:09 +01:00
Noah Goldstein	e4c67ba67e	Recommit "[CodeGenPrepare] Folding `urem` with loop invariant value" Was missing remainder on `Start` value. Also changed logic as nikic suggested (getting loop from `PN` instead of `Rem`). The prior impl increased the complexity of the code and made debugging it more difficult. Closes #104877	2024-08-20 09:17:49 -07:00
Noah Goldstein	9b25ad818c	[CodeGenPrepare][X86] Add tests for fixing `urem` transform; NFC	2024-08-20 09:17:49 -07:00
Noah Goldstein	731ae694a3	Revert "[CodeGenPrepare] Folding `urem` with loop invariant value" This reverts commit c64ce8bf283120fd145a57d0e61f9697f719139d. Seems to be causing stage2 failures on buildbots. Reverting while I investigate.	2024-08-18 20:36:35 -07:00
Noah Goldstein	c64ce8bf28	[CodeGenPrepare] Folding `urem` with loop invariant value ``` for(i = Start; i < End; ++i) Rem = (i nuw+ IncrLoopInvariant) u% RemAmtLoopInvariant; ``` -> ``` Rem = (Start nuw+ IncrLoopInvariant) % RemAmtLoopInvariant; for(i = Start; i < End; ++i, ++rem) Rem = rem == RemAmtLoopInvariant ? 0 : Rem; ``` In its current state, this only applies if `IncrLoopInvariant` and `Start` are both zero. Alive2 seemed unable to prove this (see: https://alive2.llvm.org/ce/z/ATGDp3 which is clearly wrong but still checks out...) so wrote an exhaustive test here: https://godbolt.org/z/WYa561388 Closes #96625	2024-08-18 15:58:24 -07:00
Noah Goldstein	f16125a13c	[CodeGenPrepare][X86] Add tests for folding `urem` with loop invariant value; NFC	2024-08-18 15:58:24 -07:00
David Green	0e124537aa	[AArch64] Sink operands to fmuladd. (#102297 ) A fmuladd can be treated as a fma when sinking operands to the intrinsic, similar to D126234. Addresses a small part of #102195	2024-08-09 11:48:37 +01:00
Fangrui Song	8ea31db272	[CodeGenPrepare] Use MapVector to stabilize iteration order DenseMap iteration order is not guaranteed to be deterministic. Without the change, llvm/test/Transforms/CodeGenPrepare/X86/statepoint-relocate.ll would fail when `combineHashValue` changes (#95970). Fixes: dba7329ebb0dbe1fabb3faaedfd31da3b8bd611d	2024-06-18 17:19:51 -07:00
Stephen Tozer	094572701d	[RemoveDIs] Print IR with debug records by default (#91724 ) This patch makes the final major change of the RemoveDIs project, changing the default IR output from debug intrinsics to debug records. This is expected to break a large number of tests: every single one that tests for uses or declarations of debug intrinsics and does not explicitly disable writing records. If this patch has broken your downstream tests (or upstream tests on a configuration I wasn't able to run): 1. If you need to immediately unblock a build, pass `--write-experimental-debuginfo=false` to LLVM's option processing for all failing tests (remember to use `-mllvm` for clang/flang to forward arguments to LLVM). 2. For most test failures, the changes are trivial and mechanical, enough that they can be done by script; see the migration guide for a guide on how to do this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates 3. If any tests fail for reasons other than FileCheck check lines that need updating, such as assertion failures, that is most likely a real bug with this patch and should be reported as such. For more information, see the recent PSA: https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578	2024-06-14 15:07:27 +01:00
Nikita Popov	d10b76552f	[ConstantFold] Remove notional over-indexing fold (#93697 ) The data-layout independent constant folding currently has some rather gnarly code for canonicalizing GEP indices to reduce "notional overindexing", and then infers inbounds based on that canonicalization. Now that we canonicalize to i8 GEPs, this canonicalization is essentially useless, as we'll discard it as soon as the GEP hits the data-layout aware constant folder anyway. As such, I'd like to remove this code entirely. This shouldn't have any impact on optimization capabilities.	2024-05-30 08:36:44 +02:00
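
The `urem` fold described in commits c64ce8bf28 and 1e072ae289 above can be illustrated outside of LLVM. What follows is a minimal C++ sketch of the idea, not the CodeGenPrepare implementation: the function names `remaindersWithUrem` and `remaindersWithCounter` are hypothetical, and the sketch assumes a non-zero divisor and that `i + incr` never wraps (the `nuw` condition in the commit messages).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reference form: one unsigned remainder per iteration.
// Assumes remAmt != 0 and that i + incr does not wrap (the "nuw" condition).
std::vector<uint32_t> remaindersWithUrem(uint32_t start, uint32_t end,
                                         uint32_t incr, uint32_t remAmt) {
  assert(remAmt != 0 && "divisor must be non-zero");
  std::vector<uint32_t> out;
  for (uint32_t i = start; i < end; ++i)
    out.push_back((i + incr) % remAmt);
  return out;
}

// Folded form: a single remainder is computed before the loop, then a
// counter is incremented each iteration and reset to 0 when it reaches remAmt.
std::vector<uint32_t> remaindersWithCounter(uint32_t start, uint32_t end,
                                            uint32_t incr, uint32_t remAmt) {
  assert(remAmt != 0 && "divisor must be non-zero");
  std::vector<uint32_t> out;
  uint32_t rem = (start + incr) % remAmt; // remainder of the start value
  for (uint32_t i = start; i < end; ++i) {
    out.push_back(rem);
    ++rem;
    if (rem == remAmt) // wrap the counter instead of dividing
      rem = 0;
  }
  return out;
}
```

Both functions should return the same sequence for any inputs that satisfy the stated assumptions; the folded form trades the per-iteration division for an increment, a compare, and a select.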

462 Commits