llvm-project

Author	SHA1	Message	Date
Yingwei Zheng	59e601a3d5	[CodeGenPrepare] Don't simplify incomplete expression tree in AddrModeCombine (#164628 ) Since new select/phi instructions may construct loops, the expression tree to be simplified may still be incomplete (i.e., it may contain select with dummy values or phi without incoming values). This patch removes the call to simplifyInstruction for now, as it doesn't break existing tests. Original PR: https://reviews.llvm.org/D36073 Fix the crash reported in https://github.com/llvm/llvm-project/pull/163453#issuecomment-3429922732.	2025-10-25 16:47:32 +08:00
paperchalice	3c2dae6919	[test][Transforms] Remove unsafe-fp-math uses part 1 (NFC) (#164742 ) Post cleanup for #164534.	2025-10-23 11:39:11 +08:00
Nikita Popov	573ca36753	[IR] Replace alignment argument with attribute on masked intrinsics (#163802 ) The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter` intrinsics currently accept a separate alignment immarg. Replace this with an `align` attribute on the pointer / vector of pointers argument. This is the standard representation for alignment information on intrinsics, and is already used by all other memory intrinsics. This means the signatures now match llvm.expandload, llvm.vp.load, etc. (Things like llvm.memcpy used to have a separate alignment argument as well, but were already migrated a long time ago.) It's worth noting that the masked.gather and masked.scatter intrinsics previously accepted a zero alignment to indicate the ABI type alignment of the element type. This special case is gone now: If the align attribute is omitted, the implied alignment is 1, as usual. If ABI alignment is desired, it needs to be explicitly emitted (which the IRBuilder API already requires anyway).	2025-10-20 08:50:09 +00:00
Vladimir Radosavljevic	be7f85168d	[CGP] Fix missing sign extension for base offset in optimizeMemoryInst (#161377 ) If we have integers larger than 64-bit we need to explicitly sign extend them, otherwise we will get wrong zero extended values.	2025-10-10 10:52:52 +00:00
Craig Topper	4be1099607	[RISCV] Improve fixed vector handling in isCtpopFast. (#158380 ) Previously we considered fixed vectors fast if Zvbb or Zbb is enabled. Zbb only helps if the vector type will end up being scalarized.	2025-09-16 09:47:09 -07:00
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248 ) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Paul Walker	be4a739a7f	[LLVM][CDP] Move AArch64 test into AArch64 directory.	2025-08-05 11:04:59 +00:00
Paul Walker	94d374ab6c	[LLVM][CGP] Allow finer control for sinking compares. (#151366 ) Compare sinking is selectable based on the result of hasMultipleConditionRegisters. This function is too coarse grained by not taking into account the differences between scalar and vector compares. This PR extends the interface to take an EVT to allow finer control. The new interface is used by AArch64 to disable sinking of scalable vector compares, but with isProfitableToSinkOperands updated to maintain the cases that are specifically tested.	2025-08-05 11:43:41 +01:00
Yingwei Zheng	2d0ca09305	[CodeGenPrepare] Make sure that `AddOffset` is also a loop invariant (#150625 ) Closes https://github.com/llvm/llvm-project/issues/150611.	2025-07-26 00:23:56 +08:00
Philip Reames	48ef55ce3e	[CGP] Update tests to use autogen scripts, and refresh check lines Reducing manual update work required for an upcoming change.	2025-07-03 11:36:33 -07:00
Evgenii Kudriashov	5ffdd9480d	[CodeGenPrepare] Filter out unrecreatable addresses from memory optimization (#143566 ) Follow up on #139303	2025-06-28 23:30:03 +02:00
Philip Reames	0ef27186c9	[tests] Additional coverage for gather/scatter address optimizations	2025-06-26 11:50:57 -07:00
Florian Hahn	dde30a4731	[CGP] Bail out if (Base|Scaled)Reg does not dominate insert point. (#142949 ) (Base|Scaled)Reg may not dominate the chosen insert point, if there are multiple uses of the address. Bail out if that's the case, otherwise we will generate invalid IR. In some cases, we could probably adjust the insert point or hoist the (Base|Scaled)Reg. Fixes https://github.com/llvm/llvm-project/issues/142830. PR: https://github.com/llvm/llvm-project/pull/142949	2025-06-06 12:38:30 +01:00
weiguozhi	59c6d70ed8	[CodeGenPrepare] Make sure instruction get from SunkAddrs is before MemoryInst (#139303 ) Function optimizeBlock may optimize a block multiple times. In the first iteration of the loop, MemoryInst1 may generate a sunk instruction and store it into SunkAddrs. In the second iteration of the loop, MemoryInst2 may use the same address and then it can reuse the sunk instruction stored in SunkAddrs, but MemoryInst2 may be before MemoryInst1 and the corresponding sunk instruction. In order to avoid a use-before-def error, we need to find an appropriate insert position for the sunk instruction. Fixes #138208.	2025-05-15 09:27:25 -07:00
Orlando Cazalet-Hyams	60d0bc1fae	Propagate DebugLocs on phis in BreakCriticalEdges (#133492 ) The pull request discusses whether this change is needed or not. We leant towards "it can't hurt" on the basis that it's at worst slightly unnecessary (but not incorrect). The motivation for the patch came from reviewing code duplication sites to update for Key Instructions, finding this, trying to generate a test case and seeing the DebugLocs aren't propagated.	2025-05-08 13:01:48 +01:00
Sergei Barannikov	5080a0251f	[CodeGenPrepare] Unfold slow ctpop when used in power-of-two test (#102731 ) DAG combiner already does this transformation, but in some cases it does not have a chance because either CodeGenPrepare or SelectionDAGBuilder move icmp to a different basic block. https://alive2.llvm.org/ce/z/ARzh99 Fixes #94829 Pull Request: https://github.com/llvm/llvm-project/pull/102731	2025-04-23 08:54:10 +03:00
Nikita Popov	20507a9e95	[Verifier][CGP] Allow integer argument to dbg_declare (#134803 ) Relaxes the newly added verifier rule to also allow an integer argument to dbg_declare, which is interpreted as a pointer. Adjust CGP to deal with it gracefully. Fixes https://github.com/llvm/llvm-project/issues/134523. Alternative to https://github.com/llvm/llvm-project/pull/134601.	2025-04-10 12:29:56 +02:00
Jeremy Morse	792a6f8119	[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298 ) These date back to when the non-intrinsic format of variable locations was still being tested and was behind a compile-time flag, so not all builds / bots would correctly run them. The solution at the time, to get at least some test coverage, was to have tests opt-in to non-intrinsic debug-info if it was built into LLVM. Nowadays, non-intrinsic format is the default and has been on for more than a year, there's no need for this flag to exist. (I've downgraded the flag from "try" to explicitly requesting non-intrinsic format in some places, so that we can deal with tests that are explicitly about non-intrinsic format in their own commit).	2025-03-14 15:50:49 +00:00
Mingming Liu	5399782508	[IR] Generalize Function's {set,get}SectionPrefix to GlobalObjects, the base class of {Function, GlobalVariable, IFunc} (#125757 ) This is a split of https://github.com/llvm/llvm-project/pull/125756	2025-02-06 14:51:13 -08:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Stephen Tozer	822f74a911	[Clang] Cleanup docs and comments relating to -fextend-variable-liveness (#124767 ) This patch contains a number of changes relating to the above flag; primarily it updates comment references to the old flag names, "-fextend-lifetimes" and "-fextend-this-ptr" to refer to the new names, "-fextend-variable-liveness[={all,this}]". These changes are all NFC. This patch also removes the explicit -fextend-this-ptr-liveness flag alias, and shortens the help-text for the main flag; these are both changes that were meant to be applied in the initial PR (#110000), but due to some user-error on my part they were not included in the merged commit.	2025-01-28 18:25:32 +00:00
David Sherwood	346185c42c	[AArch64] Improve codegen of vectorised early exit loops (#119534 ) Once PR #112138 lands we are able to start vectorising more loops that have uncountable early exits. The typical loop structure looks like this: vector.body: ... %pred = icmp eq <2 x ptr> %wide.load, %broadcast.splat ... %or.reduc = tail call i1 @llvm.vector.reduce.or.v2i1(<2 x i1> %pred) %iv.cmp = icmp eq i64 %index.next, 4 %exit.cond = or i1 %or.reduc, %iv.cmp br i1 %exit.cond, label %middle.split, label %vector.body middle.split: br i1 %or.reduc, label %found, label %notfound found: ret i64 1 notfound: ret i64 0 The problem with this is that %or.reduc is kept live after the loop, and since this is a boolean it typically requires making a copy of the condition code register. For AArch64 this requires an additional cset instruction, which is quite expensive for a typical find loop that only contains 6 or 7 instructions. This patch attempts to improve the codegen by sinking the reduction out of the loop to the location of its user. It's a lot cheaper to keep the predicate alive if the type is legal and has lots of registers for it. There is a potential downside in that a little more work is required after the loop, but I believe this is worth it since we are likely to spend most of our time in the loop.	2025-01-06 13:17:14 +00:00
Yingwei Zheng	6568ceb9fa	[CodeGenPrepare] Drop nsw flags in `optimizeLoadExt` (#118180 ) Alive2: https://alive2.llvm.org/ce/z/pMcD7q Closes https://github.com/llvm/llvm-project/issues/118172.	2024-12-01 11:25:31 +08:00
Lee Wei	1ca64c5fb7	[llvm] Remove `br i1 undef` from some regression tests [NFC] (#115691 ) This PR aims to remove undefined behavior from tests under the directory `llvm/transforms/CodegenPrepare, ConstantHoisting, Coroutines` etc.	2024-11-11 12:56:31 +00:00
Paul Walker	38fffa630e	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548 )	2024-11-06 11:53:33 +00:00
goldsteinn	1e072ae289	[CGP] [CodeGenPrepare] Folding `urem` with loop invariant value plus offset (#104724 ) This extends the existing fold: ``` for(i = Start; i < End; ++i) Rem = (i nuw+- IncrLoopInvariant) u% RemAmtLoopInvariant; ``` -> ``` Rem = (Start nuw+- IncrLoopInvariant) % RemAmtLoopInvariant; for(i = Start; i < End; ++i, ++rem) Rem = rem == RemAmtLoopInvariant ? 0 : Rem; ``` To work with a non-zero `IncrLoopInvariant`. This is a common usage in cases such as: ``` for(i = 0; i < N; ++i) if (((i + 1) % X) == 0) do_something_occasionally_but_not_first_iter(); ``` Alive2 w/ i4/unrolled 6x (needs to be run locally due to timeout): https://alive2.llvm.org/ce/z/6tgyN3 Exhaust proof over all uint8_t combinations in C++: https://godbolt.org/z/WYa561388	2024-10-31 09:14:33 -05:00
Antonio Frighetto	d79c4c1119	[CGP] Regenerate `revert-constant-ptr-propagation-on-calls.ll` test (NFC) Multiple buildbots were previously failing.	2024-09-02 09:55:43 +02:00
Antonio Frighetto	e4e0dfb0c2	[CGP] Undo constant propagation of pointers across calls It may be profitable to revert SCCP propagation of C++ static values, if such constants are pointers, in order to avoid redundant pointer computation, since the method returning the constant is non-removable.	2024-09-02 09:33:23 +02:00
Antonio Frighetto	ed6d9f6d2a	[CGP] Introduce test for PR102926 (NFC)	2024-09-02 09:33:23 +02:00
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149 ) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Simon Pilgrim	f673882323	[X86] Allow speculative BSR/BSF instructions on targets with CMOV (#102885 ) Currently targets without LZCNT/TZCNT won't speculate with BSR/BSF instructions in case they have a zero value input, meaning we always insert a test+branch for the zero-input case. This patch proposes we allow speculation if the target has CMOV, and perform a branchless select instead to handle the zero input case. This will predominately help x86-64 targets where we haven't set any particular cpu target. We already always perform BSR/BSF instructions if we were lowering a CTLZ/CTTZ_ZERO_UNDEF instruction.	2024-08-22 11:11:00 +01:00
Sjoerd Meijer	70e8c982d0	[AArch64] Bail out for scalable vecs in areExtractShuffleVectors (#105484 ) The added test triggers the following assert in `areExtractShuffleVectors` that is called from `shouldSinkOperands`: Assertion `(!isScalable() || isZero()) && "Request for a fixed element count on a scalable object"' failed. I don't think scalable types can be extract shuffles, so bail early if this is the case.	2024-08-21 15:27:09 +01:00
Noah Goldstein	e4c67ba67e	Recommit "[CodeGenPrepare] Folding `urem` with loop invariant value" Was missing remainder on `Start` value. Also changed logic as nikic suggested (getting loop from `PN` instead of `Rem`). The prior impl increased the complexity of the code and made debugging it more difficult. Closes #104877	2024-08-20 09:17:49 -07:00
Noah Goldstein	9b25ad818c	[CodeGenPrepare][X86] Add tests for fixing `urem` transform; NFC	2024-08-20 09:17:49 -07:00
Noah Goldstein	731ae694a3	Revert "[CodeGenPrepare] Folding `urem` with loop invariant value" This reverts commit c64ce8bf283120fd145a57d0e61f9697f719139d. Seems to be causing stage2 failures on buildbots. Reverting while I investigate.	2024-08-18 20:36:35 -07:00
Noah Goldstein	c64ce8bf28	[CodeGenPrepare] Folding `urem` with loop invariant value ``` for(i = Start; i < End; ++i) Rem = (i nuw+ IncrLoopInvariant) u% RemAmtLoopInvariant; ``` -> ``` Rem = (Start nuw+ IncrLoopInvariant) % RemAmtLoopInvariant; for(i = Start; i < End; ++i, ++rem) Rem = rem == RemAmtLoopInvariant ? 0 : Rem; ``` In its current state, the fold only applies if `IncrLoopInvariant` and `Start` are both zero. Alive2 seemed unable to prove this (see: https://alive2.llvm.org/ce/z/ATGDp3 which is clearly wrong but still checks out...) so wrote an exhaustive test here: https://godbolt.org/z/WYa561388 Closes #96625	2024-08-18 15:58:24 -07:00
Noah Goldstein	f16125a13c	[CodeGenPrepare][X86] Add tests for folding `urem` with loop invariant value; NFC	2024-08-18 15:58:24 -07:00
David Green	0e124537aa	[AArch64] Sink operands to fmuladd. (#102297 ) A fmuladd can be treated as a fma when sinking operands to the intrinsic, similar to D126234. Addresses a small part of #102195	2024-08-09 11:48:37 +01:00
Fangrui Song	8ea31db272	[CodeGenPrepare] Use MapVector to stabilize iteration order DenseMap iteration order is not guaranteed to be deterministic. Without the change, llvm/test/Transforms/CodeGenPrepare/X86/statepoint-relocate.ll would fail when `combineHashValue` changes (#95970). Fixes: dba7329ebb0dbe1fabb3faaedfd31da3b8bd611d	2024-06-18 17:19:51 -07:00
Stephen Tozer	094572701d	[RemoveDIs] Print IR with debug records by default (#91724 ) This patch makes the final major change of the RemoveDIs project, changing the default IR output from debug intrinsics to debug records. This is expected to break a large number of tests: every single one that tests for uses or declarations of debug intrinsics and does not explicitly disable writing records. If this patch has broken your downstream tests (or upstream tests on a configuration I wasn't able to run): 1. If you need to immediately unblock a build, pass `--write-experimental-debuginfo=false` to LLVM's option processing for all failing tests (remember to use `-mllvm` for clang/flang to forward arguments to LLVM). 2. For most test failures, the changes are trivial and mechanical, enough that they can be done by script; see the migration guide for a guide on how to do this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates 3. If any tests fail for reasons other than FileCheck check lines that need updating, such as assertion failures, that is most likely a real bug with this patch and should be reported as such. For more information, see the recent PSA: https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578	2024-06-14 15:07:27 +01:00
Nikita Popov	d10b76552f	[ConstantFold] Remove notional over-indexing fold (#93697 ) The data-layout independent constant folding currently has some rather gnarly code for canonicalizing GEP indices to reduce "notional overindexing", and then infers inbounds based on that canonicalization. Now that we canonicalize to i8 GEPs, this canonicalization is essentially useless, as we'll discard it as soon as the GEP hits the data-layout aware constant folder anyway. As such, I'd like to remove this code entirely. This shouldn't have any impact on optimization capabilities.	2024-05-30 08:36:44 +02:00
wanglei	9d4f7f44b6	[test][LoongArch] Add -mattr=+d option. NFC Because most of tests assume target-abi=`lp64d`, adding the corresponding feature is reasonable. rg -l loongarch -g '!*.s' | xargs sed -i '/mtriple=loongarch/ {/-mattr=/!{/target-abi/! s/mtriple=loongarch.. /&-mattr=+d /}}'	2024-05-14 20:23:04 +08:00
Yingwei Zheng	ab12bba0aa	[CGP] Drop poison-generating flags after hoisting (#90382 ) See the following case: ``` define i8 @src1(i8 %x) { entry: %cmp = icmp eq i8 %x, -1 br i1 %cmp, label %exit, label %if.then if.then: %inc = add nuw nsw i8 %x, 1 br label %exit exit: %retval = phi i8 [ %inc, %if.then ], [ -1, %entry ] ret i8 %retval } define i8 @tgt1(i8 %x) { entry: %inc = add nuw nsw i8 %x, 1 %0 = icmp eq i8 %inc, 0 br i1 %0, label %exit, label %if.then if.then: ; preds = %entry br label %exit exit: ; preds = %if.then, %entry %retval = phi i8 [ %inc, %if.then ], [ -1, %entry ] ret i8 %retval } ``` `optimizeBranch` converts `icmp eq X, -1` into cmp to zero on RISC-V and hoists the add into the entry block. Poison-generating flags should be dropped as they don't still hold. Proof: https://alive2.llvm.org/ce/z/sP7mvK Fixes https://github.com/llvm/llvm-project/issues/90380	2024-04-29 15:51:49 +08:00
Alex Bradbury	1c8410a67d	[CodeGenPrepare] Preserve flags (such as nsw/nuw) in SinkCast (#89904 ) As demonstrated in the test change, when deciding to sink a trunc we were losing its flags. This patch moves to cloning the original instruction instead.	2024-04-25 15:05:07 +01:00
Alex Bradbury	2554a85c03	[CodeGenPrepare][test] Add test for sinking of truncs demonstrating nsw/nuw are dropped SinkCast creates a new cast with the same type and inputs, which drops the nsw/nuw flags. Reviewed as part of <https://github.com/llvm/llvm-project/pull/89904> but split out so I can land the test separately.	2024-04-25 15:01:55 +01:00
Matthias Braun	652bcf685c	CodeGenPrepare: Add support for llvm.threadlocal.address address-mode sinking (#87844 ) Depending on the TLSMode many thread-local accesses on x86 can be expressed by adding a %fs: segment register to an addressing mode. Even if there are multiple users of a `llvm.threadlocal.address` intrinsic it is generally not worth sharing the value in a register but instead fold the %fs access into multiple addressing modes. Hence this changes CodeGenPrepare to duplicate the `llvm.threadlocal.address` intrinsic as necessary. Introduces a new `TargetLowering::addressingModeSupportsTLS` callback that allows targets to indicate whether TLS accesses can be part of an addressing mode. This is fixing a performance problem, as this folding of TLS-accesses into multiple addressing modes happened naturally before the introduction of the `llvm.threadlocal.address` intrinsic, but regressed due to `SelectionDAG` keeping things in registers when accessed across basic blocks, so CodeGenPrepare needs to duplicate to mitigate this. We see a ~0.5% recovery in a codebase with heavy TLS usage (HHVM). This fixes most of #87437	2024-04-17 12:48:02 -07:00
wanglei	8e4b0890a6	[LoongArch] Return true from shouldConsiderGEPOffsetSplit (#88371 ) Not performing GEP splits can prevent important optimizations, such as folding the element indices / member offsets (partially) into load/store instruction immediates.	2024-04-15 09:01:04 +08:00
wanglei	5fc8a190b3	[LoongArch] Pre commit test for #88371 . NFC	2024-04-12 17:57:28 +08:00
Yingwei Zheng	38a44bdc93	[CodeGenPrepare] Reverse the canonicalization of isInf/isNanOrInf (#81572 ) In commit `2b582440c1`, we canonicalize the isInf/isNanOrInf idiom into fabs+fcmp for better analysis/codegen (See also the discussion in https://github.com/llvm/llvm-project/pull/76338). This patch reverses the fabs+fcmp to `is.fpclass`. If the `is.fpclass` is not supported by the target, it will be expanded by TLI. Fixes the regression introduced by `2b582440c1` and https://github.com/llvm/llvm-project/pull/80414#issuecomment-1936374206.	2024-03-18 18:27:45 +08:00
Stephen Tozer	d128448efd	Revert "Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 )"" Reverted due to some test failures on some buildbots. https://lab.llvm.org/buildbot/#/builders/67/builds/14669 This reverts commit aa436493ab7ad4cf323b0189c15c59ac9dc293c7.	2024-02-27 10:17:24 +00:00
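
The `urem` fold described in the c64ce8bf28 and 1e072ae289 entries above can be illustrated with a small model. This is a minimal sketch, not the CodeGenPrepare implementation: the function names are hypothetical, Python integers stand in for fixed-width values, and the addition is assumed not to wrap (the `nuw` condition in the commit message). It shows the per-iteration remainder being replaced by a counter that is incremented and reset to zero when it reaches the divisor.

```python
def rems_naive(start, end, incr, rem_amt):
    """Reference form: one remainder operation per iteration."""
    return [(i + incr) % rem_amt for i in range(start, end)]


def rems_folded(start, end, incr, rem_amt):
    """Folded form: one remainder before the loop, then increment and wrap.

    Assumes start + incr does not overflow (the nuw requirement) and
    rem_amt > 0; both are assumptions of this sketch.
    """
    out = []
    rem = (start + incr) % rem_amt  # computed once, outside the loop
    for _ in range(start, end):
        out.append(rem)
        rem += 1
        if rem == rem_amt:  # wrap around instead of dividing again
            rem = 0
    return out
```

Comparing both functions over a range of inputs, in the spirit of the exhaustive uint8_t check linked from the commit message, is a cheap way to convince yourself the two forms agree.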

453 Commits