llvm-project

Author	SHA1	Message	Date
Joshua Cao	bd382032f6	[BBUtils][NFC] Delete SplitLandingPadPredecessors with DT (#73406) Function is marked for deprecation. There is only one consumer which is converted to use DomTreeUpdater.	2023-12-02 11:33:43 -08:00
Simon Pilgrim	3246a32d3f	Fix MSVC "not all control paths return a value" warning. NFC.	2023-11-30 10:07:01 +00:00
Philip Reames	e947f95337	[LSR][TTI][RISCV] Enable terminator folding for RISC-V If looking for a miscompile revert candidate, look here! The transform being enabled prefers comparing to a loop invariant exit value for a secondary IV over using an otherwise dead primary IV. This increases register pressure (by requiring the exit value to be live through the loop), but reduces the number of instructions within the loop by one. On RISC-V which has a large number of scalar registers, this is generally a profitable transform. We lose the ability to use a beqz on what is typically a count down IV, and pay the cost of computing the exit value on the secondary IV in the loop preheader, but save an add or sub in the loop body. For anything except an extremely short running loop, or one with extreme register pressure, this is profitable. On spec2017, we see a 0.42% geomean improvement in dynamic icount, with no individual workload regressing by more than 0.25%. Code size wise, we trade a (possibly compressible) beqz and a (possibly compressible) addi for an uncompressible beq. We also add instructions in the preheader. Net result is a slight regression overall, but neutral or better inside the loop. Previous versions of this transform had numerous corner-case correctness bugs. All of the ones I can spot by inspection have been fixed, and I have run this through all of spec2017, but there may be further issues lurking. Adding uses to an IV is a fraught thing to do given poison semantics, so this transform is somewhat inherently risky. This patch is a reworked version of D134893 by @eop. That patch has been abandoned since May, so I picked it up, reworked it a bit, and am landing it.	2023-11-29 12:04:06 -08:00
Nikita Popov	e408f70524	[LSR] Avoid use of ConstantExpr::getCast() (NFC) Use the constant folding API instead, which must succeed as we're working on a ConstantInt.	2023-11-02 09:48:04 +01:00
David Green	54e5de08d4	[ARM][LSR] Exclude uses outside the loop when favoring postinc. (#67090) Extra uses for variables outside the loop can mess with the generation of postinc variables. This patch alters the collection of loop invariant fixups in LSR when the target is optimizing for PostInc, to exclude the collection of these extra uses. It is expected that the variable can be rematerialized, which will lead to a more optimal sequence of instructions in the loop.	2023-09-25 10:09:36 +01:00
Nikita Popov	d35e5afc87	[LSR] Simplify type check for opaque pointers (NFC) For pointer types, checking the address space is the same as type equality now, so we no longer need the special case.	2023-09-22 10:23:04 +02:00
Bjorn Pettersson	db456dc6ba	[LSR] Drop support for typed pointers The opaque pointers are already "canonicalized". So remove the redundant/obsolete code.	2023-09-07 16:37:45 +02:00
Nuno Lopes	d8e2821170	[LSR] Use poison instead of undef as placeholder [NFC] This value is patched afterwards, and only used temporarily during dbg info construction.	2023-07-23 15:57:21 +01:00
Nikita Popov	ddb46abd3c	[LSR] Don't consider users of constant outside loop In CollectLoopInvariantFixupsAndFormulae(), LSR looks at users outside the loop. E.g. if we have an addrec based on %base, and %base is also used outside the loop, then we have to keep it in a register anyway, which may make it more profitable to use %base + %idx style addressing. This reasoning doesn't hold up when the base is a constant, because the constant can be rematerialized. The lsr-memcpy.ll test regressed when enabling opaque pointers, because inttoptr (i64 6442450944 to ptr) now also has a use outside the loop (previously it didn't due to a pointer type difference), and that extra "use" results in worse use of addressing modes in the loop. However, the use outside the loop actually gets rematerialized, so the alleged register saving does not occur. The same reasoning also applies to other types of constants, such as global variable references. Differential Revision: https://reviews.llvm.org/D155073	2023-07-13 12:22:38 +02:00
Florian Hahn	7f5b15ad15	[LSR] Move normalization check to normalizeForPostIncUse. Move the logic added in 3a57152d85e1 to normalizeForPostIncUse to catch additional un-invertable cases. This fixes another mis-compile pointed out by @peixin in D153004.	2023-07-04 11:56:51 +01:00
Florian Hahn	3a57152d85	[LSR] Return nullptr from getExpr if the result isn't invertible. getExpr is missing a check to make sure the result is invertible. This can lead to incorrect results, so return nullptr in those cases like in other places in IVUsers. Fixes #62660. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D153202	2023-06-22 19:10:48 +01:00
Luke Lau	438cc10b8e	[IR] Add getAccessType to Instruction There are multiple places in the code where the type of memory being accessed from an instruction needs to be obtained, including an upcoming patch to improve GEP cost modeling. This deduplicates the logic between them. It's not strictly NFC as EarlyCSE/LoopStrengthReduce may catch more intrinsics now. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D150583	2023-06-21 16:17:25 +01:00
Florian Hahn	dae5cd73cb	Recommit "[LSR] Consider post-inc form when creating extends/truncates." This reverts the revert commit 1797ab36efc9c90c921cd725831f8c3f6a7125a2. The recommitted version now checks the PostIncLoopSets for all fixups and returns nullptr if the result doesn't match for all fixups.	2023-06-19 17:57:06 +01:00
Florian Hahn	1797ab36ef	Revert "[LSR] Consider post-inc form when creating extends/truncates." This reverts commit abfeda5af329b5889db709ff74506e20e0b569e9. and fe19036e1266d2a90b44725c82b898134906e4c3. The added assertion triggers during clang bootstrap builds. Revert while I investigate.	2023-06-17 17:58:41 +01:00
Florian Hahn	abfeda5af3	[LSR] Consider post-inc form when creating extends/truncates. GenerateTruncates at the moment creates extends/truncates for post-inc uses of normalized expressions. For example, if an add rec of the form {1,+,-1} is used outside the loop, the normalized form will use {1,+,-1} instead of {0,+,-1}. When naively sign-extending the normalized expression, it will get extended incorrectly to {1,+,-1} for the wider type, if the backedge-taken count of the loop is 1. To address this, the patch updates GenerateTruncates to check if the LSRUse contains any fixups with PostIncLoops. If that's the case, first de-normalize the expression, then perform the extend/truncate, then normalize again. There may be other places where similar checks are needed and the helper can be generalized for those cases. I'd not be surprised if other subtle mis-compiles are caused by this. Fixes #38847. Fixes #58039. Fixes #62852. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153004	2023-06-17 09:58:37 +01:00
Nikita Popov	143ed21b26	Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)" This reverts commit 5362a0d859d8e96b3f7c0437b7866e17a818a4f7. In preparation for reverting a dependent revision.	2023-06-05 16:45:38 +02:00
Nikita Popov	d5c56c5162	[SCEVExpander] Remember phi nodes inserted by LCSSA construction SCEVExpander keeps track of all instructions it inserted. However, it currently misses some phi nodes created during LCSSA construction. Fix this by collecting these into another argument. This also removes the IRBuilder argument, which was added for essentially the same purpose, but only handles the root LCSSA nodes, not those inserted by SSAUpdater. This was reported as a regression on D149344, but the reduced test case also reproduces without it. Differential Revision: https://reviews.llvm.org/D150681	2023-05-25 09:34:19 +02:00
Nikita Popov	5362a0d859	[LCSSA] Remove unused ScalarEvolution argument (NFC) After D149435, LCSSA formation no longer needs access to ScalarEvolution, so remove the argument from the utilities.	2023-05-02 12:17:05 +02:00
Momchil Velikov	6c9066fe2e	Recommit "[AArch64] Fix incorrect `isLegalAddressingMode`" This patch recommits 0827e2fa3fd15b49fd2d0fc676753f11abb60cab after reverting it in ed7ada259f665a742561b88e9e6c078e9ea85224. Added workaround for `TargetLowering::AddrMode` no longer being an aggregate in C++20. `AArch64TargetLowering::isLegalAddressingMode` has a number of defects, including accepting an addressing mode which consists of only an immediate operand, or not checking the offset range for an addressing mode in the form `1*ScaledReg + Offs`. This patch fixes the above issues. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D143895 Change-Id: I41a520c13ce21da503ca45019979bfceb8b648fa	2023-04-21 16:21:01 +01:00
Momchil Velikov	ed7ada259f	Revert "[AArch64] Fix incorrect `isLegalAddressingMode`" This reverts commit 0827e2fa3fd15b49fd2d0fc676753f11abb60cab. Failing buildbot, perhaps due to `-std=c++20`.	2023-04-20 16:10:45 +01:00
Momchil Velikov	0827e2fa3f	[AArch64] Fix incorrect `isLegalAddressingMode` `AArch64TargetLowering::isLegalAddressingMode` has a number of defects, including accepting an addressing mode which consists of only an immediate operand, or not checking the offset range for an addressing mode in the form `1*ScaledReg + Offs`. This patch fixes the above issues. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D143895 Change-Id: I756fa21941844ded44f082ac7eea4391219f9851	2023-04-20 15:43:11 +01:00
Kazu Hirata	7b014a0732	[Scalar] Use range-based for loops (NFC)	2023-04-16 09:05:20 -07:00
Dmitry Makogon	3d7242f05e	Reapply "[LSR] Preserve LCSSA when rewriting instruction with PHI user" This reverts commit efd34ba60f3839b0a68b2e32ff9011b6823bc16f. Reapplies 8ff4832679e1. Missed a failing test. Needed to just update test checks.	2023-04-06 17:31:27 +07:00
Nico Weber	efd34ba60f	Revert "[LSR] Preserve LCSSA when rewriting instruction with PHI user" This reverts commit 8ff4832679e1ff2d2a1cfaa45bb5cb995b0685a1. Breaks tests, see https://reviews.llvm.org/D146811#4232839	2023-03-30 06:40:16 -04:00
Dmitry Makogon	8ff4832679	[LSR] Preserve LCSSA when rewriting instruction with PHI user Fixes https://github.com/llvm/llvm-project/issues/61182. LoopStrengthReduce may sometimes break LCSSA form when applying a rewrite for an instruction used in a PHI. It happens if: - The PHI is in a loop exit block, - The edge from the corresponding exiting block to that exit is critical, - The PHI has at least two inputs coming from loop blocks, - and the rewritten instruction is inserted in the loop. In such case we split the critical edge and then replace PHI inputs with the rewritten instruction. However ExitBlock is no longer a loop exit, so LCSSA form is broken. This patch fixes it by collecting all inserted instructions for PHIs whose parent block is not a loop exit and then forming LCSSA for them. Differential Revision: https://reviews.llvm.org/D146811	2023-03-30 14:46:28 +07:00
Philip Reames	6afcc54ac7	[SCEV] Infer no-self-wrap via constant ranges Without this, pointer IVs in loops with small constant trip counts couldn't be proven no-self-wrap. This came up in a new LSR transform, but may also benefit other SCEV consumers as well. Differential Revision: https://reviews.llvm.org/D146596	2023-03-22 12:06:28 -07:00
Philip Reames	00fdd2cb6c	[LSR] Don't crash on non-branch terminator in -lsr-term-fold Reported in https://reviews.llvm.org/D146415. I rewrote the patch and added the test case. Per that report, spec2006.483.xalancbmk crashes without this fix.	2023-03-21 09:30:01 -07:00
Philip Reames	e9df5d62c8	[LSR] Remove a couple stale comments in lsr-term-fold	2023-03-21 09:21:30 -07:00
Philip Reames	53e9a5ddc0	[LSR] Fix "new use of poison" problem in lsr-term-fold This models the approach used in LFTR. The short summary is that we need to prove the IV is not dead first, and then we have to either prove the poison flag is valid after the new user or delete it. There are two key differences between this and LFTR. First, I allow a non-concrete start to the IV. The goal of LFTR is to canonicalize and IVs with constant starts are canonical, so the very restrictive definition there is mostly okay. Here on the other hand, we're explicitly moving away from the canonical form, and thus need to handle non-constant starts. Second, LFTR bails out instead of removing inbounds on a GEP. This is a pragmatic tradeoff since inbounds is hard to infer and assists aliasing. This pass runs very late, and I think the tradeoff runs the other way. A different approach we could take for the post-inc check would be to perform a pre-inc check instead of a post-inc check. We would still have to check the pre-inc IV, but that would avoid the need to drop inbounds. Doing the pre-inc check would basically trade killing a whole IV for an extra register move in the loop. I'm open to suggestions on the right approach here. Note that this analysis is quite expensive compile time wise. I have made no effort to optimize (yet). Differential Revision: https://reviews.llvm.org/D146464	2023-03-21 08:23:40 -07:00
Philip Reames	b33f5e7ed3	[LSR] Use evaluateAtIteration in lsr-term-fold This is a follow up to one of the side discussions on D146429. There are two semantic changes contained here. The motivation for the change to the legality condition introduced in D146429 comes from the fact that we only check the post-inc form. As such, as long as the values of the post-inc variable don't self wrap, it's actually okay if we wrap past the starting value of the pre-inc IV. Second, Nikic noticed during review that the test changes changed behavior for TC=0 (i.e. N=0 in the tests). On more careful inspection, it became apparent that the previous manual expansion code was incorrect in the case where the primary IV could wrap without poison, and started with the limit value (i.e. i8 post-inc starts at 255 for 0 exit test, implying pre-inc starts with 0). See @wrap_around test for an example of the (previous) miscompile. Differential Revision: https://reviews.llvm.org/D146457	2023-03-21 08:11:36 -07:00
Philip Reames	091422adc1	[LSR] Fix wrapping bug in lsr-term-fold logic The existing logic was unsound, in two ways. First, due to wrapping on the trip count computation, it could compute a value which converts a loop which exits on iteration 256 into one which exits at 255. (With i8 trip counts.) Second, it allowed rewriting when the trip count implies wrapping around the alternate IV. As a trivial example, it allowed rewriting an i128 exit test in terms of an i64 IV. This is obviously wrong. Note that the test change is fairly minimal - i.e. only the targeted test - but that's only because I precommitted a change which switched the test from 32 to 64 bit pointers. For 32 bit pointer architectures with 32 bit primary inductions, this transform is almost always unsound to perform. Differential Revision: https://reviews.llvm.org/D146429	2023-03-20 13:47:21 -07:00
Philip Reames	272ebd6957	[LSR] Inline getAlternateIVEnd and simplify [nfc] Also, add a comment to highlight that the "good" result on this test is accidental, and not based on a principled decision. I matched the original behavior to make this nfc, but selecting the last legal IV is not well motivated here.	2023-03-20 11:22:21 -07:00
Philip Reames	b9521484ec	[LSR] Rewrite IV match for term-fold using existing utilities Main benefit here is making the logic easier to follow, slightly more efficient, and more in line with LFTR. This is not NFC. There are three semantic changes here. First, we drop handling for constants on the LHS of the comparison. These are non-canonical, and we're very late in the optimization pipeline here, so there's no point in supporting this. I removed a test which covered this case. Second, we don't need the almost dead IV to be an addrec. We just need SCEV to be able to compute a trip count for it. Third, we require a simple IV for the almost dead IV. In theory, this removes cases we could have previously handled, but given a) zero testing and b) multiple known correctness issues, I'm adopting an attitude of narrowing this down to something which works correctly, and then expanding.	2023-03-20 10:41:01 -07:00
Mark Goncharov	e4dd7ec39f	[LSR] Fold terminating condition not only for eq and ne. Add opportunity to fold any icmp instruction.	2023-03-20 13:42:27 +03:00
Philip Reames	c60e0b66ad	[LSR] Another minor code style improvement [nfc]	2023-03-17 08:09:01 -07:00
Philip Reames	cd47f5bb59	[LSR] Minor code style improvement [nfc]	2023-03-17 07:50:59 -07:00
Paul Walker	62d11b2cca	Revert "Revert "[SCEV] Add SCEVType to represent `vscale`."" Relanding after fixing Polly related build error. This reverts commit 7b26dcae9eaf8cdcba7fef032fd83d060dffd4b4.	2023-03-02 13:14:07 +00:00
Paul Walker	7b26dcae9e	Revert "[SCEV] Add SCEVType to represent `vscale`." This reverts commit 7912f5cc92f65ad0d3c705f3683a0b69dbedcc57.	2023-03-02 11:59:50 +00:00
Paul Walker	7912f5cc92	[SCEV] Add SCEVType to represent `vscale`. This is part of an effort to remove ConstantExpr based representations of `vscale` so that its LangRef definition can be relaxed to accommodate a less strict definition of constant. Differential Revision: https://reviews.llvm.org/D144891	2023-03-02 11:11:36 +00:00
David Green	74b67e53c6	[LSR] Fix incorrect check in 73cd3d4391ad47ae7 I missed that the test needed a icelake-server cpu to fail, and left a testing "false &&" in the if condition. Hopefully this is now the correct fix.	2023-02-22 23:42:21 +00:00
David Green	73cd3d4391	[LSR] Prevent creating SCEVs of addrecs from mismatching loops LSR can include Regs of AddRec SCEVs from different loops, which do not combine well when added in Scalar Evolution. As they should never produce constant differences, we can just guard against trying to create them. Fixes #60927	2023-02-22 22:50:37 +00:00
Kazu Hirata	a28b252d85	Use APInt::getSignificantBits instead of APInt::getMinSignedBits (NFC) Note that getMinSignedBits has been soft-deprecated in favor of getSignificantBits.	2023-02-19 23:56:52 -08:00
Kazu Hirata	f8f3db2756	Use APInt::count{l,r}_{zero,one} (NFC)	2023-02-19 22:04:47 -08:00
David Green	7abe3497e7	[LSR] Improve filtered uses in NarrowSearchSpaceByPickingWinnerRegs NarrowSearchSpaceByPickingWinnerRegs has an aggressive filtering method to reduce the complexity of the search space down by picking a best formula with the highest number of reuses and assuming it will yield profitable reuse. In certain cases we can find a best formula like {X+30,+,1} and later check a formula like {X,+,1} with the same number of Uses. On some architectures it can be better to pick {X,+,1}, especially if an offset of 30 can be used as a legal addressing mode, but -30 cannot. That happens under Thumb1 code, which has fairly limited addressing modes. This patch adds a check to see if it can pick the simpler formula, if it looks more profitable. Differential Revision: https://reviews.llvm.org/D144014	2023-02-16 15:48:12 +00:00
chenglin.bi	14dedd9cf5	[Reland][LSR] Hoist IVInc to loop header if its all uses are in the loop header The original code would crash when the load/store memory type is a structure, because isIndexedLoadLegal/isIndexedStore doesn't support struct types. So we limit the load/store memory type to integer. Original commit message: When the latch block is different from the header block, IVInc will be expanded in the latch block. We can't generate the post index load/store in this case. But if the IVInc is only used in the loop, we can actually still use the post index load/store, because when exiting the loop we don't care about the last IVInc value. So, try to hoist IVInc to help the backend generate more post index load/store. Fix #53625 Reviewed By: eopXD Differential Revision: https://reviews.llvm.org/D138636	2023-02-10 16:52:00 +08:00
Kazu Hirata	55e2cd1609	Use llvm::count{lr}_{zero,one} (NFC)	2023-01-28 12:41:20 -08:00
Haojian Wu	778a582e8e	Fix a -Wunused-variable warning in release build.	2023-01-20 23:40:33 +01:00
Philip Reames	915bcb0629	[LSR] Style cleanup for code recently added in D132443 Also add FIXMEs which highlight correctness bugs in this recently added off by default option. These have also been raised on the original review.	2023-01-20 12:45:37 -08:00
Philip Reames	7ad786a29e	[LSR] Generalize one aspect of terminator folding (recently introduced in D132443) There's no need to require the start value to come directly from the loop predecessor. This was sometimes covering up a latent miscompile in this off-by-default option, but the miscompile needs fixed anyways and the issue has been raised on the original review. Differential Revision: https://reviews.llvm.org/D142240	2023-01-20 12:19:43 -08:00
chenglin.bi	b84ab1f7c9	Revert "[LSR] Hoist IVInc to loop header if its all uses are in the loop header" The original commit seems to cause a regression in numba test. This reverts commit b1b4758e7f4b2ffe1faa28b00eb037832e5d26a7.	2023-01-11 01:24:34 +08:00
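
The wrapping problem described in 091422adc1 (a loop exiting on iteration 256 being treated as exiting at 255, or worse, when trip counts are i8) can be illustrated with a small standalone sketch. This is not the LSR code: the helper names `tripCountI8`, `tripCountWide` and `evalAtIteration` are hypothetical, and the sketch only assumes the usual reading of an affine recurrence {start,+,step} as start + step * n at iteration n.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: trip count computed as backedge-taken count + 1,
// with the addition wrapping in 8 bits (the narrow IV type).
constexpr uint8_t tripCountI8(uint8_t backedgeTakenCount) {
  return static_cast<uint8_t>(backedgeTakenCount + 1);
}

// Same computation, but widened first so that 255 + 1 stays 256.
constexpr uint64_t tripCountWide(uint8_t backedgeTakenCount) {
  return static_cast<uint64_t>(backedgeTakenCount) + 1;
}

// Hypothetical helper: value of the affine recurrence {start,+,step}
// at iteration n, evaluated in 64 bits.
constexpr uint64_t evalAtIteration(uint64_t start, uint64_t step, uint64_t n) {
  return start + step * n;
}

// An i8 primary IV whose backedge is taken 255 times runs 256 iterations.
// Computing the exit value of a secondary 64-bit IV {1000,+,4} from the
// narrow trip count gives the start value instead of the real exit value.
static_assert(tripCountI8(255) == 0, "256 does not fit in i8");
static_assert(tripCountWide(255) == 256, "widened count is exact");
static_assert(evalAtIteration(1000, 4, tripCountWide(255)) == 2024,
              "exit value with the exact trip count");
static_assert(evalAtIteration(1000, 4, tripCountI8(255)) == 1000,
              "exit value with the wrapped trip count");
```

The point of the sketch is only that the trip count has to be computed in a type wide enough to hold it before it is used to derive the exit value of another IV; how LSR actually guards against this is described in the commit itself.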


953 Commits