llvm-project

Author	SHA1	Message	Date
Stephen Tozer	86405ed101	[DebugInfo][Reassociate] Preserve DebugLocs when reassociating subs (#114226) In NegateValue in Reassociate, we return the negation of an existing value in order to break a subtract into a negate + add, potentially creating a new instruction to perform the negation, but we neglect to propagate the DebugLoc of the sub being replaced to the negate instruction if one is created. This patch adds that propagation. Found using https://github.com/llvm/llvm-project/pull/107279.	2024-11-08 18:35:03 +00:00
Kazu Hirata	2f55e55101	[Transforms] Use range-based for loops (NFC) (#98725)	2024-07-14 13:44:50 -07:00
Kazu Hirata	37d3f44a58	[Transforms] Use range-based for loops (NFC) (#98465)	2024-07-12 00:05:28 -07:00
Noah Goldstein	6e379de3b1	[Reassociate] Preserve `nuw` and `nsw` on `mul` chains Basically the same rules as `add` but we also need to ensure all operands are non-zero. Proofs: https://alive2.llvm.org/ce/z/jzsYht Closes #97040	2024-07-01 22:22:36 +08:00
Nikita Popov	2d209d964a	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.	2024-06-27 16:38:15 +02:00
Nikita Popov	35eef9f97f	[Reassociate] Use poison instead of undef for dummy operands (NFCI) These will be replaced later.	2024-06-25 12:44:11 +02:00
Nikita Popov	5aaf2ab085	[Reassociate] Avoid use of ConstantExpr::getShl() Use the constant folding API instead.	2024-06-18 16:59:51 +02:00
Shan Huang	470d59d656	[DebugInfo][Reassociate] Fix missing debug location drop (#95355) Fix #95343.	2024-06-17 09:06:20 +08:00
Kazu Hirata	7c6d0d26b1	[llvm] Use llvm::unique (NFC) (#95628)	2024-06-14 22:49:36 -07:00
Yingwei Zheng	645fb04a33	[Reassociate] Use uint64_t for repeat count (#94232) This patch relands #91469 and uses `uint64_t` for repeat count to avoid a miscompilation caused by overflow https://github.com/llvm/llvm-project/pull/91469#discussion_r1623925158.	2024-06-08 22:28:56 +08:00
Yingwei Zheng	22b63b97ff	Revert "[Reassociate] Drop weight reduction to fix issue 91417 (#91469)" (#94210) Reverts `3bcccb6af6` and `9a282724a2` because #91469 causes a miscompilation https://github.com/llvm/llvm-project/pull/91469#discussion_r1623925158.	2024-06-03 21:40:06 +08:00
Yingwei Zheng	3bcccb6af6	[Reassociate] Drop weight reduction to fix issue 91417 (#91469)
See the following case: https://alive2.llvm.org/ce/z/A-fBki
```
define i3 @src(i3 %0) {
  %2 = mul i3 %0, %0
  %3 = mul i3 %2, %0
  %4 = mul i3 %3, %0
  %5 = mul nsw i3 %4, %0
  ret i3 %5
}

define i3 @tgt(i3 %0) {
  %2 = mul i3 %0, %0
  %5 = mul nsw i3 %2, %0
  ret i3 %5
}
```
`d7aeefebd6` introduced weight reduction during weights combination of the same operand. As the weight of `%0` changes from 5 to 3, the nsw flag in `%5` should be dropped. However, the nsw flag isn't cleared by `RewriteExprTree` since `%5 = mul nsw i3 %0, %4` is not included in the range of `[ExpressionChangedStart, ExpressionChangedEnd)`.
```
Calculated Rank[] = 3
Combine negations for: %2 = mul i3 %0, %0
Calculated Rank[] = 4
Combine negations for: %3 = mul i3 %0, %2
Calculated Rank[] = 5
Combine negations for: %4 = mul i3 %0, %3
Calculated Rank[] = 6
Combine negations for: %5 = mul nsw i3 %0, %4
LINEARIZE: %5 = mul nsw i3 %0, %4
OPERAND: i3 %0 (1)
ADD USES LEAF: i3 %0 (1)
OPERAND: %4 = mul i3 %0, %3 (1)
DIRECT ADD: %4 = mul i3 %0, %3 (1)
OPERAND: i3 %0 (1)
OPERAND: %3 = mul i3 %0, %2 (1)
DIRECT ADD: %3 = mul i3 %0, %2 (1)
OPERAND: i3 %0 (1)
OPERAND: %2 = mul i3 %0, %0 (1)
DIRECT ADD: %2 = mul i3 %0, %0 (1)
OPERAND: i3 %0 (1)
OPERAND: i3 %0 (1)
RAIn: mul i3 [ %0, #3] [ %0, #3] [ %0, #3]
RAOut: mul i3 [ %0, #3] [ %0, #3] [ %0, #3]
RAOut after CSE reorder: mul i3 [ %0, #3] [ %0, #3] [ %0, #3]
RA: %5 = mul nsw i3 %0, %4
TO: %5 = mul nsw i3 %4, %0
RA: %4 = mul i3 %0, %3
TO: %4 = mul i3 %0, %0
```
The best way to fix this is to inform `RewriteExprTree` to clear flags of the whole expr tree when weight reduction happens. But I find that weight reduction based on Carmichael number never happens in practice. See the coverage result https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/home/dtcxzyw/llvm-project/llvm/lib/Transforms/Scalar/Reassociate.cpp.html#L323 I think it would be better to drop `IncorporateWeight`.
Fixes #91417	2024-05-29 18:09:23 +08:00
Akshay Deodhar	73e22ff3d7	[Reassociate] Preserve NSW flags after expr tree rewriting (#93105)
We can guarantee NSW on all operands in a reassociated add expression tree when:
- All adds in an add operator tree are NSW, AND either
  - All add operands are guaranteed to be nonnegative, OR
  - All adds are also NUW
- Alive2:
  - Nonnegative Operands
    - 3 operands: https://alive2.llvm.org/ce/z/G4XW6Q
    - 4 operands: https://alive2.llvm.org/ce/z/FWcZ6D
  - NUW NSW adds: https://alive2.llvm.org/ce/z/vRUxeC
---------
Co-authored-by: Nikita Popov <github@npopov.com>	2024-05-28 11:05:38 -07:00
Jeremy Morse	2fe81edef6	[NFC][RemoveDIs] Insert instruction using iterators in Transforms/
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block needs to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way.
Noteworthy changes:
* FindInsertedValue now takes an optional iterator rather than an instruction pointer, as we need to always insert with iterators,
* I've added a few iterator-taking versions of some value-tracking and DomTree methods -- they just unwrap the iterator. These are purely convenience methods to avoid extra syntax in some passes.
* A few calls to getNextNode become std::next instead (to keep in the theme of using iterators for positions),
* SeparateConstOffsetFromGEP has its insertion-position field changed. Noteworthy because it's not a purely localised spelling change.
All this should be NFC.	2024-03-05 15:12:22 +00:00
Jeremy Morse	7e88d51760	[NFC][RemoveDIs] Have CreateNeg only accept iterators (#82999)
Removing debug-intrinsics requires that we always insert with an iterator, not with an instruction position. To enforce that, we need to eliminate the `Instruction *` taking functions. It's safe to leave the insert-at-end-of-block functions as the intention is clear for debug info purposes (i.e., insert after both instructions and debug-info at the end of the function).
This patch demonstrates how that needs to happen. At a variety of call-sites to the `CreateNeg` constructor we need to consider:
* Has this instruction been selected because of the operation it performs? In that case, just call `getIterator` and pass an iterator in.
* Has this instruction been selected because of its position? If so, we need to keep the iterator identifying that position (see the 3rd hunk changing Reassociate.cpp, although it's coincidentally not debug-info significant).
This also demonstrates what we'll try and do with the constructor methods going forwards: have one fully explicit set of parameters including iterator, and another with default-arguments where the block-to-insert-into argument defaults to nullptr / no-position, creating an instruction that hasn't been inserted yet.	2024-02-29 13:00:29 +00:00
Yingwei Zheng	312cb34da6	[Reassociate] Preserve NUW flags after expr tree rewriting (#72360) Alive2: https://alive2.llvm.org/ce/z/38KiC_	2023-12-09 16:45:48 +08:00
Craig Topper	533a0856bf	Recommit "[Reassociate] Use disjoint flag to convert Or to Add. (#72772)" Original message: We still have to keep the noCommonBitsSet call to handle multiple reassociations in one pass. We'll lose the flag on the first reassociation.	2023-12-06 14:16:56 -08:00
Craig Topper	92fccea2e5	Revert "[Reassociate] Use disjoint flag to convert Or to Add. (#72772)" This reverts commit 78964457cf1bafe57a54629fafbd081452a9e528. Looks like I didn't rebase this correctly before commit	2023-12-06 13:50:21 -08:00
Craig Topper	78964457cf	[Reassociate] Use disjoint flag to convert Or to Add. (#72772) We still have to keep the noCommonBitsSet call to handle multiple reassociations in one pass. We'll lose the flag on the first reassociation.	2023-12-06 13:48:15 -08:00
Joshua Cao	72ffaa9156	[IR][TRE] Support associative intrinsics (#74226)
There is support for intrinsics in Instruction::isCommutative, but there is no equivalent implementation for isAssociative. This patch builds support for associative intrinsics with TRE as an application. TRE can now have associative intrinsics as an accumulator. For example:
```
struct Node {
  Node *next;
  unsigned val;
};
unsigned maxval(struct Node *n) {
  if (!n) return 0;
  return std::max(n->val, maxval(n->next));
}
```
Can be transformed into:
```
unsigned maxval(struct Node *n) {
  struct Node *head = n;
  unsigned max = 0; // Identity of unsigned std::max
  while (true) {
    if (!head) return max;
    max = std::max(max, head->val);
    head = head->next;
  }
  return max;
}
```
This example results in about 5x speedup in local runs. We conservatively only consider min/max as associative for this patch to limit testing scope. There are probably other intrinsics that could be considered associative. There are a few consumers of isAssociative() that could be impacted. Testing has only required the Reassociate pass to be updated.	2023-12-04 22:35:59 -08:00
Jeremy Morse	2425e2940e	[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149) Part of the "RemoveDIs" project to remove debug intrinsics requires passing block-positions around in iterators rather than as instruction pointers, allowing some debug-info to reside in BasicBlock::iterator. This means getInsertionPointAfterDef has to return an iterator, and as it can return no-instruction that means returning an optional iterator. This patch changes the signature for getInsertionPtAfterDef and then patches up the various places that use it to handle the different type. This would overall be an NFC patch, however in InstCombinerImpl::freezeOtherUses I've started skipping any debug intrinsics at the returned insert-position. This should not have any _meaningful_ effect on the compiler output: at worst it means variable assignments that are skipped will now cover the freeze instruction and anything inserted before it, which should be inconsequential. Sadly: this makes the function signature ugly. This is probably the ugliest piece of fallout for the "RemoveDIs" work, but it serves the overall purpose of improving compile times and not allowing `-g` to affect compiler output, so should be worthwhile in the end.	2023-11-30 12:19:57 +00:00
Nikita Popov	80fa5a6377	[ValueTracking] Use SimplifyQuery in haveNoCommonBitsSet() (NFC) Pass SimplifyQuery instead of unpacked list of arguments.	2023-10-10 11:39:59 +02:00
David Green	db32d11a38	[Reassociate] Keep flags for more unchanged operations
Reassociation destroys nsw/nuw flags from BinOps that are changed. But if the expression at the end of an altered tree didn't change itself, its flags do not need to be removed. For example, if %a, %b and %c are reassociated in
%x = add nsw i32 %a, %c
%y = add nsw i32 %x, %b
%z = add nsw i32 %y, %d
the value of %y, and so of add %y, %d, remains the same, and %z needn't drop the nsw flags.
https://alive2.llvm.org/ce/z/_juAiV
Differential Revision: https://reviews.llvm.org/D154289	2023-07-03 10:05:40 +01:00
Quentin Colombet	a4e88cba18	[Reassociation] Only form CSE expressions for local operands
# TL;DR #
This patch constrains how much freedom the heuristic that tries to form CSE expressions has. The added constraint is that the CSE-able expressions must be within the same basic block as the expressions they get moved before.
# Details #
The reassociation pass currently tweaks the rewrite of the final expression towards surfacing pairs of operands that would be CSE-able. This heuristic applies after the regular ordering of the expression. The regular ordering uses the program structure to choose in which order each subexpression is materialized. That order follows the topological order.
Now, to expose more CSE opportunities, this heuristic effectively bypasses the previous ordering normally defined by the program and pushes up sub-expressions that are arbitrarily deep in the CFG.
E.g., let's say the program order (top to bottom) gives `(((a*b)*c)*d)*e` and `b*e` appears the most in the program. The expression will be reordered in `(((b*e)*a)*c)*d`.
This reordering implies that all the sub expressions (in this example `xx*a`, then `yy*c`, etc.) will need to appear after the CSE-able expression. This may over-constrain where the (sub) expressions may live and in particular it may create loop-dependent expressions.
This patch only allows moving expressions up the expression chain when the related values are defined in the same basic block as the ones they "push-down". This constraint is far from being perfect but at least it avoids accidentally creating loop dependent variables. If we really want to expose CSE-able expressions in a proper way, we would need a profitability metric and also make the decision globally as opposed to one chain at a time.
I've put the new constraint behind an option to make comparing the old and new versions easy. However, I believe that even if we find cases where the old version performs better it is probably by accident.
What I am aiming for with this change is more predictability, then we can improve if need be. This fixes www.llvm.org/PR61458 Differential Revision: https://reviews.llvm.org/D147457	2023-06-26 11:58:03 +02:00
Kazu Hirata	7b014a0732	[Scalar] Use range-based for loops (NFC)	2023-04-16 09:05:20 -07:00
Kazu Hirata	c8f9555c4d	[Transforms] Use *{Set,Map}::contains (NFC)	2023-03-14 00:24:30 -07:00
Sanjay Patel	4ca25c66d4	[Reassociate] prevent partial undef negation replacement As shown in the examples in issue #57683, we allow matching vectors with poison (undef) in this transform (and possibly more), but we can't then use the partially defined value as a replacement value in other expressions blindly. This seems to be avoided in simpler examples of reassociation, and other passes should be able to clean up the redundant op seen in these tests.	2022-09-12 12:28:34 -04:00
Nikita Popov	f42d92611d	[Reassociate] Avoid ConstantExpr::getFNeg() (NFCI) Use ConstantFoldUnaryOpOperand() instead. Also make the code below robust against non-instruction users, just in case it doesn't fold.	2022-09-07 10:48:08 +02:00
Nikita Popov	8f3fd26b74	[Reassociate] Use getInsertionPointAfterDef() This simplifies the code and fixes handling for the callbr case, where the instruction needs to be inserted in the normal destination, rather than after the terminator. Originally part of D129660.	2022-08-31 11:10:24 +02:00
Kazu Hirata	b18ff9c461	[Transform] Use range-based for loops (NFC)	2022-08-27 23:54:32 -07:00
Kazu Hirata	e20d210eef	[llvm] Qualify auto (NFC) Identified with readability-qualified-auto.	2022-08-07 23:55:27 -07:00
Warren Ristow	3bbd380a5b	[Reassociate][NFC] Use an appropriate dyn_cast for BinaryOperator In D129523, it was noted that there are some questionable naked casts from Instruction to BinaryOperator, which could be addressed by doing a dyn_cast directly to BinaryOperator, avoiding the need for the later cast. This cleans up that casting. Reviewed By: nikic, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D130448	2022-07-25 10:24:43 -07:00
Warren Ristow	3089b411a4	[Reassociate][NFC] Consistent checking for FastMathFlags suitability In D129523, it was noted that the approach to check whether a value can have FastMathFlags was done in different ways, and they should be made consistent. This patch makes minor changes to fix that. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D130408	2022-07-24 17:44:30 -07:00
Warren Ristow	c650793049	[Reassociate] Enable FP reassociation via 'reassoc' and 'nsz' Compiling with '-ffast-math' turns on all the FastMathFlags (FMF), as expected, and that enables FP reassociation. Only the two FMF flags 'reassoc' and 'nsz' are technically required to perform reassociation, but disabling other unrelated FMF bits is needlessly suppressing the optimization. This patch fixes that needless suppression, and makes appropriate adjustments to test-cases, fixing some outstanding TODOs in the process. Fixes: #56483 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D129523	2022-07-15 11:44:35 -07:00
Warren Ristow	230c8c56f2	[Reassociate] Cleanup minor missed optimizations In analyzing issue #56483, it was noticed that running `opt` with `-reassociate` was missing some minor optimizations. For example, there were cases where running `opt` on IR with floating-point instructions that have the `fast` flags applied sometimes resulted in less efficient code than the input IR (things like dead instructions left behind, and missed reassociations). These were sometimes noted in the test-files with TODOs, to investigate further. This commit fixes some of these problems, removing some TODOs in the process. FTR, I refer to these as "minor" missed optimizations, because when running a full clang/llvm compilation, these inefficiencies are not happening, as other passes clean that residue up. Regardless, having cleaner IR produced by `opt` makes assessing the quality of fixes done in `opt` easier.	2022-07-14 08:21:04 -07:00
Nikita Popov	93cbdaef04	[Reassociate] Avoid ConstantExpr::get() Use ConstantFoldBinaryOpOperands() instead, to handle the case where not all binary ops have a constant expression variant. This is a bit awkward because we only want to pop the element from Ops once we're sure that it has folded.	2022-07-04 15:17:22 +02:00
Nuno Lopes	53dc0f1078	[NFC] Switch a few uses of undef to poison as placeholders for unreachable code	2022-07-03 14:34:03 +01:00
Philip Reames	ee7324b898	Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc]	2022-03-21 10:01:40 -07:00
serge-sans-paille	59630917d6	Cleanup includes: Transform/Scalar Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817	2022-03-03 07:56:34 +01:00
Kazu Hirata	fd7d40640d	[llvm] Use range-based for loops (NFC)	2021-11-28 18:14:49 -08:00
Zarko Todorovski	0d3add216f	[llvm][NFC] Inclusive language: Reword replace uses of sanity in llvm/lib/Transform comments and asserts Reworded some comments and asserts to avoid usage of `sanity check/test` Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D114372	2021-11-23 13:22:55 -05:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Arthur Eubanks	6b9524a05b	[NewPM] Don't mark AA analyses as preserved Currently all AA analyses marked as preserved are stateless, not taking into account their dependent analyses. So there's no need to mark them as preserved, they won't be invalidated unless their analyses are. SCEVAAResults was the one exception to this, it was treated like a typical analysis result. Make it like the others and don't invalidate unless SCEV is invalidated. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D102032	2021-05-18 13:49:03 -07:00
Sanjay Patel	6fd91be354	[Reassociate] allow or->add with shl operands As discussed in: https://llvm.org/PR49055 We invert instcombine's add->or transform here because it makes it easier to identify factorization transforms like the mul in the motivating test. This extends the logic added with: https://reviews.llvm.org/rG70472f3 https://reviews.llvm.org/rG93f3d7f (I intentionally kept the formatting fix in this patch to provide more context about the calling logic.)	2021-02-07 09:45:19 -05:00
Kazu Hirata	1238378f18	[llvm] Use pop_back_val (NFC)	2021-01-23 10:56:33 -08:00
Kazu Hirata	5d2529f28f	[Scalar] Construct SmallVector with iterator ranges (NFC)	2020-12-28 19:55:18 -08:00
Roman Lebedev	7bf89c2174	[NFC][Reassociate] Delay checking isLoadCombineCandidate() until after ShouldConvertOrWithNoCommonBitsToAdd() but before haveNoCommonBitsSet() This appears to improve -O3 compile-time performance somewhat: https://llvm-compile-time-tracker.com/compare.php?from=87369c626114ae17f4c637635c119e6de0856a9a&to=c04b8271e1609b0dfb20609b40844b0c4324517e&stat=instructions It doesn't look like delaying it until after haveNoCommonBitsSet() is better: https://llvm-compile-time-tracker.com/compare.php?from=c04b8271e1609b0dfb20609b40844b0c4324517e&to=b2943d450eaf41b5f76d2dc7350f0a279f64cd99&stat=instructions	2020-11-18 23:57:12 +03:00
Roman Lebedev	34ff90ad5d	[Reassociate] Don't convert add-like-or's into add's if they appear to be part of load-combining idiom As Wei Mi is reporting in post-commit review https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20201116/853479.html teaching -reassociate about add-like-or's (70472f3) results in breaking apart load widening patterns, and reassociating them. For now, simply exclude any such `or` that appears to be a root of load widening idiom from the or->add transformation. Note that the heuristic is greedy, it doesn't ensure that loads can actually be widened into a single load.	2020-11-18 17:55:02 +03:00
Roman Lebedev	93f3d7f7b3	[Reassociate] Guard `add`-like `or` conversion into an `add` with profitability check This is slightly better compile-time wise, since we avoid potentially-costly knownbits analysis that will ultimately not allow us to actually do anything with said `add`.	2020-11-04 16:10:34 +03:00
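
The `i3` example in the `3bcccb6af6` entry above can be checked by brute force. The following is a minimal Python sketch, not part of LLVM: it models 3-bit wrapping multiplication and represents the poison result of an overflowing `mul nsw` as `None`. The helper names (`to_signed`, `mul`, `mul_nsw`, `src`, `tgt`, `counterexamples`) are hypothetical and exist only for this illustration.

```python
BITS = 3                      # width of the i3 type used in the commit message
MASK = (1 << BITS) - 1
SMIN = -(1 << (BITS - 1))     # -4
SMAX = (1 << (BITS - 1)) - 1  #  3


def to_signed(v):
    """Wrap an arbitrary integer into the signed i3 range [-4, 3]."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def mul(a, b):
    """Plain `mul`: the result wraps modulo 2**BITS."""
    return to_signed(a * b)


def mul_nsw(a, b):
    """`mul nsw`: signed overflow yields poison, modelled here as None."""
    exact = a * b
    return exact if SMIN <= exact <= SMAX else None


def src(x):
    """@src from the commit message: x**5 with nsw only on the last mul."""
    t2 = mul(x, x)
    t3 = mul(t2, x)
    t4 = mul(t3, x)
    return mul_nsw(t4, x)


def tgt(x):
    """@tgt from the commit message: x**3 after weight reduction, nsw kept."""
    t2 = mul(x, x)
    return mul_nsw(t2, x)


def counterexamples():
    """Inputs where @src is well defined but @tgt is poison."""
    return [x for x in range(SMIN, SMAX + 1)
            if src(x) is not None and tgt(x) is None]


if __name__ == "__main__":
    print(counterexamples())
```

Under this model the inputs 2 and -2 make `tgt` poison while `src` returns 0, which is why the `nsw` flag on `%5` cannot be kept once the weight of `%0` drops from 5 to 3.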

397 Commits