llvm-project

Author	SHA1	Message	Date
annamthomas	54a9f0007c	[SCEV] Fix BinomialCoefficient Iteration to fit in W bits (#88010 ) BinomialCoefficient computes the value of W-bit IV at iteration It of a loop. When W is 1, we can call multiplicative inverse on 0 which triggers an assert since 1b76120. Since the arithmetic is supposed to wrap if It or K does not fit in W bits, do the truncation into W bits after we do the shift. Fixes #87798	2024-04-10 09:02:23 -04:00
Philip Reames	1a37147af5	[SCEV] Match both (-1)b + a and a + (-1)b as a - b (#84247 ) In our analysis of guarding conditions, we were converting a-b == 0 into a == b alternate form, but we were only checking for one of the two forms for the sub. There's no requirement that the multiply only be on the LHS of the add.	2024-03-06 15:57:34 -08:00
Philip Reames	5cd45e442e	[SCEV] Precommit test for widened signed induction variables These tests highlight that we have missed opportunities for proving trip count bounds when our start/end values are sign extended from smaller types and we have either a loop guard to relate our start vs end, or a nsw/nuw fact to bound end.	2024-03-06 14:09:40 -08:00
Philip Reames	0d38f21e4a	[SCEV] Extend type hint in analysis output to all backedge kinds This extends the work from 7755c26 to all of the different backedge taken count kinds that we print for the scev analysis printer. As before, the goal is to cut down on confusion as i4 -1 is a very different (unsigned) value from i32 -1.	2024-03-06 13:08:05 -08:00
Philip Reames	e946b5a87b	[SCEV] Autogenerate more scev analysis check tests	2024-03-06 12:42:19 -08:00
Philip Reames	8b5b294ec2	[SCEV] Print predicate backedge count only if new information available When printing the result of SCEV's analysis, we can avoid printing the predicated backedge taken count and the predicates if the predicates are empty and no new information is provided. This helps to reduce the verbosity of the output.	2024-03-06 10:24:32 -08:00
Philip Reames	7755c26195	[SCEV] Include type when printing constant max backedge taken count When printing the result of the analysis, i8 -1 and i64 -1 are quite different in terms of analysis quality. In a recent conversation with a new contributor, we ran into exactly this confusion. Adding the type for constant scevs more globally seems worthwhile, but introduces a much larger test diff. I'm splitting this off first since it addresses the immediate need, and then going to do some further changes to clarify a few related bits of analysis result output.	2024-03-06 08:48:25 -08:00
Philip Reames	987fe6fa50	[SCEV] Migrate a couple tests to be auto generated A few notes: * pr34538.ll has bitrotten. The original test printed the analysis after transforms in some cases, but this appears to have been lost during migration to new pass manager. Remove the now redundant pass invocations and simplify the test setup.	2024-03-05 18:04:30 -08:00
Philip Reames	31c304ba7b	[SCEV] Migrate some tests to be autogenerated In advance of a change which needs to update these. This batch was the "easy" ones, I'll be landing the harder set a few at a time for easier review.	2024-03-05 17:41:58 -08:00
Yingwei Zheng	3b70387c54	[ValueTracking] Handle more integer intrinsics in `propagatesPoison` (#82749 ) This patch extends `propagatesPoison` to handle more integer intrinsics. It will turn more logical ands/ors into bitwise ands/ors. See also https://reviews.llvm.org/D99671.	2024-02-23 20:57:56 +08:00
Florian Hahn	c66cedb3a7	[SCEV] Add SCEV analysis tests with congruent IVs. This patch adds a set of tests taken from /llvm/test/Transforms/IndVarSimplify/iv-poison.ll with multiple congruent IVs but different sets of flags on the increments. Extra tests for https://github.com/llvm/llvm-project/pull/80430.	2024-02-02 13:05:11 +00:00
Yingwei Zheng	2c2de4b20e	[ValueTracking] Remove SPF support from `computeKnownBitsFromOperator` (#76630 ) This patch removes redundant SPF support (`5350e1b509`) from `computeKnownBitsFromOperator` as we always canonicalize a SPF into an intrinsic call. Compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=3dc0638cfc19e140daff7bf1281648daca8212fa&to=8771ef0749fb2ba4304dc68d418c88ec5769346f&stat=instructions:u (stage1-O3: -0.01%, stage1-ReleaseThinLTO: -0.01%, stage1-ReleaseLTO-g: +0.01%, stage1-O0-g: +0.00%, stage2-O3: +0.01%, stage2-O0-g: +0.04%, stage2-clang: -0.01%)	2023-12-31 04:38:18 +08:00
Simon Pilgrim	3736e1d1cd	[SCEV] Ensure shift amount is in range before calling getZExtValue() Fixes #76234	2023-12-22 14:16:54 +00:00
Nikita Popov	90d82412ea	[SCEV] Use loop guards when checking that RHS >= Start (#75039 ) Loop guards tend to provide better results when it comes to reasoning about ranges than isLoopEntryGuardedByCond(). See the test change for the motivating case. I have retained both the loop guard check and the implied cond based check for now, though the latter only seems to impact a single test and only via side effects (nowrap flag calculation) at that.	2023-12-12 09:41:54 +01:00
Nikita Popov	dbee36c523	[SCEV] Add test for unnecessary umax in BECount (NFC)	2023-12-11 12:12:34 +01:00
Nikita Popov	ff0e4fb89a	[SCEV] Use or disjoint flag (#74467 ) Use the disjoint flag to convert or to add instead of calling the haveNoCommonBitsSet() ValueTracking query. This ensures that we can reliably undo add -> or canonicalization, even in cases where the necessary information has been lost or is too complex to reinfer in SCEV. I have updated the bulk of the test coverage to add the necessary disjoint flags in advance.	2023-12-05 17:01:46 +01:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
Nikita Popov	88f7dc17eb	[SCEV] Regenerate test checks (NFC) There have been some minor but pervasive changes to the generated CHECK lines, so regenerate all of them, to minimize future diffs.	2023-11-24 15:49:28 +01:00
Nikita Popov	a3eeef82da	[FileCheck] Avoid capturing group for {{regex}} (#72136 ) For `{{regex}}` we don't really need a capturing group, and only add it to properly handle cases like `{{foo\|bar}}`. This is problematic, because the use of capturing groups makes our regex implementation slower (we have to go through the "dissect" stage, which can have quadratic complexity). Unfortunately, our regex implementation does not support non-capturing groups like `(?:regex)`. So instead, avoid adding the group entirely if the regex doesn't contain any alternations. This causes a slight difference in escaping behavior, where previously it was possible to write `{{{{}}` and get the same behavior as `{{\{\{}}`. This will no longer work. I don't think this is a problem, especially as we recently taught update_analyze_test_checks.py to emit `{{\{\{}}`, so this shouldn't get introduced in any new tests. For CodeGen/X86/vector-interleaved-store-i16-stride-7.ll (our slowest X86 test) this drops FileCheck time from 6s to 5s (the remainder is spent in a different regex issue). I expect similar speedups in other tests using a lot of `{{}}`.	2023-11-14 09:03:54 +01:00
Björn Pettersson	8fc0aca5d1	[SCEV] Support larger than 64-bit types in ashr(add(shl(x, n), c), m) (#71600 ) In commit 5a9a02f67b771fb2edcf06 scalar evolution got support for computing SCEV:s for (ashr(add(shl(x, n), c), m)) constructs. The code however used APInt::getZExtValue without first checking that the APInt would fit inside a uint64_t. When for example using 128-bit types we ended up in assertion failures (or maybe miscompiles in non-assert builds). This patch simply avoids converting from APInt to uint64_t when creating the truncated constant. We can just truncate the APInt instead.	2023-11-08 11:29:12 +01:00
Philip Reames	a7f35d54ee	[SCEV] Extend isImpliedCondOperandsViaRanges to independent predicates (#71110 ) As far as I can tell, there's nothing in this code which actually assumes the two predicates in (FoundLHS FoundPred FoundRHS) => (LHS Pred RHS) are the same. Noticed while investigating something else, this is purely an opportunistic optimization while I'm looking at the code. Unfortunately, this doesn't solve my original problem. :)	2023-11-07 07:25:47 -08:00
Philip Reames	5adf6ab7ff	Revert "[IndVars] Generate zext nneg when locally obvious" This reverts commit a6c8e27b3a052913a15a13ee0d4ac466c5ab3f92. It appears likely to have caused https://lab.llvm.org/buildbot/#/builders/57/builds/30988.	2023-11-03 11:19:14 -07:00
Philip Reames	a6c8e27b3a	[IndVars] Generate zext nneg when locally obvious zext nneg was recently added to the IR in #67982. This patch teaches SimplifyIndVars to prefer zext nneg over both sext and plain zext, when a local SCEV query indicates the source is non-negative. The choice to prefer zext nneg over sext looks slightly aggressive here, but probably isn't so much in practice. For cases where we'd "remember" the range fact, instcombine would convert the sext into a zext nneg anyways. The only cases where this produces a different result overall are when SCEV knows a non-local fact, and it doesn't get materialized into the IR. Those are exactly the cases where using zext nneg is most useful. We do run the risk of e.g. a missing combine - since we haven't updated most of them yet - but that seems like a manageable risk. Note that there are much deeper algorithmic changes we could make to this code to exploit zext nneg, but this seemed like a reasonable and low risk starting point.	2023-11-03 09:20:59 -07:00
Philip Reames	015c06ade0	Regenerate a couple scev/indvars tests [nfc] Update to modern output to reduce spurious deltas in upcoming change.	2023-11-03 08:42:59 -07:00
Nikita Popov	a8ac6a9868	[SCEV] Remove newline after predicates in dump update_analyze_test_checks.py will now insert check lines for empty lines, which means that all the existing test coverage will have a spurious change to check for the newline after "Predicates:". I don't think we actually want to have that newline, so drop it before it gets into more test coverage.	2023-11-03 15:43:30 +01:00
Nikita Popov	e4a4122eb6	[IR] Remove zext and sext constant expressions (#71040 ) Remove support for zext and sext constant expressions. All places creating them have been removed beforehand, so this just removes the APIs and uses of these constant expressions in tests. There is some additional cleanup that can be done on top of this, e.g. we can remove the ZExtInst vs ZExtOperator footgun. This is part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.	2023-11-03 10:46:07 +01:00
Nikita Popov	4f131b0d22	[IR] Require index width to be ule pointer width (#70015 ) I don't think there is a use case for having an index type that is wider than the pointer type, and I'm not entirely clear what semantics this would even have. Also clarify the GEP semantics to explicitly say how they interact with the index type width.	2023-10-26 10:19:06 +02:00
Nikita Popov	efe4e7a026	[SCEV] Fix incorrect nsw inference for multiply of addrec (#66500 ) SCEV currently preserves the nsw flag when performing an nsw multiply of an nsw addrec. While this is legal for nuw, this is not generally the case for nsw. This is because nsw mul does not distribute over nsw add: https://alive2.llvm.org/ce/z/mergCt Instead, we need either both nuw and nsw to be set (https://alive2.llvm.org/ce/z/7wpgGc) or explicitly prove that the distributed multiplications are also nsw (https://alive2.llvm.org/ce/z/wef9su). Fixes https://github.com/llvm/llvm-project/issues/66066.	2023-09-18 08:23:10 +02:00
Nikita Popov	0e67a68478	[SCEV] Add tests for PR66066 (NFC)	2023-09-15 13:53:11 +02:00
Tyler Lanphear	52240399f9	[AssumptionCache] Track GlobalValues as affected values. (#65425 ) Fixes a corner case of the analysis: previously GlobalValues could be affected by assumptions, but were not tracked within AffectedValues. This patch allows assumptions which affect a given GlobalValue to be looked up via `assumptionsFor()`. A small update to llvm/test/Analysis/ScalarEvolution/ranges.ll was necessary due to knowledge about a global value now being propagated from AssumptionCache -> ValueTracking -> ScalarEvolution.	2023-09-06 15:46:14 -07:00
Tejas Joshi	0609b65aaf	[SCEV] Fix potentially empty set for unsigned ranges The following commit enabled the analysis of ranges for heap allocations: 22ca38da25e19a7c5fcfeb3f22159aba92ec381e The range turns out to be empty in cases such as the one in test (which is [1,1)), leading to an assertion failure. This patch fixes that case. Fixes https://github.com/llvm/llvm-project/issues/63856 Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D159160	2023-09-04 10:46:53 +01:00
Vedant Paranjape	5a9a02f67b	[SCEV] Compute SCEV for ashr(add(shl(x, n), c), m) instr triplet %x = shl i64 %w, n %y = add i64 %x, c %z = ashr i64 %y, m The above given instruction triplet is seen many times in the generated LLVM IR, but the SCEV model is not able to compute the SCEV value of AShr instruction in this case. This patch models the two cases of the above instruction pattern using the following expression: => sext(add(mul(trunc(w), 2^(n-m)), c >> m)) 1) when n = m the expression reduces to sext(add(trunc(w), c >> n)) as n-m=0, and multiplying with 2^0 gives the same result. 2) when n > m the expression works as given above. It also adds several unit tests to verify that SCEV is able to compute the value. $ opt sext-add-inreg.ll -passes="print<scalar-evolution>" Comparing the snippets of the result of SCEV analysis: * SCEV of ashr before change ---------------------------- %idxprom = ashr exact i64 %sext, 32 --> %idxprom U: [-2147483648,2147483648) S: [-2147483648,2147483648) Exits: 8 LoopDispositions: { %for.body: Variant } * SCEV of ashr after change --------------------------- %idxprom = ashr exact i64 %sext, 32 --> {0,+,1}<nuw><nsw><%for.body> U: [0,9) S: [0,9) Exits: 8 LoopDispositions: { %for.body: Computable } LoopDisposition of the given SCEV was LoopVariant before, after adding the new way to model the instruction, the LoopDisposition becomes LoopComputable as it is able to compute the SCEV of the instruction. Differential Revision: https://reviews.llvm.org/D152278	2023-08-25 05:42:08 +00:00
Nikita Popov	b9808e5660	[LoopUnroll] Fold add chains during unrolling Loop unrolling tends to produce chains of `%x1 = add %x0, 1; %x2 = add %x1, 1; ...` with one add per unrolled iteration. This patch simplifies these adds to `%xN = add %x0, N` directly during unrolling, rather than waiting for InstCombine to do so. The motivation for this is that having a single add (rather than an add chain) on the induction variable makes it a simple recurrence, which we specially recognize in a number of places. This allows InstCombine to directly perform folds with that knowledge, instead of first folding the add chains, and then doing other folds in another InstCombine iteration. Due to the reduced number of InstCombine iterations, this also results in a small compile-time improvement. Differential Revision: https://reviews.llvm.org/D153540	2023-07-05 09:54:28 +02:00
Arthur Eubanks	22ca38da25	[ScalarEvolution] Analyze ranges for heap allocations Followup to D153624. Allows for better exit count calculations for loops checking heap allocations against null. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D154001	2023-06-29 09:35:20 -07:00
Arthur Eubanks	110dfb4c58	[test] Precommit SCEV test	2023-06-29 09:24:37 -07:00
Nikita Popov	3cd4571405	[SCEV] Make use of non-null pointers for range calculation We know that certain pointers (e.g. non-extern-weak globals or allocas in default address space) are not null, in which case the lowest address they can be allocated at is their alignment. This allows us to calculate better exit counts for loops that have an additional null check in the guarding condition (see alloca_icmp_null_exit_count). Differential Revision: https://reviews.llvm.org/D153624	2023-06-29 09:09:17 +02:00
Nikita Popov	407ff50eca	[SCEV] Add test for alloca ranges (NFC)	2023-06-23 14:08:39 +02:00
Nikita Popov	406e9c9372	[SCEV] Use object size for allocas as well The object size and alignment based restriction on the possible allocation range also applies to allocas, not just globals, so handle them as well. We shouldn't really need any type restriction here at all, but for now stay conservative.	2023-06-23 12:38:12 +02:00
Dmitry Makogon	ce1ac1cf18	[SCEV] Don't store AddRec loop when simplifying multiplication of AddRecs When multiplying several AddRecs, we do the following simplification: {A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L> = {x=1 in [ sum y=x..2x [ sum z=max(y-x, y-n)..min(x,n) [ choose(x, 2x)*choose(2x-y, x-z)*A_{y-z}*B_z]] ],+,...up to x=2n} This is done iteratively, pair by pair. So if we try to multiply three AddRecs A1, A2, A3, then we'd try to simplify A1 * A2 to A1' and then try to simplify A1' * A3 if A1' is also an AddRec. The transform is only legal if the loops of the two AddRecs are the same. It is checked in the code, but the loop of one of the AddRecs is stored in a local variable and doesn't get updated when we simplify a pair to a new AddRec. In the motivating test the new AddRec A1' was created for a different loop and, as the loop variable didn't get updated, the check for different loops passed and the transform worked for two AddRecs from different loops. So it created a wrong SCEV. And it caused LSR to replace an instruction with another one that had the same SCEV as the incorrectly computed one. Differential Revision: https://reviews.llvm.org/D153254	2023-06-22 15:49:15 +07:00
Dmitry Makogon	6826d3c513	[Test] Add test for PR62430 showing bug in SCEV mul expression creation (NFC)	2023-06-19 15:50:23 +07:00
Florian Hahn	2ba78229e4	[SCEV] Try smaller ZExts when using loop guard info. If we didn't find the exact ZExt expr in the rewrite map, check if there's an entry for a smaller ZExt we can use instead. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149786	2023-06-09 20:05:50 +01:00
Nikita Popov	c1aa0dce48	[SCEV] Remove -verify-scev-maps flag This is now checked as part of the usual SCEV verification. There is little value in checking this on each lookup. These two maps are strictly synchronized nowadays, which was not the case historically.	2023-06-09 11:51:53 +02:00
Nikita Popov	dfb369399d	[ValueTracking] Directly use KnownBits shift functions Make ValueTracking directly call the KnownBits shift helpers, which provides more precise results. Unfortunately, ValueTracking has a special case where sometimes we determine non-zero shift amounts using isKnownNonZero(). I have my doubts about the usefulness of that special-case (it is only tested in a single unit test), but I've reproduced the special-case via an extra parameter to the KnownBits methods. Differential Revision: https://reviews.llvm.org/D151816	2023-06-01 09:46:16 +02:00
Joshua Cao	6ed152aff4	[SCEV] Compute AddRec range computations using different type BECount Before this patch, we can only use the MaxBECount for an AddRec's range computation if the MaxBECount has <= bit width of the AddRec. This patch reasons that if a MaxBECount has > bit width, and is <= the max value of AddRec's bit width, we can still use the MaxBECount. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151698	2023-05-31 21:05:17 -07:00
Nikita Popov	0c23dc20bc	Reapply [SCEV] Replace IsAvailableOnEntry with block disposition This exposed an issue in SCEVExpander/LCSSA, which has been fixed in D150681. ----- As far as I understand, the IsAvailableOnEntry() function basically implements the same functionality as the properlyDominates() block disposition. The primary difference (apart from a weaker implementation) seems to be in this comment at the top: // Checks if the SCEV S is available at BB. S is considered available at BB // if S can be materialized at BB without introducing a fault. However, I don't really understand why there would be such a requirement. It's my understanding that SCEV explicitly does not care about trapping udiv instructions itself, and it's the job of SCEVExpander's isSafeToExpand() to make sure these don't get expanded if they may trap. Differential Revision: https://reviews.llvm.org/D149344	2023-05-25 10:02:18 +02:00
Nikita Popov	f7d1baa414	[KnownBits] Return zero instead of unknown for always poison shifts For always poison shifts, any KnownBits return value is valid. Currently we return unknown, but returning zero is generally more profitable. We had some code in ValueTracking that tried to do this, but was actually dead code. Differential Revision: https://reviews.llvm.org/D150648	2023-05-23 14:41:22 +02:00
Dmitry Makogon	2bb3515152	[SCEV] Replace NumTripCountsComputed stat with NumExitCountsComputed This fixes assertion crash in https://github.com/llvm/llvm-project/issues/62380. In the beginning of ScalarEvolution::getBackedgeTakenInfo we make sure that BackedgeTakenCounts contains an entry for the given loop. Then we call computeBackedgeTakenCount which computes the result, and in the end we insert it in the map like so: return BackedgeTakenCounts.find(L)->second = std::move(Result); So we expect that the entry for L still exists in the cache. However, it can get deleted. When it has computed the result, getBackedgeTakenInfo clears all the cached SCEVs that use the AddRecs in the loop. In the crashing example, getBackedgeTakenInfo first gets called on an inner loop, and during this call it gets called again on its parent loop. This recursion happens after the call to computeBackedgeTakenCount. And it happens so that some SCEV from the BTI of the child loop uses an AddRec of the parent loop. So when we successfully compute BTI for the parent loop, we erase already computed result for the child one. The recursion happens in some debug only code that updates statistics. The algorithm itself is non-recursive. Namely the recursive call happens in BackedgeTakenInfo::getExact function and its return value is only used to compare it against SCEVCouldNotCompute. As suggested by nikic I replaced the NumTripCountsComputed and NumTripCountsNotComputed with NumExitCountsComputed and NumExitCountsNotComputed respectively. They are updated during computations made for single exits. It relieves us of the need to compute exact exit count for the loop just to update the named statistic and thus the recursion cannot happen anymore. Differential Revision: https://reviews.llvm.org/D149251	2023-05-22 20:10:51 +07:00
Nikita Popov	b38bd86077	[SCEV] Regenerate test checks (NFC)	2023-05-16 11:33:21 +02:00
Manoj Gupta	9fb9c7776e	Revert "[SCEV] Replace IsAvailableOnEntry with block disposition" This reverts commit 103fc0f629aa6218783f65dff0197f257137cade. Causes a clang crash in ChromeOS builds. Testcase provided at D149344.	2023-05-10 09:57:48 -07:00
Joshua Cao	9c1d5e4ae3	[SCEV][reland] More precise trip multiples We currently have getMinTrailingZeros(), from which we can get a SCEV's multiple by computing 1 << MinTrailingZeroes. However, this only gets us multiples that are a power of 2. This patch introduces a way to get max constant multiples that are not just a power of 2. The logic is similar to that of getMinTrailingZeros. getMinTrailingZerosImpl is replaced by computing the max constant multiple, and counting the number of trailing bits. I have so far found this useful in two places: 1) Computing unsigned constant ranges. For example, if we have i8 {10,+,10}<nuw>, we know the max constant it can be is 250. 2) My original intent was to use this in getSmallConstantTripMultiples, but it has no effect right now due to change from D110587. For example, if we have backedge count `(6 * %N) - 1`, the trip count becomes `1 + zext((6 * %N) - 1)`, and we cannot say that 6 is a multiple of the SCEV. I plan to look further into this separately. The implementation assumes the value is unsigned. It can probably be extended to handle signed values as well. If the code sees that a SCEV does not have <nuw>, it will fall back to finding the max multiple that is a power of 2. Multiples that are a power of 2 will still be a multiple even after the SCEV overflows. This does not apply to other values. This is the 1st commit message: --- This relands https://reviews.llvm.org/D141823. The verification fails when expensive checks are turned on. This can occur when: 1. SCEV S's multiple is cached 2. SCEV S's no wrap flags are strengthened, and the multiple changes 3. SCEV verifier finds that S's cached and recomputed multiple are different We eliminate most cases by forgetting SCEVAddRecExpr's cached values when the flags are modified, but there are still cases for other SCEV types. We relax the check by making sure the cached multiple divides the recomputed multiple, ensuring the cached multiple is a correct, conservative multiple. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D149529	2023-05-07 22:01:04 -07:00
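
The arithmetic behind the unsigned-range example in the last entry (i8 {10,+,10}<nuw> having a maximum of 250) can be sketched in a few lines. This is an illustrative Python sketch, not LLVM's C++ implementation; the function names `max_value_with_multiple` and `pow2_part` are hypothetical and do not exist in LLVM.

```python
def max_value_with_multiple(bit_width: int, multiple: int) -> int:
    """Largest unsigned bit_width-bit value divisible by `multiple`.

    Hypothetical helper for illustration only; assumes the value never
    wraps (the <nuw> case described in the commit message).
    """
    if bit_width <= 0 or multiple <= 0:
        raise ValueError("bit_width and multiple must be positive")
    unsigned_max = (1 << bit_width) - 1
    # Round the unsigned maximum down to the nearest multiple.
    return unsigned_max - unsigned_max % multiple


def pow2_part(multiple: int) -> int:
    """Largest power of 2 dividing `multiple` (the no-<nuw> fallback)."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Isolate the lowest set bit.
    return multiple & -multiple


if __name__ == "__main__":
    # i8 value known to be a multiple of 10 and known not to wrap.
    print(max_value_with_multiple(8, 10))
    # Without <nuw>, 250 + 10 wraps modulo 2^8; only the power-of-2
    # part of the multiple is still guaranteed to divide the result.
    wrapped = (250 + 10) % (1 << 8)
    print(wrapped, wrapped % 10 == 0, wrapped % pow2_part(10) == 0)
```

The second half shows why the commit falls back to power-of-2 multiples when <nuw> is absent: wrapping modulo 2^8 preserves divisibility by 2 but not by 10.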

888 Commits