llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	c72d93a08a	[Attributor][NFC] Remove unnecessary overwritten methods	2022-07-21 21:57:02 -05:00
Chenbing Zheng	1a0187c9e7	[InstCombine] remove useless ‘InstCombiner::’. nfc Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130220	2022-07-22 09:24:24 +08:00
Philip Reames	bd75350180	[LV] Fix a conceptual mistake around meaning of uniform in isPredicatedInst This code confuses LV's "Uniform" and LVL/LAI's "Uniform". Despite the common name, these are different. * LV's notion means that only the first lane of each unrolled part is required. That is, lanes within a single unroll factor are considered uniform. This allows e.g. widenable memory ops to be considered uses of uniform computations. * LVL and LAI's notion refers to all lanes across all unrollings. IsUniformMem is in turn defined in terms of LAI's notion. Thus a UniformMemOp is a memory operation with a loop invariant address. This means the same address is accessed in every iteration. The tweaked piece of code was trying to match a uniform mem op (i.e. fully loop invariant address), but instead checked for LV's notion of uniformity. In theory, this meant with UF > 1, we could speculate a load which wasn't safe to execute. This ends up being mostly silent in current code as it is nearly impossible to create the case where this difference is visible. The closest I've come is the test case from 54cb87, but even then, the incorrect result is only visible in the vplan debug output; before this change we sink the unsafely speculated load back into the user's predicate blocks before emitting IR. Both before and after IR are correct so the differences aren't "interesting". The other test changes are uninteresting. They're cases where LV's uniform analysis is slightly weaker than SCEV isLoopInvariant.	2022-07-21 15:44:34 -07:00
Alexander Shaposhnikov	e9afdf838e	[GlobalOpt] Enable evaluation of atomic loads Relax the check to allow evaluation of atomic loads (but still skip volatile loads). Test plan: 1/ ninja check-llvm check-clang 2/ Bootstrapped LLVM/Clang pass tests Differential revision: https://reviews.llvm.org/D130211	2022-07-21 21:36:11 +00:00
Augie Fackler	bd6aa67e02	BuildLibCalls: move inference of freeing memory later This probably should have been part of D123089, but the effects of it don't show up until we start removing functions from the table in D130107. Oops. Differential Revision: https://reviews.llvm.org/D130184	2022-07-21 15:31:16 -04:00
Sanjay Patel	78c09f0f24	[PatternMatch][InstCombine] match a vector with constant expression element(s) as a constant expression The InstCombine test is reduced from issue #56601. Without the more liberal match for ConstantExpr, we try to rearrange constants in Negator forever. Alternatively, we could adjust the definition of m_ImmConstant to be more conservative, but that's probably a larger patch, and I don't see any downside to changing m_ConstantExpr. We never capture and modify a ConstantExpr; transforms just want to avoid it. Differential Revision: https://reviews.llvm.org/D130286	2022-07-21 15:23:57 -04:00
David Sherwood	f15b6b2907	[AArch64] Add target hook for preferPredicateOverEpilogue This patch adds the AArch64 hook for preferPredicateOverEpilogue, which currently returns true if SVE is enabled and one of the following conditions (non-exhaustive) is met: 1. The "sve-tail-folding" option is set to "all", or 2. The "sve-tail-folding" option is set to "all+noreductions" and the loop does not contain reductions, 3. The "sve-tail-folding" option is set to "all+norecurrences" and the loop has no first-order recurrences. Currently the default option is "disabled", but this will be changed in a later patch. I've added new tests to show the options behave as expected here: Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll Differential Revision: https://reviews.llvm.org/D129560	2022-07-21 17:20:06 +01:00
Nikita Popov	1f69503107	[MemoryBuiltins] Add getReallocatedOperand() function (NFC) Replace the value-accepting isReallocLikeFn() overload with a getReallocatedOperand() function, which returns which operand is the one being reallocated. Currently, this is always the first one, but once allockind(realloc) is respected, the reallocated operand will be determined by the allocptr parameter attribute.	2022-07-21 14:54:16 +02:00
Nikita Popov	46e6dd84b7	[MemoryBuiltins] Remove isFreeCall() function (NFC) Remove isFreeCall() in favor of getFreedOperand(). Replace the two remaining uses with a getFreedOperand() != nullptr check, as they only care that something is getting freed. (The usage in DSE is correct as such. The allocator-related checks in CFLGraph look rather questionable in general.)	2022-07-21 14:44:23 +02:00
Nikita Popov	5e856a8578	[InstCombine] Use getFreedOperand() (NFC) Use getFreedOperand() instead of isFreeCall() to remove the implicit assumption that any pointer operand to a free function is the operand being freed. This won't actually matter until we handle allockind(free).	2022-07-21 14:33:55 +02:00
Nikita Popov	3ac8587a2b	[Attributor] Use getFreedOperand() (NFC) Track which operand is actually freed, to avoid the implicit assumption that it is the first call argument.	2022-07-21 14:26:47 +02:00
Nikita Popov	c81dff3c30	[MemoryBuiltins] Add getFreedOperand() function (NFCI) We currently assume in a number of places that free-like functions free their first argument. This is true for all hardcoded free-like functions, but with the new attribute-based design, the freed argument is supposed to be indicated by the allocptr attribute. To make sure we handle this correctly once allockind(free) is respected, add a getFreedOperand() helper which returns the freed argument, rather than just indicating whether the call frees some argument. This migrates most but not all users of isFreeCall() to the new API. The remaining users are a bit more tricky.	2022-07-21 12:39:35 +02:00
Nikita Popov	8d58c8e57b	Reapply [InstCombine] Don't check for alloc fn before fetching alloc size Reapply the patch with getObjectSize() replaced by getAllocSize(). The former will also look through calls that return their argument, and we'll end up placing dereferenceable attributes on intrinsics like llvm.launder.invariant.group. While this isn't wrong, it also doesn't seem to be particularly useful. For now, use getAllocSize() instead, which sticks closer to the original behavior of this code. ----- This code is just interested in the allocsize, not any other allocator properties.	2022-07-21 11:48:24 +02:00
Nikita Popov	70056d04e2	Revert "[InstCombine] Don't check for alloc fn before fetching object size" This reverts commit c72c22c04df992c95c5912d0075e5263c88f9fec. This affected an Analysis test that I missed. Reverting for now.	2022-07-21 10:59:12 +02:00
Nikita Popov	c72c22c04d	[InstCombine] Don't check for alloc fn before fetching object size This code is just interested in the allocsize, not any other allocator properties.	2022-07-21 10:45:03 +02:00
Nikita Popov	f45ab43332	[MemoryBuiltins] Avoid isAllocationFn() call before checking removable alloc Allow directly checking whether a given call is a removable allocation, instead of first checking whether it is an allocation.	2022-07-21 09:39:19 +02:00
Chenbing Zheng	8c124c9088	[InstCombine] (ShiftValC >> Y) >s -1/<s 0 --> Y != 0/==0 We can do folds (ShiftValC >> Y) >s -1 --> Y != 0 and (ShiftValC >> Y) <s 0 --> Y == 0, with ShiftValC < 0. Alive2: https://alive2.llvm.org/ce/z/-PRHfD Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D129726	2022-07-21 10:12:29 +08:00
Chenbing Zheng	8075f680c8	[InstCombine] add fold (X > C - 1) ^ (X < C + 1) --> X != C Considering the correctness of this pattern, we should avoid that C - 1 is non-negative and C + 1 is negative. Alive2: https://alive2.llvm.org/ce/z/c_rBaq Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D129622	2022-07-21 10:08:21 +08:00
Johannes Doerfert	ad98ef8be4	[Attributor] Deal with complex PHI nodes better during AAPointerInfo We were quite conservative when it came to PHI node handling to avoid recursive reasoning. Now we check more directly whether we have already seen a PHI. This allows non-recursive PHI chains to be handled. This also exposed a bug as we only modeled the effect of one loop traversal. `phi_no_store_3` has been adapted to show how we would have used `undef` instead of `1` before. With this patch we don't replace it at all, which is expected as we do not argue about loop iterations (or alignments).	2022-07-20 17:34:50 -05:00
Johannes Doerfert	142897dd7d	[Attributor] Only non-exact accesses require a uniform bit-pattern (=0) If we only have exact accesses we should never require the bit-pattern to be uniform (in this case 0). Only a non-exact access should force us to require only 0 values.	2022-07-20 17:34:50 -05:00
Alexander Shaposhnikov	67f1fe8597	[GlobalOpt] Enable evaluation of atomic stores Relax the check to allow evaluation of atomic stores (but still skip volatile stores). Test plan: 1/ ninja check-llvm check-clang 2/ Bootstrapped LLVM/Clang pass tests Differential revision: https://reviews.llvm.org/D129841	2022-07-20 22:33:58 +00:00
Schrodinger ZHU Yifan	304027206c	[ThinLTO] Support aliased GlobalIFunc Fixes https://github.com/llvm/llvm-project/issues/56290: when an ifunc is aliased in LTO, clang will attempt to create an alias summary; however, as ifunc is not included in the module summary, doing so will lead to crash. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129009	2022-07-20 15:30:38 -07:00
Craig Topper	d76c8f5127	[InstCombine] Add mul with negated power of 2 constant to canEvaluateShifted. If we are right shifting a multiply by a negated power of 2 where the power of 2 is the same as the shift amount, we can replace with a negate followed by an And. New tests have not been committed yet but the patch shows the diffs. Let me know if you want any changes or additional tests. Differential Revision: https://reviews.llvm.org/D130103	2022-07-20 11:00:22 -07:00
Ruobing Han	2b98b8e8fb	fix bug for useless malloc elimination in CodeGenPrepare Putting the AllocationFn check before I->willReturn allows CodeGenPrepare to remove useless malloc instructions. Differential Revision: https://reviews.llvm.org/D130126	2022-07-20 16:29:51 +00:00
Philip Reames	523a526a02	[LV] Fix miscompile due to srem/sdiv speculation safety condition An srem or sdiv has two cases which can cause undefined behavior, not just one. The existing code did not account for this, and as a result, we miscompiled when we encountered e.g. a srem i64 %v, -1 in a conditional block. Instead of hand rolling the logic, just use the utility function which exists exactly for this purpose. Differential Revision: https://reviews.llvm.org/D130106	2022-07-20 05:35:23 -07:00
Nicolai Hähnle	1ddc51d89d	Inliner: don't mark call sites as 'nounwind' if that would be redundant When F calls G calls H, G is nounwind, and G is inlined into F, then the inlined call-site to H should be effectively nounwind so as not to lose information during inlining. If H itself is nounwind (which often happens when H is an intrinsic), we no longer mark the callsite explicitly as nounwind. Previously, there were cases where the inlined call-site of H differs from a pre-existing call-site of H in F only in the explicitly added nounwind attribute, thus preventing common subexpression elimination. v2: - just check CI->doesNotThrow v3 (resubmit after revert at 344378808778c61d5599f4e0ac783ef7e6f8ed05): - update Clang tests Differential Revision: https://reviews.llvm.org/D129860	2022-07-20 14:17:23 +02:00
Florian Hahn	5124b21648	[VPlan] Initial def-use verification. This patch introduces some initial def-use verification. This catches cases like the one fixed by D129436. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D129717	2022-07-20 11:06:32 +01:00
Fangrui Song	e931c2e870	[LegacyPM] Remove InstrOrderFileLegacyPass Following recent changes removing non-core features of the legacy PM/optimization pipeline.	2022-07-19 23:58:51 -07:00
Kazu Hirata	0387da6f4f	Use value instead of getValue (NFC)	2022-07-19 21:18:26 -07:00
Kazu Hirata	41ae78ea3a	Use has_value instead of hasValue (NFC)	2022-07-19 20:15:44 -07:00
Johannes Doerfert	f84712f0b8	[Attributor] Teach checkForAllUses to follow returns into callers If we can determine all call sites we can follow a use in a return instruction into the caller. AAPointerInfo utilizes this feature.	2022-07-19 18:17:40 -05:00
Johannes Doerfert	4f2ccdd0b1	[Attributor][NFC] Improve debug messages	2022-07-19 18:17:40 -05:00
Nick Desaulniers	1cf6b93df1	Revert "[Local] Allow creating callbr with duplicate successors" This reverts commit 08860f525a2363ccd697ebb3ff59769e37b1be21. Crashes during PPC64LE linux kernel builds as reported by @nathanchance. https://reviews.llvm.org/D129997#3663632	2022-07-19 15:03:27 -07:00
Johannes Doerfert	bf789b1957	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good even if some tests look like they regress. Fixes: https://github.com/llvm/llvm-project/issues/54981 Note: A previous version was flawed and consequently reverted in 6555558a80589d1c5a1154b92cc3af9495f8f86c.	2022-07-19 16:24:42 -05:00
Arthur Eubanks	13aa2c1c3b	[DSE] Revisit pointers that may no longer escape after removing another store In dependent-capture, previously we'd see that %tmp4 is captured due to the first store. We'd cache this info in CapturedBeforeReturn and InvisibleToCallerAfterRet. The first store is then removed, causing the cached values to be wrong. We also need to revisit everything because normally we work backwards when removing stores at the end of the function, but in this case removing an earlier store causes a later store to be removable. No compile time impact: https://llvm-compile-time-tracker.com/compare.php?from=56796ae1a8db4c85dada28676f8303a5a3609c63&to=21b7e5248ffc423cd36c9d4a020085e363451465&stat=instructions Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D123686	2022-07-19 09:30:34 -07:00
Sanjay Patel	3d6c10dcf3	[SimplifyLibCalls] avoid converting pow() to powi() with no FMF powi() is not a standard math library function; it is specified with non-strict semantics in the LangRef. We currently require 'afn' to do this transform when it needs a sqrt(), so I just extended that requirement to the whole-number exponent too. This bug was introduced with: b17754bcaa14 ...where we deferred expansion of pow() to later passes.	2022-07-19 12:26:53 -04:00
Arnold Schwaighofer	bc4870f09e	[coro async] Add missing llvm.coro.id.async intrinsic to declaresCoroCleanupIntrinsics rdar://97214593 Differential Revision: https://reviews.llvm.org/D130038	2022-07-19 07:25:04 -07:00
Andrew Turner	b850762b62	Add the FreeBSD AArch64 memory layout Use the FreeBSD AArch64 memory layout values when building for it. These are based on the x86_64 values, scaled to take into account the larger address space on AArch64. Reviewed by: vitalybuka Differential Revision: https://reviews.llvm.org/D125883	2022-07-19 09:58:07 -04:00
Andrew Turner	e13bd2644e	Add the FreeBSD AArch64 shadow offset to llvm AArch64 has a larger address space than 64-bit x86. Use the larger shadow offset on FreeBSD AArch64. Reviewed by: vitalybuka Differential Revision: https://reviews.llvm.org/D125873	2022-07-19 09:58:07 -04:00
William Schmidt	bccc9aa81c	Don't vectorize PHIs in catchswitch blocks We currently assert in vectorizeTree(TreeEntry*) when processing a PHI bundle in a block containing a catchswitch. We attempt to set the IRBuilder insertion point following the catchswitch, which is invalid. This is done so that ShuffleBuilder.finalize() knows where to insert a shuffle if one is needed. To avoid this occurring, watch out for catchswitch blocks during buildTree_rec() processing, and avoid adding PHIs in such blocks to the vectorizable tree. It is unlikely that constraining vectorization over an exception path will cause a noticeable performance loss, so this seems preferable to trying to anticipate when a shuffle will and will not be required.	2022-07-19 06:10:17 -07:00
Nikita Popov	08860f525a	[Local] Allow creating callbr with duplicate successors Since D129288, callbr is allowed to have duplicate successors. This patch removes a limitation which prevents optimizations from actually producing such callbrs. Differential Revision: https://reviews.llvm.org/D129997	2022-07-19 14:28:22 +02:00
Florian Hahn	a75760a269	[LV] Remove unnecessary cast in widenCallInstruction. (NFC)	2022-07-19 11:23:24 +01:00
Max Kazantsev	82309831c3	[LoopSimplifyCFG] Prevent use-def dominance breach by handling dead exits. PR56243 One of the transforms in LoopSimplifyCFG demands that the LCSSA form is truly maintained for all values, tokens included, otherwise it may end up creating a use that is not dominated by def (and Phi creation for tokens is impossible). Detect this situation and prevent transform for it early. Differential Revision: https://reviews.llvm.org/D129984 Reviewed By: efriedma	2022-07-19 15:54:12 +07:00
Ellis Hoag	3580daacf3	[InstrProf] Allow CSIRPGO function entry coverage The flag `-fcs-profile-generate` for enabling CSIRPGO moves the pass `pgo-instrumentation` after inlining. Function entry coverage works fine with this change, so remove the assert. I had originally left this assert in because I had not tested this at the time. Reviewed By: davidxl, MaskRay Differential Revision: https://reviews.llvm.org/D129407	2022-07-18 15:10:11 -07:00
Florian Hahn	30e53b8c03	[LV] Sink module variable and use State to set it in widenCall. (NFC) Limits the lifetime of the variable and makes it independent of CallInst.	2022-07-18 19:41:48 +01:00
Arnold Schwaighofer	28ebd13d63	[coro async] Fix code to run coro.async.end cleanup like the legacy pass did The code executed for the Switch ABI does not change. rdar://97074714 Differential Revision: https://reviews.llvm.org/D129865	2022-07-18 10:41:29 -07:00
Nicolai Hähnle	3443788087	Revert "Inliner: don't mark call sites as 'nounwind' if that would be redundant" This reverts commit 9905c379819fafdc2246bcd24dd7165bd72d7659. Looks like there are Clang changes that are affected in trivial ways. Will look into it.	2022-07-18 17:43:35 +02:00
Nicolai Hähnle	9905c37981	Inliner: don't mark call sites as 'nounwind' if that would be redundant When F calls G calls H, G is nounwind, and G is inlined into F, then the inlined call-site to H should be effectively nounwind so as not to lose information during inlining. If H itself is nounwind (which often happens when H is an intrinsic), we no longer mark the callsite explicitly as nounwind. Previously, there were cases where the inlined call-site of H differs from a pre-existing call-site of H in F only in the explicitly added nounwind attribute, thus preventing common subexpression elimination. v2: - just check CI->doesNotThrow Differential Revision: https://reviews.llvm.org/D129860	2022-07-18 17:28:52 +02:00
Sanjay Patel	26fbb79c33	[InstCombine] reduce code for signbit folds; NFC	2022-07-18 11:04:58 -04:00
Nikita Popov	21e2f133a8	[LoopSimplifyCFG] Revert accidental change This change was included in an unrelated change b57d61384c9938e3dfa54b55bf8b2a0a05e67e28 and was of course not intended for commit...	2022-07-18 15:30:13 +02:00
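
Three of the InstCombine entries above (8c124c9088, 8075f680c8, d76c8f5127) describe integer folds that can be checked exhaustively at a small bit width. The following is a minimal sketch and not code from the LLVM tree. It assumes an 8-bit type, reads `>>` as a logical shift right, and reads `>`/`<` as signed comparisons. All function names are hypothetical and exist only for this illustration.

```python
BITS = 8                      # assumed width, small enough to enumerate every input
MASK = (1 << BITS) - 1

def to_signed(v):
    """Interpret the low BITS bits of v as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def lshr_sign_fold_holds():
    """Check (C lshr Y) >s -1 <=> Y != 0 and (C lshr Y) <s 0 <=> Y == 0 for C <s 0."""
    for c in range(1 << (BITS - 1), 1 << BITS):   # bit patterns with the sign bit set
        for y in range(BITS):                     # in-range shift amounts only
            shifted = to_signed(c >> y)
            if (shifted > -1) != (y != 0) or (shifted < 0) != (y == 0):
                return False
    return True

def xor_range_fold_holds(c):
    """True if (X >s C-1) ^ (X <s C+1) equals X != C for every X."""
    lo, hi = to_signed(c - 1), to_signed(c + 1)   # C-1 and C+1 with wraparound
    return all(((x > lo) ^ (x < hi)) == (x != c)
               for x in range(-(1 << (BITS - 1)), 1 << (BITS - 1)))

def xor_range_guard(c):
    """Guard from the commit message: reject C when C-1 >= 0 and C+1 < 0."""
    return not (to_signed(c - 1) >= 0 and to_signed(c + 1) < 0)

def mul_neg_pow2_fold_holds():
    """Check (X * -(1 << S)) lshr S == (-X) & (MASK >> S) for every X and S."""
    for s in range(BITS):
        for x in range(1 << BITS):
            lhs = ((x * -(1 << s)) & MASK) >> s   # multiply, wrap, logical shift
            rhs = (-x & MASK) & (MASK >> s)       # negate, then mask the low bits
            if lhs != rhs:
                return False
    return True
```

Under these assumptions, the guard quoted in 8075f680c8 rejects exactly the two constants where `C - 1` or `C + 1` wraps (the minimum and maximum signed values), which is where the rewrite to `X != C` would otherwise be wrong. The Alive2 links in the commit messages remain the authoritative proofs.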

31103 Commits