llvm-project

Author	SHA1	Message	Date
OCHyams	4ece50737d	[Assignment Tracking][NFC] Replace LLVM command line option with a module flag Remove LLVM flag -experimental-assignment-tracking. Assignment tracking is still enabled from Clang with the command line -Xclang -fexperimental-assignment-tracking which tells Clang to ask LLVM to run the pass declare-to-assign. That pass converts conventional debug intrinsics to assignment tracking metadata. With this patch it now also sets a module flag debug-info-assignment-tracking with the value `i1 true` (using the flag conflict rule `Max` since enabling assignment tracking on IR that contains only conventional debug intrinsics should cause no issues). Update the docs and tests too. Reviewed By: CarlosAlbertoEnciso Differential Revision: https://reviews.llvm.org/D142027	2023-01-20 14:24:15 +00:00
Nikita Popov	d49b842ea2	[SROA] Use copyMetadataForLoad() helper Instead of copying just nonnull metadata, use the generic helper to copy metadata to the new load. This helper is specifically designed for the case where the load type may change, so it's safe to use in this context.	2023-01-20 15:24:10 +01:00
Nikita Popov	bf23b4031e	[ValueTracking] Take poison-generating metadata into account (PR59888) In canCreateUndefOrPoison(), take not only poison-generating flags, but also poison-generating metadata into account. The helpers are written generically, but I believe the only case that can actually matter is !range on calls -- !nonnull and !align are only valid on loads, and those can create undef/poison anyway. Unfortunately, this negatively impacts logical to bitwise and/or conversion: For ctpop/ctlz/cttz we always attach !range metadata, which will now block the transform, because it might introduce poison. It would be possible to recover this regression by supporting a ConsiderFlagsAndMetadata=false mode in impliesPoison() and clearing flags/metadata on visited instructions. Fixes https://github.com/llvm/llvm-project/issues/59888. Differential Revision: https://reviews.llvm.org/D142115	2023-01-20 12:18:32 +01:00
Sergey Kachkov	e1a702db2f	[GVN] Refactor findDominatingLoad function Improve findDominatingLoad implementation: 1. Result is saved into gvn::AvailableValue struct 2. Search is done in extended BB (while there is a single predecessor or limit is reached) Differential Revision: https://reviews.llvm.org/D141680	2023-01-20 11:54:11 +03:00
Arthur Eubanks	c5ea42bcf4	Revert "[LoopUnroll] Directly update DT instead of DTU." This reverts commit d0907ce7ed9f159562ca3f4cfd8d87e89e93febe. Causes `opt -passes=loop-unroll-full` to crash on ``` define void @foo() { bb: br label %bb1 bb1: ; preds = %bb1, %bb1, %bb switch i1 true, label %bb1 [ i1 true, label %bb2 i1 false, label %bb1 ] bb2: ; preds = %bb1 ret void } ```	2023-01-19 17:01:15 -08:00
Alexey Bataev	9bdcf8778a	[SLP]Improve isGatherShuffledEntry by looking deeper through the reused scalars. The compiler may produce better results if it does not look for constants, uses an extra analysis of phi nodes, looks through all tree nodes without skipping the cases where the very first set of nodes is empty. Also, it tries to reshuffle the nodes if it is profitable for sure, i.e. at least 2 scalars are used for single node permutation and at least 3 scalars are used for the permutation of 2 nodes. Part of D110978 Differential Revision: https://reviews.llvm.org/D141512	2023-01-19 13:46:25 -08:00
Florian Hahn	e2c43a547b	[VPlan] Add vp_depth_first_deep (NFC) Similar to vp_depth_first_shallow (D140512) add vp_depth_first_deep to make existing code clearer and more compact. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142055	2023-01-19 20:34:23 +00:00
Arthur Eubanks	1f3f3c0ea7	Revert "Reland [pgo] Avoid introducing relocations by using private alias" This reverts commit da5a8d14b8cc6cea16ee0929413c0672b47c93d9. Causes more duplicate symbol errors, see https://bugs.chromium.org/p/chromium/issues/detail?id=1408161.	2023-01-19 10:20:38 -08:00
Florian Hahn	d0907ce7ed	[LoopUnroll] Directly update DT instead of DTU. The scope of DT updates is very limited when unrolling loops: the DT should only need updating for * new blocks added * exiting blocks we simplified branches This can be done manually without too much extra work. MergeBlockIntoPredecessor also needs to be updated to support direct DT updates. This fixes excessive time spent in DTU for some cases. In an internal example, time spent in LoopUnroll with this patch goes from ~200s to 2s. It also is slightly positive for CTMark: * NewPM-O3: -0.13% * NewPM-ReleaseThinLTO: -0.11% * NewPM-ReleaseLTO-g: -0.13% Notable improvements are mafft (~ -0.50%) and lencod (~ -0.30%), with no workload regressed. https://llvm-compile-time-tracker.com/compare.php?from=78a9ee7834331fb4360457cc565fa36f5452f7e0&to=687e08d011b0dc6d3edd223612761e44225c7537&stat=instructions:u Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D141487	2023-01-19 18:11:54 +00:00
Nikita Popov	b3b049a824	[Local] Preserve noundef metadata in copyMetadataForLoad() If we're only changing the type of the load, preserve the noundef metadata.	2023-01-19 16:56:09 +01:00
Christian Ulmann	e741b8c2e5	[llvm][ir] Purge MD_prof custom accessors This commit purges direct accesses to MD_prof metadata and replaces them with the accessors provided from the utility file wherever possible. This commit can be seen as the first step towards switching the branch weights to 64 bits. See post here: https://discourse.llvm.org/t/extend-md-prof-branch-weights-metadata-from-32-to-64-bits/67492 Reviewed By: davidxl, paulkirth Differential Revision: https://reviews.llvm.org/D141393	2023-01-19 14:26:26 +01:00
Florian Hahn	655c88ca36	[VPlan] Add vp_depth_first_shallow + graph traits for wrapper(NFC) This patch adds a new VPBlockShallowTraversalWrapper struct to provide graph traits specialization that do not traverse through VPRegionBlocks. This matches the behavior of the existing traits for plain VPBlockBase and is a step before moving the graph traits for VPBlockBase to traverse through VPRegionBlocks to enable cross region support in VPDominatorTree. Depends on D140511. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D140512	2023-01-19 12:07:27 +00:00
Quentin Colombet	6b85fa6d81	[InstCombine] Don't optimize idempotent `atomicrmw <op>, 0` into `load atomic` Turning idempotent `atomicrmw`s into `load atomic` is perfectly legal with respect to how the loading happens, but it may not be legal for the whole program semantic. Indeed, this optimization removes a store that may have some effects on the legality of other optimizations. Essentially, we lose some information and depending on the backend it may or may not produce incorrect code, so don't do it! This fixes llvm.org/PR56450. Differential Revision: https://reviews.llvm.org/D141277	2023-01-19 10:04:07 +01:00
Kazu Hirata	83d56fb17a	Drop the ZeroBehavior parameter from countLeadingZeros and the like (NFC) This patch drops the ZeroBehavior parameter from bit counting functions like countLeadingZeros. ZeroBehavior specifies the behavior when the input to count{Leading,Trailing}Zeros is zero and when the input to count{Leading,Trailing}Ones is all ones. ZeroBehavior was first introduced on May 24, 2013 in commit eb91eac9fb866ab1243366d2e238b9961895612d. While that patch did not state the intention, I would guess ZeroBehavior was for performance reasons. The x86 machines around that time required a conditional branch to implement countLeadingZero<uint32_t> that returns the 32 on zero: test edi, edi je .LBB0_2 bsr eax, edi xor eax, 31 .LBB1_2: mov eax, 32 That is, we can remove the conditional branch if we don't care about the behavior on zero. IIUC, Intel's Haswell architecture, launched on June 4, 2013, introduced several bit manipulation instructions, including lzcnt and tzcnt, which eliminated the need for the conditional branch. I think it's time to retire ZeroBehavior as its utility is very limited. If you care about compilation speed, you should build LLVM with an appropriate -march= to take advantage of lzcnt and tzcnt. Even if not, modern host compilers should be able to optimize away quite a few conditional branches because the input is often known to be nonzero from dominating conditional branches. Differential Revision: https://reviews.llvm.org/D141798	2023-01-18 19:58:44 -08:00
Jonas Paulsson	dc3875e468	Add parameter extension attributes in various instrumentation passes. For the targets that have in their ABI the requirement that arguments and return values are extended to the full register bitwidth, it is important that calls when built also take care of this detail. The OMPIRBuilder, AddressSanitizer, GCOVProfiling, MemorySanitizer and ThreadSanitizer passes are with this patch hopefully now doing this properly. Reviewed By: Eli Friedman, Ulrich Weigand, Johannes Doerfert Differential Revision: https://reviews.llvm.org/D133949	2023-01-18 18:29:12 -06:00
Paul Kirth	da5a8d14b8	Reland [pgo] Avoid introducing relocations by using private alias In many cases, we can use an alias to avoid symbolic relocations, instead of using the public, interposable symbol. When the instrumented function is in a COMDAT, we can use a hidden alias, and still avoid references to discarded sections. Previous versions of this patch allowed the compiler to name the generated alias, but that would only be valid when the functions were local. Since the alias may be used across TUs we use a more deterministic naming convention, and add a ".local" suffix to the alias name just as we do for relative vtables aliases. https://reviews.llvm.org/rG20894a478da224bdd69c91a22a5175b28bc08ed9 removed an incorrect assertion on Mach-O which caused assertion failures in LLD. We addressed the link errors under ThinLTO + PGO + CFI by being more selective about which comdat functions can be given aliases. Specifically, we now do not emit an alias in the case of a comdat function with hidden visibility, since the alias would have the same linkage and visibility, giving no benefit over using the symbol directly. This also prevents LowerTypeTest from incorrectly updating the dangling alias after GlobalOpt replaces uses, and introducing a duplicate symbol. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D137982	2023-01-18 23:56:35 +00:00
Sanjay Patel	1378e7d8b8	[InstSimplify] add no-wrap parameters to simplifyMul and add more tests; NFC This gives mul the same capabilities as add/sub. A potential improvement with nsw was noted in: 1720ec6da040729f17	2023-01-18 13:29:30 -05:00
Sanjay Patel	1720ec6da0	[InstCombine] restrict no-wrap propagation for i1/i2 to avoid miscompiles This transform was added with 68c197f07eeae71b9b7, and the post-commit review noted the potential for miscompiles at narrow bitwidths. I'm not sure how to expose the i1 nuw bug because we already simplify that, but other cases show that there are missing transforms to add in follow-up patches.	2023-01-18 10:32:12 -05:00
Sanjay Patel	830ac677b7	[InstCombine] reduce code duplication in visitSub(); NFC	2023-01-18 10:17:07 -05:00
Florian Hahn	feee22db52	[VPlan] Disconnect VPRegionBlock from successors in graph iterator(NFCI) This updates the VPAllSuccessorsIterator to not connect the VPRegionBlock itself to its successors. The successors are connected to the exit block of the region. At the moment, this doesn't change any existing functionality. But the new schema ensures the following property when used for VPDominatorTree: 1. Entry & exit blocks of regions dominate the successors of the region. This allows for convenient checking of dominance between defs and uses that are not defined in the same region. I will share a follow-up patch to use it for the VPDominatorTree soon. Depends on D140500. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D140511	2023-01-18 15:02:41 +00:00
Florian Hahn	22c9f4cf2d	[VPlan] Replace VPInterleaveRecipe::classof with VP_CLASSOF_IMPL. (NFC)	2023-01-18 14:23:22 +00:00
Sanjay Patel	c2ab7e2abd	[InstCombine] simplify code for matching shift-logic-shift pattern; NFC We can match and capture in one statement. Also, make the code more closely resemble the description comment by using the constant name of an operand value.	2023-01-18 08:13:37 -05:00
Florian Hahn	f615de7e26	[VPlan] Replace VPBranchOnMaskSC::classof with VP_CLASSOF_IMPL. (NFC)	2023-01-18 12:14:58 +00:00
Matt Arsenault	e7cd42f8e4	Utils: Add utility pass to lower ifuncs Create a global constructor which will initialize a global table of function pointers. For now, this is only used as a reduction technique for llvm-reduce. In the future this may be useful to support ifunc on systems where the program loader doesn't natively support it.	2023-01-17 22:33:56 -05:00
Arthur Eubanks	c43f38ec63	Revert ""Reland "[pgo] Avoid introducing relocations by using private alias"" This reverts commit 6e5cbc097a5ac7fa95a8f425af8b03958151c763. Causes link errors, see http://go/crb/1408161.	2023-01-17 15:41:26 -08:00
Florian Hahn	cdd8fcdbd7	[VPlan] Replace VPExpandSCEVRecipe::classof with VP_CLASSOF_IMPL. (NFC)	2023-01-17 21:11:33 +00:00
Florian Hahn	bf1ba6bb52	[VPlan] Replace VPScalarIVStepsRecipe::classof with VP_CLASSOF_IMPL(NFC)	2023-01-17 20:53:14 +00:00
Sanjay Patel	68c197f07e	[InstCombine] factor difference-of-squares to reduce multiplication (X * X) - (Y * Y) --> (X + Y) * (X - Y) https://alive2.llvm.org/ce/z/BAuRCf The no-wrap propagation could be relaxed in some cases, but there does not seem to be an obvious rule for that.	2023-01-17 14:58:40 -05:00
Anshil Gandhi	2449cbabdd	[InstCombine] Handle PHI nodes in PtrReplacer This patch adds on to the functionality implemented in rG42ab5dc5a5dd6c79476104bdc921afa2a18559cf, where PHI nodes are supported in the use-def traversal algorithm to determine if an alloca is ever overwritten in addition to a memmove/memcpy. This patch implements the support needed by the PointerReplacer to collect all (indirect) users of the alloca in cases where a PHI is involved. Finally, a new PHI is defined in the replace method which takes in replaced incoming values and updates the WorkMap accordingly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D136201	2023-01-17 10:56:03 -07:00
Nikita Popov	61bb549cfd	[CVP] Avoid duplicate range calculation (NFC) Calculate the range once for all the sdiv/srem transforms.	2023-01-17 16:54:51 +01:00
Nikita Popov	004e613ce4	[CVP] Avoid duplicate range calculation (NFC) Calculate the range once and use it in processURem() and narrowUDivOrURem().	2023-01-17 16:39:27 +01:00
Nikita Popov	a444fe07dd	[CVP] Handle use-site conditions in domain-based folds As a side-effect, this switches them to use getConstantRange() rather than getPredicateAt(). getPredicateAt() is not supposed to be more powerful than getConstantRange() for non-equality comparisons (as long as block values are used).	2023-01-17 16:35:18 +01:00
Nikita Popov	5c38c6a3aa	[CVP] Handle use-site conditions in more folds	2023-01-17 16:14:55 +01:00
Florian Hahn	d47bdae28e	[VPlan] Remove duplicated VPValue IDs (NFCI). At the moment, both VPValue and VPDef have an ID used when casting via classof. This duplication is cumbersome, because it requires adding IDs for new recipes twice and also requires setting them twice. In a few cases, there's only a VPDef ID and no VPValue ID, which can cause some confusion. To simplify things, remove the VPValue IDs for different recipes. Instead, only retain the generic VPValue ID (= used VPValues without a corresponding defining recipe) and VPVRecipe for VPValues that are defined by recipes that inherit from VPValue. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D140848	2023-01-17 15:11:38 +00:00
luxufan	0ad5909958	[InstCombine] Don't combine smul of i1 type constant one Fixes: https://github.com/llvm/llvm-project/issues/59876 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D141214	2023-01-17 22:04:48 +08:00
Florian Hahn	c95138392a	[VPlan] Remove unnecessary getNumSuccessors call (NFC). If ParentWithSuccs is nullptr, the number of successors is guaranteed to be 0. Simplify the code as suggested by @Ayal in D140511.	2023-01-17 11:44:50 +00:00
Florian Hahn	133f017479	[VPlan] Remove unneeded VPUser::classof(const VPDef *) (NFC). This specialization is not needed any longer as VPRecipeBase inherits from VPUser and getDefiningRecipe returns a VPRecipeBase.	2023-01-17 09:08:33 +00:00
Sergey Kachkov	bfd2dd49ff	[GVN] Refactor handling of pointer-select in GVN pass This patch extends Def memory dependency with support of select instructions to consistently handle pointer-select conversion. Differential Revision: https://reviews.llvm.org/D141619	2023-01-17 11:32:06 +03:00
Joe Loser	a288d7f937	[llvm][ADT] Replace uses of `makeMutableArrayRef` with deduction guides Similar to how `makeArrayRef` is deprecated in favor of deduction guides, do the same for `makeMutableArrayRef`. Once all of the places in-tree are using the deduction guides for `MutableArrayRef`, we can mark `makeMutableArrayRef` as deprecated. Differential Revision: https://reviews.llvm.org/D141814	2023-01-16 14:49:37 -07:00
Ram-NK	ee7188c8b2	[LoopInterchange] Correcting the profitability check Before D135808, endless loop interchange was possible (the profitability check had no proper priority, so any profitable check could lead to loop interchange). With this patch, there is no endless interchange (priority in the profitability check is defined; the order of decision is 'Cache cost' check, 'InstrOrderCost', 'Vectorization'). Corrected the dependency checking inside isProfitableForVectorization(), corrected the checking of bad order loops in isProfitablePerInstrOrderCost(). Reviewed By: Meinersbur, bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D135808	2023-01-16 14:36:06 -05:00
Sanjay Patel	dedc58da49	[InstCombine] canonicalize a signum (spaceship) that ends in add (A s>> (BW - 1)) + (zext (A s> 0)) --> (A s>> (BW - 1)) \| (zext (A != 0)) https://alive2.llvm.org/ce/z/V-nM8N This is not the form that we currently match as m_Signum(), but I'm not sure if one is better than the other, so there's a follow-up patch needed either way. For this patch, it should be better for analysis to use a not-null test and bitwise logic rather than >0 with add. Codegen doesn't seem significantly different on any targets that I looked at. Also note that none of these variants is shown in issue #60012 - those generally include at least one 'select', so that's likely where these patterns will end up.	2023-01-16 12:47:21 -05:00
Guillaume Chatelet	135f23d67b	Deprecate MemIntrinsicBase::getDestAlignment() and MemTransferBase::getSourceAlignment() Differential Revision: https://reviews.llvm.org/D141840	2023-01-16 14:22:03 +00:00
Florian Hahn	a6549718d9	[LoopUnroll] Don't update DT for changeToUnreachable. There is no need to update the DT here, because there must be a unique latch. Hence if the latch is not exiting it must directly branch back to the original loop header and does not dominate any nodes. Skipping a DT update here simplifies D141487. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D141810	2023-01-16 12:25:34 +00:00
Sergey Kachkov	868abc471d	Revert "[GVN] Refactor handling of pointer-select in GVN pass" This reverts commit fc7cdaa373308ce3d72218b4d80101ae19850a6c.	2023-01-16 15:13:17 +03:00
Max Kazantsev	82cee24e3d	[JumpThreading] Preserve profile metadata during select unfolding, take 2 Jump threading can replace select and unconditional branch with conditional branch, but when doing so loses profile information. This destructive transform can eventually lead to a performance degradation due to folding of branches in shouldFoldCondBranchesToCommonDestination as branch probabilities are no longer known. The first version was reverted due to assert caused by i32 overflow, fixed in this version. Patch by Roman Paukner! Differential Revision: https://reviews.llvm.org/D138132 Reviewed By: mkazantsev	2023-01-16 19:04:23 +07:00
Sergey Kachkov	fc7cdaa373	[GVN] Refactor handling of pointer-select in GVN pass This patch introduces new type of memory dependency - Select to consistently handle it like Def/Clobber dependency. Differential Revision: https://reviews.llvm.org/D141619	2023-01-16 14:12:28 +03:00
Florian Hahn	56ffd39c3d	[VPlan] Use VPDef prefix for VPDef IDs instead of VPRecipeBase (NFC). Various places in the code were still using the VPRecipeBase:: prefix for VPDef IDs or no prefix at all. Now that the VPDef IDs have been moved to VPDef, use this prefix instead and consistently use it.	2023-01-16 10:23:52 +00:00
Craig Topper	8e317e693a	[InstCombine] Remove dead code from foldICmpShlOne. NFC This code handles (icmp eq/ne (1 << Y), C) if C is a power of 2. This case is also handled by the more general foldICmpShlConstConst which is called before we reach foldICmpShlOne.	2023-01-15 19:10:17 -08:00
Benjamin Kramer	db6961db7a	[FunctionComparator] Clamp StringRef compare output to [-1,1] The comparison can have different values (but same sign) on big endian platforms, avoid that to make the unit test green there.	2023-01-16 01:44:55 +01:00
Craig Topper	77f2f34d69	[InstCombine] Generalize (icmp sgt (1 << Y), -1) -> (icmp ne Y, BitWidth-1) to any negative constant. Similar for the sle version which will be canonicalized to slt first. Alive2 proof as implemented: https://alive2.llvm.org/ce/z/_YawdM @spatel's original Alive2: https://alive2.llvm.org/ce/z/3YB2vs Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D141773	2023-01-15 13:36:57 -08:00
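
Three of the InstCombine folds listed above state their rewrite rule directly in the commit message: the icmp/shl generalization (77f2f34d69), the signum canonicalization (dedc58da49), and the difference-of-squares factoring (68c197f07e). The sketch below is only a brute-force sanity check of those three rules at an assumed bit width of 8. It is not LLVM code, the helper names are hypothetical, and it models plain wrapping arithmetic only, so it says nothing about nuw/nsw propagation or poison. The Alive2 links in the commit messages remain the authoritative proofs.

```python
# Brute-force check of three rewrite rules quoted in the log, at 8 bits.
# BW and every helper name here are assumptions made for illustration.
BW = 8
MASK = (1 << BW) - 1


def to_signed(v):
    """Interpret the low BW bits of v as a two's-complement signed value."""
    v &= MASK
    return v - (1 << BW) if v & (1 << (BW - 1)) else v


def check_difference_of_squares():
    """(X * X) - (Y * Y) == (X + Y) * (X - Y) under wrapping arithmetic."""
    for x in range(1 << BW):
        for y in range(1 << BW):
            lhs = (x * x - y * y) & MASK
            rhs = ((x + y) * (x - y)) & MASK
            if lhs != rhs:
                return False
    return True


def check_signum_canonicalization():
    """(A s>> (BW-1)) + zext(A s> 0) == (A s>> (BW-1)) | zext(A != 0)."""
    for bits in range(1 << BW):
        a = to_signed(bits)
        # Python's >> on a negative int is an arithmetic shift.
        sign = (a >> (BW - 1)) & MASK
        lhs = (sign + (1 if a > 0 else 0)) & MASK
        rhs = sign | (1 if a != 0 else 0)
        if lhs != rhs:
            return False
    return True


def check_shl_one_sgt_negative():
    """(1 << Y) s> C  ==  Y != BW-1, for every negative constant C."""
    for c in range(-(1 << (BW - 1)), 0):
        # Shift amounts >= BW are skipped: the IR result would be poison.
        for y in range(BW):
            shifted = to_signed(1 << y)
            if (shifted > c) != (y != BW - 1):
                return False
    return True


if __name__ == "__main__":
    print("difference of squares:", check_difference_of_squares())
    print("signum canonicalization:", check_signum_canonicalization())
    print("shl-one sgt negative:", check_shl_one_sgt_negative())
```

Exhaustive enumeration is practical here only because the width is small; changing BW to 16 still finishes for the last two checks but makes the two-variable check noticeably slower.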

32643 Commits