llvm-project

Author	SHA1	Message	Date
Nikita Popov	7c229f6e85	[GVN] Invalidate MDA when deduplicating phi nodes Duplicate phi nodes were being directly removed, without invalidating MDA. This could result in a new phi node being allocated at the same address, incorrectly reusing a cache entry. Fix this by optionally allowing EliminateDuplicatePHINodes() to collect phi nodes to remove into a vector, which allows GVN to handle removal itself. Fixes https://github.com/llvm/llvm-project/issues/64598. Differential Revision: https://reviews.llvm.org/D158849	2023-09-15 07:04:32 +02:00
Justin Bogner	71e3642619	[Transforms][DXIL] Wire up a basic DXILUpgrade pass (#66275) This pass will upgrade DXIL-style llvm constructs (which are mostly metadata) into the representations we use in LLVM for the same concepts. For now we just strip the valver metadata, which we don't need. Later changes will make this pass more useful, and then we should be able to wire it into clang and possibly the DirectX backend's AsmParser.	2023-09-14 11:02:31 -07:00
Kohei Asano	fef8249220	[SimplifyCFG] handle monotonic wrapped case for D150943 (#65882)	2023-09-14 21:26:11 +09:00
Matthias Braun	b30c9c9378	LoopUnrollRuntime: Add weights to all branches Make sure every conditional branch constructed by `LoopUnrollRuntime` code sets branch weights. - Add new 1:127 weights for the conditional jumps checking whether the whole (unrolled) loop should be skipped in the generated prolog or epilog code. - Remove `updateLatchBranchWeightsForRemainderLoop` function and just add weights immediately when constructing the relevant branches. This leads to simpler code and makes the code more obvious as every call to `CreateCondBr` now has a `BranchWeights` parameter. - Rework formula for epilogue latch weights, to assume equal distribution of remainders and remove `assert` (as I was able to reach this code when forcing small unroll factors on the commandline). Differential Revision: https://reviews.llvm.org/D158642	2023-09-11 14:23:29 -07:00
Jeremy Morse	e54277fa10	[NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder This patch adds a two-argument SetInsertPoint method to IRBuilder that takes a block/iterator instead of an instruction, and updates many call sites to use it. The motivating reason for doing this is given here [0], we'd like to pass around more information about the position of debug-info in the iterator object. That necessitates passing iterators around most of the time. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152468	2023-09-11 20:01:19 +01:00
Matthias Braun	5d7f84ee17	LoopRotate: Add code to update branch weights This adds code to the loop rotation transformation to ensure that the computed block execution counts for the loop bodies are the same before and after the transformation. This isn't always true in practice, but I believe this is because of numeric inaccuracies in the BlockFrequency computation. The invariants this is modeled on and the heuristic choice of 0-trip loop amount are explained in a lengthy comment in the new `updateBranchWeights()` function. Differential Revision: https://reviews.llvm.org/D157462	2023-09-11 10:38:06 -07:00
Jeremy Morse	1d82c765ef	[NFC][RemoveDIs] Provide an iterator-taking split-block method As per the stack of patches this is attached to, allow users of BasicBlock::splitBasicBlock to provide an iterator for a position, instead of just an instruction pointer. This is to fit with my proposal for how to get rid of debug intrinsics [0]. There are other call-sites that would need to change, but this is sufficient for a stage2clang self host and some other C++ projects to build identical binaries, in the context of the whole remove-DIs project. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152545	2023-09-11 17:50:47 +01:00
Jeremy Morse	6942c64e81	[NFC][RemoveDIs] Prefer iterator-insertion over instructions Continuing the patch series to get rid of debug intrinsics [0], instruction insertion needs to be done with iterators rather than instruction pointers, so that we can communicate information in the iterator class. This patch adds an iterator-taking insertBefore method and converts various call sites to take iterators. These are all sites where such debug-info needs to be preserved so that a stage2 clang can be built identically; it's likely that many more will need to be changed in the future. At this stage, this is just changing the spelling of a few operations, which will eventually become significant once the debug-info bearing iterator is used. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152537	2023-09-11 11:48:45 +01:00
Jeremy Morse	4427407a29	[NFC][RemoveDIs] Create a new spelling of the moveBefore method As outlined in my proposal of how to get rid of debug intrinsics, this patch adds a moveBefore method that signals the caller /intends/ the order of moved instructions is to stay the same. This semantic difference has an effect on debug-info, as it signals whether debug-info needs to move with instructions or not. The patch just replaces a few calls to moveBefore with calls to moveBeforePreserving -- and the latter just calls the former, so it's all NFC right now. A future patch will add an implementation of moveBeforePreserving that takes action to correctly preserve debug-info, but that's tightly coupled with our non-instruction debug-info representation that's still being reviewed. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D156369	2023-09-07 18:37:57 +01:00
kazutakahirata	f8a1c8b7c1	[llvm] Use llvm::any_cast instead of any_cast (NFC) (#65565) This patch replaces any_cast with llvm::any_cast. This in turn allows us to gracefully switch to std::any in future by forwarding llvm::Any and llvm::any_cast to: using Any = std::any; template <class T> T any_cast(Any Value) { return std::any_cast<T>(Value); } respectively. Without this patch, it's ambiguous whether any_cast refers to std::any_cast or llvm::any_cast. As an added bonus, this patch makes it easier to mechanically replace llvm::any_cast with std::any_cast without affecting other occurrences of any_cast (e.g. in libcxx).	2023-09-07 09:07:40 -07:00
Mel Chen	26aed5b9a8	[VPlan][LoopUtils] Remove unused parameter TTI This patch removes the member TTI from VPReductionRecipe, as the generation of reduction operations no longer requires TTI. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D158148	2023-09-04 05:30:37 -07:00
Kazu Hirata	83e6931827	[llvm] Use llvm::is_contained (NFC)	2023-09-02 09:32:46 -07:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
spupyrev	b4b42bd652	Cleaning up unreachable code in CodeLayout - removing an unreachable instruction from the code (earlier code merge bug); - silencing "unused variable" warnings. Reviewed By: rahmanl Differential Revision: https://reviews.llvm.org/D158859	2023-08-28 09:22:20 -07:00
Nikita Popov	4eafc9b6ff	[IR] Treat callbr as special terminator (PR64215) isLegalToHoistInto() currently returns true for callbr instructions. That means that a callbr with one successor will be considered a proper loop preheader, which may result in instructions that use the callbr return value being hoisted past it. Fix this by adding callbr to isExceptionTerminator (with a rename to isSpecialTerminator), which also fixes similar assumptions in other places. Fixes https://github.com/llvm/llvm-project/issues/64215. Differential Revision: https://reviews.llvm.org/D158609	2023-08-25 09:20:18 +02:00
David Sherwood	c02184f286	[LoopVectorize] Allow inner loop runtime checks to be hoisted above an outer loop Suppose we have a nested loop like this: void foo(int32_t *dst, int32_t *src, int m, int n) { for (int i = 0; i < m; i++) { for (int j = 0; j < n; j++) { dst[(i * n) + j] += src[(i * n) + j]; } } } We currently generate runtime memory checks as a precondition for entering the vectorised version of the inner loop. However, if the runtime-determined trip count for the inner loop is quite small then the cost of these checks becomes quite expensive. This patch attempts to mitigate these costs by adding a new option to expand the memory ranges being checked to include the outer loop as well. This leads to runtime checks that can then be hoisted above the outer loop. For example, rather than looking for a conflict between the memory ranges: 1. &dst[(i * n)] -> &dst[(i * n) + n] 2. &src[(i * n)] -> &src[(i * n) + n] we can instead look at the expanded ranges: 1. &dst[0] -> &dst[((m - 1) * n) + n] 2. &src[0] -> &src[((m - 1) * n) + n] which are outer-loop-invariant. As with many optimisations there is a trade-off here, because there is a danger that using the expanded ranges we may never enter the vectorised inner loop, whereas with the smaller ranges we might enter at least once. I have added a HoistRuntimeChecks option that is turned off by default, but can be enabled for workloads where we know this is guaranteed to be of real benefit. In future, we can also use PGO to determine if this is worthwhile by using the inner loop trip count information. When enabling this option for SPEC2017 on neoverse-v1 with the flags "-Ofast -mcpu=native -flto" I see an overall geomean improvement of ~0.5%: SPEC2017 results (+ is an improvement, - is a regression): 520.omnetpp: +2% 525.x264: +2% 557.xz: +1.2% ...
GEOMEAN: +0.5% I didn't investigate all the differences to see if they are genuine or noise, but I know the x264 improvement is real because it has some hot nested loops with low trip counts where I can see this hoisting is beneficial. Tests have been added here: Transforms/LoopVectorize/runtime-checks-hoist.ll Differential Revision: https://reviews.llvm.org/D152366	2023-08-24 12:14:02 +00:00
Nikita Popov	d82f0b74de	[IndVars] Don't assume backedge value is instruction (PR64891) In degenerate cases, the backedge value can be folded to poison. Fixes https://github.com/llvm/llvm-project/issues/64891.	2023-08-22 10:33:33 +02:00
Nikita Popov	1c6e6432ca	[SCEVExpander] Fix incorrect reuse of more poisonous instructions (PR63763) SCEVExpander tries to reuse existing instruction with the same SCEV expression. However, doing this replacement blindly is not safe, because the instruction might be more poisonous. What we were already doing is to drop poison-generating flags on the reused instruction. But this is not the only way that more poison can be introduced. The poison-generating flag might not be directly on the reused instruction, or the poison contribution might come from something like 0 * %var, which folds to 0 but can still introduce poison. This patch fixes the issue in a principled way, by determining which values can contribute poison to the SCEV expression, and then checking whether any additional values can contribute poison to the instruction being reused. Poison-generating flags are dropped if doing that enables reuse. This is a pretty big hammer and does cause some regressions in tests, but less than I would have expected. I wasn't able to come up with a less intrusive fix that still satisfies the correctness requirements. Fixes https://github.com/llvm/llvm-project/issues/63763. Fixes https://github.com/llvm/llvm-project/issues/63926. Fixes https://github.com/llvm/llvm-project/issues/64333. Fixes https://github.com/llvm/llvm-project/issues/63727. Differential Revision: https://reviews.llvm.org/D158181	2023-08-22 09:27:07 +02:00
Nikita Popov	7ed4b7e583	[SCEVExpander] Change getRelatedExistingExpansion() to return bool (NFC) This method is only used to determine whether a related expansion exists, the actual value is unused. Clarify that by renaming get -> has and returning bool.	2023-08-21 11:51:22 +02:00
Aiden Grossman	64da0be1fc	Reland "[NFCi][MergeFunctions] Consolidate Hashing Functions" This is a reland of 28134a29fdedd8972acdfb39223571ddcc15dc59 which was reverted due to behavioral differences between 32 and 64 bit builds that have since been fixed. Differential Revision: https://reviews.llvm.org/D158217	2023-08-19 17:14:08 -07:00
Aiden Grossman	7ff7df1c62	Revert "[NFCi][MergeFunctions] Consolidate Hashing Functions" This reverts commit 28134a29fdedd8972acdfb39223571ddcc15dc59. This patch was causing build failures on multiple buildbots on 32-bit architectures. Reverting now so I can debug out-of-trunk and resubmit later.	2023-08-19 12:23:16 -07:00
Kazu Hirata	5675f44ceb	[Transforms] Remove unnecessary const from a return type (NFC)	2023-08-19 08:29:51 -07:00
Aiden Grossman	28134a29fd	[NFCi][MergeFunctions] Consolidate Hashing Functions A couple years ago, StructuralHash was created, copying the exact hashing implementation from FunctionComparator (minus a couple small details/refactorings). Since then, the hashing implementation has not diverged, but several other areas, like unit testing, have diverged significantly, with StructuralHash getting more attention in these areas. This patch aims to consolidate the two hashing functions into StructuralHash given they do the exact same thing and having less divergence in areas like unit testing would be beneficial. The original aim at creating a separate StructuralHash was to make the implementation divergent and capture additional details like instruction operands (which neither hashing implementation does currently). The MergeFunctions pass doesn't need these details, but verification of pass return values would benefit from this additional data. Setting an option to calculate these values would allow for divergent behavior where appropriate while reducing code duplication with little runtime overhead. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D158217	2023-08-18 13:15:12 -07:00
Anna Thomas	23f08af2be	[Inline] Avoid incompatible return attributes on deoptimize When updating the return type of deoptimize call during inline, we need to drop incompatible return attributes. This bug was exposed once we relaxed the constraint on adding the attributes through D156844. With that change, deoptimize calls (which are not willreturn) will start having return attributes added to them. Fixes https://github.com/llvm/llvm-project/issues/64804. Differential Revision: https://reviews.llvm.org/D158286	2023-08-18 12:55:51 -04:00
Aleksandr Popov	d6e7c162e1	[NFC][GuardUtils] Add util to extract widenable conditions This is the next preparation patch to support widening widenable conditions instead of widening branches. We've added the parseWidenableGuard util, which parses a guard condition and collects all checks existing in the expression tree: D157276 Here we are adding a util which walks through the expression tree in a similar way but looks for the widenable condition without collecting the checks. Therefore llvm::extractWidenableCondition can parse widenable branches with an arbitrary position of the widenable condition in the expression tree. llvm::parseWidenableBranch, which we are going to get rid of, is being replaced by llvm::extractWidenableCondition where possible. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D157529	2023-08-18 17:36:05 +02:00
Florian Hahn	b7a95ad467	[SimplifyCFG] Don't sink loads/stores with swifterror pointers. swifterror pointers can only be used as pointer operands of load & store instructions (and as swifterror argument of a call). Sinking loads or stores with swifterror pointer operands would require introducing a select of the pointer operands, which isn't allowed. Check for this condition in canSinkInstructions. Reviewed By: aschwaighofer Differential Revision: https://reviews.llvm.org/D158083	2023-08-17 09:59:07 +01:00
Nikita Popov	51dfe3cb3b	[IR] Add PHINode::removeIncomingValueIf() (NFC) Add an API that allows removing multiple incoming phi values based on a predicate callback, as suggested on D157621. This makes sure that the removal is linear time rather than quadratic, and avoids subtleties around iterator invalidation. I have replaced some of the more straightforward users with the new API, though there's a couple more places that should be able to use it. Differential Revision: https://reviews.llvm.org/D158064	2023-08-17 09:09:14 +02:00
Sameer Sahasrabuddhe	8dce4c56dd	[Inliner] Handle convergence control when inlining a call When a convergencectrl token is passed to a convergent call, and the called function in turn calls the entry intrinsic, the intrinsic is now replaced with the convergencectrl token. The spec requires the following check: A call from function F to function G can be inlined only if: - at least one of F or G does not make any convergent calls, or, - both F and G make the same kind of convergent calls: controlled or uncontrolled. But this change does not implement this complete check. A proper implementation requires a whole new analysis that identifies convergence in every function. For now, we skip that and just do a cursory check for the entry intrinsic. The underlying assumption is that in a compiler flow that fully implements convergence control tokens, there is no mixing of controlled and uncontrolled convergent operations in the whole program. This is a reboot of the original change D85606 by Nicolai Haehnle <nicolai.haehnle@amd.com>. Reviewed By: arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D152431	2023-08-17 09:56:25 +05:30
Noah Goldstein	4d51c6258e	[Inliner] Add return attributes to callsites not marked `willreturn`/`nounwind` The actual callsite we are adding to doesn't need to be `willreturn`/`nounwind`, only every instruction between the callsite and the return does. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D156844	2023-08-16 22:43:04 -05:00
Noah Goldstein	612a7f0b15	[Inliner] Add the callsites called function return attributes to set addable attributes We can do this by just querying attribute in the callsite itself. This is both cleaner code and produces better results. Differential Revision: https://reviews.llvm.org/D156843	2023-08-16 22:43:04 -05:00
Nikita Popov	fb0c50be5b	[MoveAutoInit] Gracefully handle auto-init annotation on unexpected instr (PR64661) Abort the transform instead of asserting. Fixes https://github.com/llvm/llvm-project/issues/64661.	2023-08-15 16:21:25 +02:00
Serguei Katkov	06dfc8400d	[Local] Mark Instruction argument of wouldInstructionBeTriviallyDead as const. NFC. wouldInstructionBeTriviallyDead is not expected to modify instruction, so mark argument as const to allow its usage in other non-modifying instructions callers. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D157834	2023-08-14 14:49:51 +07:00
Bjorn Pettersson	91157a0b26	[LegacyPM] Drop unused includes in passes no longer supporting legacy PM	2023-08-13 16:46:57 +02:00
Bjorn Pettersson	a7ee80fab2	[llvm] Drop some more typed pointer bitcasts etc.	2023-08-13 16:46:56 +02:00
Nikita Popov	5820c9257e	[SCEVExpander] Use early continue and move comment (NFC) In preparation for adding additional checks here.	2023-08-11 16:56:02 +02:00
Rahul Anand Radhakrishnan	18423c7e1f	[SCCP] Do not attempt to create constexpr for a scalable vector GEP Scalable vector GEPs are not constants and trying to create one for these GEPs causes an assertion failure. Reviewed By: nikic, paulwalker-arm Differential Revision: https://reviews.llvm.org/D157590	2023-08-11 11:06:07 +00:00
Matt Arsenault	25bc999d1f	Intrinsics: Add type overload to stacksave and stackstore This allows use with non-0 address space stacks. llvm_ptr_ty should never be used. This could use some more percolation up through mlir, but this is enough to fix existing tests. https://reviews.llvm.org/D156666	2023-08-09 18:33:11 -04:00
Alexander Kornienko	0b779b0daa	Revert "[AggressiveInstCombine] Fold strcmp for short string literals" This reverts commit 5dde755188e34c0ba5304365612904476c8adfda, cbfcf90152de5392a36d0a0241eef25f5e159eef and 8981520b19f2d2fe3d2bc80cf26318ee6b5b7473 due to a miscompile introduced in 8981520b19f2d2fe3d2bc80cf26318ee6b5b7473 (see https://reviews.llvm.org/D154725#4568845 for details) Differential Revision: https://reviews.llvm.org/D157430	2023-08-08 22:53:45 +02:00
Bjorn Pettersson	4ce7c4a92a	[llvm] Drop some typed pointer handling/bitcasts Differential Revision: https://reviews.llvm.org/D157016	2023-08-03 22:54:33 +02:00
Mel Chen	425e9e81a0	[LV] Rename the Select[I|F]Cmp reduction pattern to [I|F]AnyOf. (NFC) Regarding this NFC change, please refer to the discussion in this thread. https://reviews.llvm.org/D150851#4467261 Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D155786	2023-08-03 00:37:19 -07:00
Bjorn Pettersson	fd05c34b18	Stop using legacy helpers indicating typed pointer types. NFC Since we no longer support typed LLVM IR pointer types, the code can be simplified into for example using PointerType::get directly instead of using Type::getInt8PtrTy and Type::getInt32PtrTy etc. Differential Revision: https://reviews.llvm.org/D156733	2023-08-02 12:08:37 +02:00
Jordan Rupprecht	f5b5a30858	Revert "[CodeGenPrepare][NFC] Update the dominator tree instead of rebuilding it" This reverts commit 0b1d1cdb89322c277baf5221218a830195fef9d4. It causes a clang crash. Details will be posted to D153638.	2023-08-01 23:08:55 -07:00
Momchil Velikov	0b1d1cdb89	[CodeGenPrepare][NFC] Update the dominator tree instead of rebuilding it Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153638	2023-08-01 18:07:03 +01:00
spupyrev	bc59faa863	A new code layout algorithm for function reordering [2/3] We are bringing a new algorithm for function layout (reordering) based on the call graph (extracted from a profile data). The algorithm is an improvement on top of a known heuristic, C^3. It tries to co-locate hot and frequently executed together functions in the resulting ordering. Unlike C^3, it explores a larger search space and has an objective closely tied to the performance of instruction and i-TLB caches. Hence, the name CDS = Cache-Directed Sort. The algorithm can be used at the linking or post-linking (e.g., BOLT) stage. The algorithm shares some similarities with C^3 and an approach for basic block reordering (ext-tsp). It works with chains (ordered lists) of functions. Initially all chains are isolated functions. On every iteration, we pick a pair of chains whose merging yields the biggest increase in the objective, which is a weighted combination of frequency-based and distance-based locality. That is, we try to co-locate hot functions together (so they can share the cache lines) and functions frequently executed together. The merging process stops when there is only one chain left, or when merging does not improve the objective. In the latter case, the remaining chains are sorted by density in the decreasing order. Complexity: We regularly apply the algorithm for large data-center binaries containing 10K+ (hot) functions, and the algorithm takes only a few seconds. For some extreme cases with 100K-1M nodes, the runtime is within minutes. Perf-impact: We extensively tested the implementation on a benchmark of isolated binaries and prod services. The impact is measurable for "larger" binaries that are front-end bound: the cpu time improvement (on top of C^3) is in the range of [0% .. 1%], which is a result of a reduced i-TLB miss rate (by up to 20%) and i-cache miss rate (up to 5%).
Reviewed By: rahmanl Differential Revision: https://reviews.llvm.org/D152834	2023-07-27 09:20:53 -07:00
Ramkumar Ramachandra	23caf9e9e7	Local: fix debug output of replaceDominatedUsesWith() The debug output of replaceDominatedUsesWith() prints incorrect information, and the user is left confused about what exactly was replaced. Fix this. Differential Revision: https://reviews.llvm.org/D156318	2023-07-27 13:23:38 +01:00
Ivan Kosarev	e9df4c9892	[ADT] Support iterating size-based integer ranges. It seems the ranges start with 0 in most cases. Reviewed By: dblaikie, gchatelet Differential Revision: https://reviews.llvm.org/D156135	2023-07-26 16:28:41 +01:00
Teresa Johnson	5986559caa	[SimplifyCFG] Guard branch folding by speculate blocks flag Guard FoldBranchToCommonDest in SimplifyCFG with the SpeculateBlocks flag as it can also speculate instructions. This was split out of D155997. Differential Revision: https://reviews.llvm.org/D156194	2023-07-25 06:46:19 -07:00
Nuno Lopes	9007d0e0b6	[UnifyLoopExits] Use poison instead of undef as placeholder [NFC] This pass creates phi nodes where only one of the incoming values is used. The remaining ones can be poison.	2023-07-22 22:38:10 +01:00
Nuno Lopes	3bc74bed64	[Inline] Use poison instead of undef as placeholder [NFC]	2023-07-22 13:23:40 +01:00
Nuno Lopes	9f90669571	[SimplifyCFG] Use poison instead of undef as placeholder [NFC] This is used in a phi node that is created for which only 1 value is accessed (the non-poison)	2023-07-22 12:44:21 +01:00
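
Note on c02184f286 ([LoopVectorize] hoisted runtime checks): the trade-off that commit message describes can be illustrated with a small standalone sketch. This is not LLVM code; `rangesOverlap`, `rowsPassingInnerCheck`, and `hoistedCheckPasses` are hypothetical names chosen for illustration, and the sketch only models the range comparisons quoted in the message, not the vectoriser itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// True if the half-open ranges [aBegin, aEnd) and [bBegin, bEnd) overlap.
// std::less is used so the comparison is well defined for any two pointers.
static bool rangesOverlap(const int32_t *aBegin, const int32_t *aEnd,
                          const int32_t *bBegin, const int32_t *bEnd) {
  std::less<const int32_t *> before;
  return before(aBegin, bEnd) && before(bBegin, aEnd);
}

// Per-row check, evaluated once per outer iteration i:
//   &dst[i * n] -> &dst[i * n + n]  versus  &src[i * n] -> &src[i * n + n].
// Returns how many rows would be allowed onto the vectorised path.
std::size_t rowsPassingInnerCheck(const int32_t *dst, const int32_t *src,
                                  int m, int n) {
  assert(m >= 0 && n >= 0);
  std::size_t passing = 0;
  for (int i = 0; i < m; ++i) {
    const std::size_t row = static_cast<std::size_t>(i) * n;
    if (!rangesOverlap(dst + row, dst + row + n, src + row, src + row + n))
      ++passing;
  }
  return passing;
}

// Hoisted check, evaluated once before the outer loop on the expanded ranges:
//   &dst[0] -> &dst[((m - 1) * n) + n]  versus  &src[0] -> &src[((m - 1) * n) + n].
bool hoistedCheckPasses(const int32_t *dst, const int32_t *src, int m, int n) {
  assert(m >= 0 && n >= 0);
  const std::size_t total = static_cast<std::size_t>(m) * n;
  return !rangesOverlap(dst, dst + total, src, src + total);
}
```

With two separate buffers both checks pass. If `src` is `dst` shifted by exactly one row, every per-row check still passes, but the single hoisted check fails, which is the "may never enter the vectorised inner loop" case the message warns about.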


7006 Commits
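
Note on 51dfe3cb3b ([IR] Add PHINode::removeIncomingValueIf()): the message says a predicate-based removal keeps the work linear and avoids iterator-invalidation subtleties. A minimal sketch of that idea follows, using a hypothetical `ToyPhi` type with parallel value/block arrays rather than LLVM's actual `PHINode`; the index-taking predicate is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a phi node: parallel arrays of incoming values
// and the names of the predecessor blocks they come from.
struct ToyPhi {
  std::vector<int> values;
  std::vector<std::string> blocks;

  // Removes every incoming entry whose index satisfies shouldRemove.
  // One pass compacts the survivors in place, so the cost is linear in the
  // number of entries, instead of shifting the tail once per removed entry.
  // Returns the number of entries removed.
  template <typename Pred>
  std::size_t removeIncomingIf(Pred shouldRemove) {
    assert(values.size() == blocks.size());
    std::size_t kept = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
      // Entry i has not been moved yet, so the predicate may inspect it.
      if (shouldRemove(i))
        continue;
      if (kept != i) {
        values[kept] = values[i];
        blocks[kept] = std::move(blocks[i]);
      }
      ++kept;
    }
    const std::size_t removed = values.size() - kept;
    values.resize(kept);
    blocks.resize(kept);
    return removed;
  }
};
```

A caller would pass a lambda such as `[&](std::size_t i) { return phi.blocks[i] == "dead"; }`; because the predicate is evaluated on each index before that entry is moved, the caller never has to reason about indices shifting during removal.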