llvm-project

Author	SHA1	Message	Date
Alexey Bataev	d65cc85977	[SLP]Do not schedule instructions with constants/argument/phi operands and external users. No need to schedule entry nodes where all instructions are not memory read/write instructions and their operands are either constants, or arguments, or phis, or instructions from other blocks, or their users are phis or from the other blocks. The resulting vector instructions can be placed at the beginning of the basic block without scheduling (if operands do not need to be scheduled) or at the end of the block (if users are outside of the block). It may save some compile time and scheduling resources. Differential Revision: https://reviews.llvm.org/D121121	2022-03-17 11:03:45 -07:00
Julian Lettner	22570bac69	Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121736	2022-03-17 10:47:13 -07:00
Ellis Hoag	84c6689b15	[AlwaysInliner] Check inliner errors even without asserts When we build clang without asserts we should still check the result of `InlineFunction()` to be sure there wasn't an error. Otherwise we could incorrectly merge attributes in the next line. This also removes a redundant call to `getCaller()`. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D121722	2022-03-17 10:16:23 -07:00
Fraser Cormack	fe74183564	[Coroutines][NFC] Format line to 80 cols	2022-03-17 15:34:24 +00:00
Marco Elver	cbe1e67ead	[Instruction] Introduce getAtomicSyncScopeID() An analysis may just be interested in checking if an instruction is atomic but system scoped or single-thread scoped, like ThreadSanitizer's isAtomic(). Unfortunately Instruction::isAtomic() can only answer the "atomic" part of the question, but to also check scope becomes rather verbose. To simplify and reduce redundancy, introduce a common helper getAtomicSyncScopeID() which returns the scope of an atomic operation. Start using it in ThreadSanitizer. NFCI. Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D121910	2022-03-17 14:59:37 +01:00
Florian Hahn	151c144350	[LV] Use usesScalars in widenPHIInstruction. This uses the existing VPlan helpers to check whether there are scalar uses of a phi recipe. It removes one of the few remaining dependencies on the cost model from VPlan code generation. Depends on D121612. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121613	2022-03-17 13:16:32 +00:00
Florian Hahn	a6e70e4056	[VPlan] VPInterleaveRecipe only requires the first lane of the address. VPInterleaveRecipe only uses the first lane of the address. Add onlyFirstLaneUsed implementation. This is needed for a follow-up patch. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121612	2022-03-17 11:56:43 +00:00
Nikita Popov	1dbeb64493	[SLP] Avoid unnecessary getIncomingValueForBlock() call (NFC) This code just wants to check all incoming values, we don't care what the incoming block is here.	2022-03-17 12:23:46 +01:00
Nikita Popov	4010a7a5d0	Reapply [InstCombine] Support switch in phi to cond fold Reapply with an explicit check for multi-edges, as the expected behavior of multi-edge dominance is unclear (D120811). ----- For conditional branches, we know the value is i1 0 or i1 1 along the outgoing edges. For switches we can apply exactly the same optimization, just with the known values determined by the switch cases.	2022-03-17 10:03:09 +01:00
Alexey Bataev	150ea76543	Revert "[SLP]Do not schedule instructions with constants/argument/phi operands and external users." This reverts commit 1eeb2bfe727323332800e8d390f2f8c63c953779 to fix a bug reported in https://reviews.llvm.org/D121121	2022-03-16 13:54:59 -07:00
Florian Hahn	470a975c84	[ConstraintElimination] Add missing dominance check. When dealing with an unconditional branch, the condition can only be added if BB properly dominates the successor.	2022-03-16 20:01:24 +00:00
Malhar Jajoo	a36d269658	[VPlan] Avoid collecting scalars for SVE This patch ensures scalars (except for uniforms) are no longer collected (prior to LVP planning phase) for scalable vectorization. This is to avoid the chances of generating scalarized instructions later (during LVP execute phase) as they are not supported for scalable vectorization. Relevant test has also been added. Differential Revision: https://reviews.llvm.org/D121452	2022-03-16 16:33:34 +00:00
Nikita Popov	d7cf7ec05d	[SROA] Handle over-large loads during presplitting When a load extends past the extent of the alloca, SROA will restrict the slice size to extend to the end of the alloca only. However, presplitting was asserting that the load size and the slice size match exactly, which does not hold in this case. Relax the assertion to only require that the load size is greater than or equal to the slice size.	2022-03-16 15:41:11 +01:00
Florian Hahn	f473d4aa80	[ConstraintElimination] Support BBs with single successor in CanAdd. If BB has a single successor, conditions can be added safely.	2022-03-16 14:13:52 +00:00
Alexey Bataev	1eeb2bfe72	[SLP]Do not schedule instructions with constants/argument/phi operands and external users. No need to schedule entry nodes where all instructions are not memory read/write instructions and their operands are either constants, or arguments, or phis, or instructions from other blocks, or their users are phis or from the other blocks. The resulting vector instructions can be placed at the beginning of the basic block without scheduling (if operands do not need to be scheduled) or at the end of the block (if users are outside of the block). It may save some compile time and scheduling resources. Differential Revision: https://reviews.llvm.org/D121121	2022-03-16 06:05:43 -07:00
Florian Hahn	e5822ded56	[FunctionAttrs] Infer argmemonly. This patch adds initial argmemonly inference, by checking the underlying objects of locations returned by MemoryLocation. I think this should cover most cases, except function calls to other argmemonly functions. I'm not sure if there's a reason why we don't infer those yet. Additional argmemonly can improve codegen in some cases. It also makes it easier to come up with a C reproducer for 7662d1687b09 (already fixed, but I'm trying to see if C/C++ fuzzing could help to uncover similar issues.) Compile-time impact: NewPM-O3: +0.01% NewPM-ReleaseThinLTO: +0.03% NewPM-ReleaseLTO+g: +0.05% https://llvm-compile-time-tracker.com/compare.php?from=067c035012fc061ad6378458774ac2df117283c6&to=fe209d4aab5b593bd62d18c0876732ddcca1614d&stat=instructions Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D121415	2022-03-16 10:24:33 +00:00
Nikita Popov	20531b3a6b	[RelLookupTableConverter] Avoid querying TTI for declarations This code queries TTI on a single function, which is considered to be representative. This is a bit odd, but probably fine in practice. However, I think we should at least avoid querying declarations, which e.g. will generally lack target attributes, and for which we don't seem to ever query TTI in other places.	2022-03-16 10:39:28 +01:00
Philip Reames	1cfa986d68	[SLP] Optionally preserve MemorySSA This initial patch adds code to preserve MemorySSA through a run of SLP vectorizer. The eventual plan is to use MemorySSA to accelerate SLP's memory dependence checking, but we're a ways from that. In particular, this patch is correct, but really slow. It's being landed so that we can work incrementally in tree, not because it's expected to be useful to anyone just yet. The broader effort is being tracked in https://github.com/llvm/llvm-project/issues/54256. It's worth noting explicitly that this may not work out, and if not, we will be reverting all of the MSSA support in SLP at some point in the next few weeks. Differential Revision: https://reviews.llvm.org/D117926	2022-03-15 16:36:15 -07:00
Florian Hahn	014f5bcf7a	[FunctionAttrs] Replace MemoryAccessKind with FMRB. Update FunctionAttrs to use FunctionModRefBehavior instead of MemoryAccessKind. This allows for adding support for inferring argmemonly and others, see D121415. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D121460	2022-03-15 19:35:54 +00:00
Sanjay Patel	598721f866	[InstCombine] try harder to propagate 'nsz' through fneg-of-select This can be viewed as swapping the select arms: https://alive2.llvm.org/ce/z/jUvFMJ ...so we don't have the 'nsz' problem with the more general fold. This unlocks other folds for the motivating fabs example. This was discussed in issue #38828.	2022-03-15 11:05:29 -04:00
Simon Pilgrim	7e4cf582cf	[InstCombine] Add general constant support to eq/ne icmp(add(X,C1),add(Y,C2)) -> icmp(add(X,C1-C2),Y) fold A further extension for Issue #32161 For eq/ne comparisons - the sign mismatch and bounds constraints are redundant, so if that fold fails, fall back and just fold the constants directly. https://alive2.llvm.org/ce/z/cdodNQ The loop rotation test change looks mostly benign - the backend doesn't seem to suffer? https://gcc.godbolt.org/z/dErMY78To Differential Revision: https://reviews.llvm.org/D121551	2022-03-15 14:17:38 +00:00
Simon Pilgrim	7262eacd41	Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" Many of the build bots are complaining: Unknown command line argument '-lower-global-dtors'	2022-03-15 13:01:35 +00:00
Nikita Popov	875782bd9e	[OpenMPOpt] Avoid pointer element type access during region merging Hardcode the function type as ParallelTask, which is the guaranteed pointee type of this runtime function argument (if pointee types exist). The elimination of the callee bitcast is left for InstCombine. Differential Revision: https://reviews.llvm.org/D120885	2022-03-15 09:52:46 +01:00
Florian Hahn	ca1b2fc9fb	[LV] Remove LoopVectorBody from InnerLoopVectorizer. (NFCI) Update places still referencing LoopVectorBody to use the vector loop to get the vector loop header. This is needed to move vector loop code-generation to VPlan completely, which in turn is needed to model pre-header & exit blocks in VPlan as well.	2022-03-15 08:22:31 +00:00
Julian Lettner	9c542a5a4e	Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121327	2022-03-14 17:51:18 -07:00
Andrew Browne	dbf8c00b09	[DFSan] Remove trampolines to unblock opaque pointers. (Reland with fix) https://github.com/llvm/llvm-project/issues/54172 Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D121250	2022-03-14 16:03:25 -07:00
Andrew Litteken	228cc2c38b	[IROutliner] Ensure merged PHINodes respect order and incoming blocks, not just incoming values When matching PHINodes while merging functions, the IROutliner only checks that an incoming value exists in a phi node in the overall function. It doesn't check the length, the order, or that the incoming block also matches. In the given example, we see that both phi nodes have the same incoming values, but from different blocks. The fix is to enforce a stricter match of the incoming value, and of the incoming block as well, when matching the created phi nodes. Reviewers: paquette Differential Revision: https://reviews.llvm.org/D121310	2022-03-14 16:48:21 -05:00
Craig Topper	ce78e68261	[InstCombine] Fold select based logic of fcmps with same operands when FMF is present. If we have a logical and/or in select form and the true/false operand is an fcmp with poison generating FMF, we won't be able to fold it to an and/or instruction. This prevents us from optimizing the case where it is a logical operation of two fcmps with identical operands. This patch adds explicit checks for this case that don't rely on converting to and/or to do the optimization. It reuses the existing foldLogicOfFCmps, but adds a new flag to disable the other combine that is inside that function. FMF flags from the two FCmps are intersected using the logic added in D121243. The FIXME has been updated to indicate that we can only use a union for the non-select form. This allows us to optimize cases like this from compare-fp-3.c in the gcc torture suite with fast math. void test1 (float x, float y) { if ((x==y) && (x!=y)) link_error0(); } Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D121323	2022-03-14 14:45:07 -07:00
Nick Desaulniers	236695e70c	[IRLinker] make IRLinker::AddLazyFor optional (llvm::unique_function). NFC 2 of the 3 callsites of IRMover::move() pass empty lambda functions. Just make this parameter llvm::unique_function. Came about via discussion in D120781. Probably worth making this change regardless of the resolution of D120781. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D121630	2022-03-14 14:37:34 -07:00
Andrew Browne	edc33fa569	Revert "[DFSan] Remove trampolines to unblock opaque pointers." This reverts commit 84af90336fed36f7dfdc468ded39236f32bbb82e.	2022-03-14 13:47:41 -07:00
Andrew Browne	84af90336f	[DFSan] Remove trampolines to unblock opaque pointers. https://github.com/llvm/llvm-project/issues/54172 Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D121250	2022-03-14 13:39:49 -07:00
Andrew Litteken	c79ab1065e	[IROutliner] Separate split PHI nodes from multiple exits by different outlinable regions. The IR Outliner is supposed to extract the outputs contained in an external phi node and place them into a phi node contained within the outlined function. However, when the output values of two outlined functions with two different output sets are contained within the same phi node, they are counted as the same exit path when first analyzed. In reality, these create two different phi nodes, creating an inconsistency, resulting in a mismatch in the expected number of output paths and a crash. This fixes that counting when analyzing the outputs by also analyzing the incoming blocks rather than just the incoming values. Reviewer: paquette Differential Revision: https://reviews.llvm.org/D121313	2022-03-14 14:56:59 -05:00
Florian Hahn	4a0481e981	[LV] Check for users of truncated IVs, add more detailed comment. Add missing outside user check for truncated IVs. Also hoist the code in the helper with additional explanations. Fixes #54370.	2022-03-14 19:39:30 +00:00
Teresa Johnson	fee0bde4c6	[WPD] Extend checking mode to support fallback to indirect call Extend -wholeprogramdevirt-check to support both the existing trapping mode on an incorrect devirtualization, as well as a new mode to fall back to an indirect call on a mismatch. The new mode is useful in cases where we want to enable devirtualization but cannot fully guarantee whole program visibility (e.g. in the case where LTO has been disabled for a small set of objects that could potentially override virtual methods without having a symbol reference to anything in the base class including the vtable). Remove !prof and !callees metadata (which are used by indirect call promotion) from both the new direct call and the fallback indirect call (so that we don't perform another round of promotion on the latter). Also remove it from the direct call in the non-fallback cases, which was an oversight, although it didn't seem to cause any issues. Add tests for the metadata removal covering the various cases. Differential Revision: https://reviews.llvm.org/D121419	2022-03-14 10:16:28 -07:00
Andrew Litteken	3c90812f3b	[IROutliner] Avoid reusing PHINodes that have already been matched when merging outlined functions' phi node blocks When there are two external phi nodes for two different outlined regions, when compressing the created phi nodes between the two regions, the matching for the second phi node in the second region matches the first phi node created for the first region rather than the second phi node created for the first region. This adds an extra output path where there should not be one. The fix is to ignore phi nodes that have already been matched for each region. Reviewer: paquette Differential Revision: https://reviews.llvm.org/D121312	2022-03-14 12:00:01 -05:00
Nikita Popov	8361c5da30	[SLPVectorizer] Handle external load/store pointer uses with opaque pointers In this case we may not generate a bitcast, so the new load/store becomes the external user.	2022-03-14 16:55:09 +01:00
Florian Hahn	d621ae30e2	[LV] Remove dead Loop argument from emitMinimumVector... (NFC) The argument is not used, remove it.	2022-03-14 15:47:40 +00:00
Florian Hahn	3ee2d908a9	[LV] Remove dead Loop argument from emitSCEVChecks. (NFC) The argument is not used, remove it.	2022-03-14 13:00:03 +00:00
Nikita Popov	ce6ca00a92	[CoroSplit] Avoid self-replacement With opaque pointers, the bitcast might be a no-op, and this can end up trying to replace a value with itself, which is illegal.	2022-03-14 13:53:31 +01:00
Florian Hahn	8896c36624	[LV] Do not set insert point in completeLoopSkeleton. (NFCI) The insertion point for the builder used during VPlan code generation is set during code generation. Setting the insert point here is dead code and can be removed.	2022-03-14 12:21:26 +00:00
Nikita Popov	3ec44c22b1	[DeadArgElim] Guard against function type mismatch If the call function type and function type don't match, we should consider the function live (there is effectively a bitcast sitting in between).	2022-03-14 13:03:04 +01:00
Nikita Popov	cf18ec445d	[GVN] Check load type in select PRE This is no longer implicitly guaranteed with opaque pointers.	2022-03-14 12:46:54 +01:00
Benoit Jacob	9879c555f2	Expose ScalarizerPass options to C++ (not just commandline) Context: I needed this for https://github.com/google/iree/pull/8474 . I found that TSan instrumentation expects vector sizes to be <= 16, and in my project (IREE) we have tests with higher vector sizes. That left some test functions uninstrumented, resulting in crashes as instrumented code called into them. Differential Revision: https://reviews.llvm.org/D121182	2022-03-14 12:00:35 +01:00
Florian Hahn	1c0fc1f074	[VPlan] Ensure each iv user is only visited once in transform. If a recipe has multiple uses of an IV, we crash. It causes a crash when building llvm-test-suite. Exposed by 95f76bff1c40bc1c2f.	2022-03-13 21:42:17 +00:00
Florian Hahn	95f76bff1c	[LV] Create & use VPScalarIVSteps for all scalar users. This patch is a follow-up to D115953. It updates optimizeInductions to also introduce new VPScalarIVStepsRecipes if an IV has both vector and scalar uses. It updates all uses that only need scalar values to use the newly created recipe for the scalar steps. This completes untangling of VPWidenIntOrFpInductionRecipe code-generation. Now the recipe only creates the widened vector values, as it says on the tin. The code to generate IR has been moved directly to VPWidenIntOrFpInductionRecipe::execute. Note that the recipe has been updated to hold a reference to ScalarEvolution, which is needed to expand the step, until we can place the corresponding SCEV expansion in the pre-header. Depends on D120827. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D120828	2022-03-13 17:15:24 +00:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Johannes Doerfert	85daf6973d	[Attributor] Remove capture tracker usage and follow uses explicitly Before we used the capture tracker to follow pointer uses, now we do it explicitly ourselves through the Attributor API. There are multiple benefits: For one, the boilerplate is cut down by a lot. The class, potential copies vector, etc. are all no longer needed. We also avoid explicitly looking through memory here, something that was duplicated and should only live in the `checkForAllUses` helper. More importantly, as we do simplifications we need to make sure all parties are in sync when they reason about uses. The old way did not allow us to do this but the new one does as every use visiting AA goes through `checkForAllUses` now.	2022-03-11 22:56:16 -06:00
Johannes Doerfert	f44f60a297	[Attributor] Avoid replacing return operands twice As replacements will become more complex it is better to have a single AA responsible for replacing a use. Before this patch AAValueSimplify* and AAValueSimplifyReturned could both try to replace the returned value. The latter was marginally better for the old pass manager when a function was already carrying a `returned` attribute and when the context of the return instruction was important. The second shortcoming was resolved by looking for return attributes in the AAValueSimplifyCallSiteReturned initialization. The old PM impact is not concerning. This is yet another step towards the removal of AAReturnedValues, the very first AA we should now try to eliminate due to the overlapping logic with value simplification.	2022-03-11 21:55:19 -06:00
Johannes Doerfert	55a970fbd4	[Attributor][FIX] Make sure to not ignore non-load users of stores When we look through memory for a store we used to allow any other use of the memory that is reachable. This is generally OK but we need to make sure to actually let the user look at these properly. For now, we simply require loads (via exact reloads).	2022-03-11 18:41:13 -06:00
Johannes Doerfert	f3ad8cf00e	[Attributor] Cleanup manifest and liveness for CGSCC passes There was some ad-hoc handling of liveness and manifest to avoid breaking CGSCC guarantees. Things always slipped through though. This cleanup will: 1) Prevent us from manifesting any "information" outside the CGSCC. This might be too conservative but we need to opt in to annotation, not try to avoid some problematic ones. 2) Avoid running any liveness analysis outside the CGSCC. We did have some AAIsDeadFunction handling to this end but we need this for all AAIsDead classes. The reason is that AAIsDead information is only correct if we actually manifest it, since we don't (see point 1) we cannot actually derive/use it at all. We are currently trying to avoid running any AA updates outside the CGSCC but that seems to impact things quite a bit. 3) Assert, don't check, that our modifications (during cleanup) modify only CGSCC functions.	2022-03-11 16:46:02 -06:00


30002 Commits