llvm-project

Author	SHA1	Message	Date
Florian Hahn	e47d220230	[LV] Use getVectorLoopRegion to retrieve header. (NFC) Update all places that currently assume the entry block to the plan is also the vector loop header to use getVectorLoopRegion instead. getVectorLoopRegion will keep doing the right thing when the pre-header is modeled explicitly (and becomes the new entry block in the plan).	2022-03-25 16:57:12 +00:00
Philip Reames	d9756fa723	[slp] Factor out a lambda to avoid duplicating code a third time in upcoming patch [nfc]	2022-03-25 09:02:39 -07:00
Fraser Cormack	2e44b7872b	[VectorCombine] Insert addrspacecast when crossing address space boundaries We can not bitcast pointers across different address spaces. This was previously fixed in D89577 but then in D93229 an enhancement was added which peeks further through the pointer operand, opening up the possibility that address-space violations could be introduced. Instead of bailing as the previous fix did, simply insert an addrspacecast instruction. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D121787	2022-03-24 19:08:08 +00:00
Simon Pilgrim	597aefa89c	Fix unused variable warning by embedding inside assertion	2022-03-24 17:41:24 +00:00
Florian Hahn	46432a0088	[VPlan] Add VPWidenPointerInductionRecipe. This patch moves pointer induction handling from VPWidenPHIRecipe to its own recipe. In the process, it adds all information required to generate code for pointer inductions without relying on Legal to access the list of induction phis. Alternatively VPWidenPHIRecipe could also take an optional pointer to InductionDescriptor. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121615	2022-03-24 14:58:45 +00:00
Alexey Bataev	20973c0841	[SLP][NFC]Fix param name in comments, NFC.	2022-03-24 05:58:42 -07:00
Vasileios Porpodas	39aa202aff	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit e6ead19b774718113007ecb1a4449d7af0cbcfeb.	2022-03-23 18:32:17 -07:00
Arthur Eubanks	e6ead19b77	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash." This reverts commit 27bd8f94928201f87f6b659fc2228efd539e8245. Causes crashes, see comments in D121973	2022-03-23 10:57:45 -07:00
serge-sans-paille	1b89c83254	Cleanup includes: Transforms/Instrumentation & Transforms/Vectorize Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D122181	2022-03-23 11:06:13 +01:00
Vasileios Porpodas	27bd8f9492	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit f7d7d2a08d16356c57f6d2d36bc2fc0589a55df9.	2022-03-22 16:41:55 -07:00
Arthur Eubanks	f7d7d2a08d	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads."" This reverts commit 79613185d305013de743cdbd6690e4d77c8af27e. Causes crashes, see comments in https://reviews.llvm.org/D121973.	2022-03-22 13:33:49 -07:00
Florian Hahn	50c8588e44	[LV] Remove Loop argument from createInductionResumeValues (NFCI). createInductionResumeValues uses its loop argument only to get the pre-header, but the pre-header is already known (we created/cached it earlier). Remove the unneeded loop argument.	2022-03-22 14:23:12 +00:00
Vasileios Porpodas	79613185d3	Recommit "[SLP] Fix lookahead operand reordering for splat loads." Original review: https://reviews.llvm.org/D121354 The original commit 9136145eb019e1d18c966d4d06a3df349b88cc14 broke the build on several targets. Differential Revision: https://reviews.llvm.org/D121973	2022-03-21 15:57:32 -07:00
Philip Reames	ee7324b898	Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc]	2022-03-21 10:01:40 -07:00
Alexey Bataev	79a182371e	[SLP]Make stricter check for instructions that do not require scheduling. Need to check that the instructions with external operands can be reordered safely before actually excluding them from the scheduling.	2022-03-21 06:09:12 -07:00
Sophia	72bde608d2	[LV] Fix typo in comment Reviewed by: fhahn (Florian Hahn) Differential Revision: https://reviews.llvm.org/D121781	2022-03-21 20:30:05 +08:00
Florian Hahn	0ebac76e6e	[LV] Remove unneeded Loop argument from completeLoopSkeleton. (NFCI) completeLoopSkeleton uses its loop argument only to get the pre-header, but the pre-header is already known (we created/cached it earlier). Remove the unneeded loop argument.	2022-03-21 10:07:25 +00:00
Florian Hahn	487629cc61	[LV] Remove dead Loop argument from emitMemRuntimeChecks. (NFC)	2022-03-20 21:01:15 +00:00
Philip Reames	b7806c8b37	[SLP] Explicit track required stacksave/alloca dependency The semantics of an inalloca alloca instruction requires that it not be reordered with a preceding stacksave intrinsic call. Unfortunately, there's no def/use edge or memory dependence edge. (The memory point is slightly subtle, but in general a new allocation can't alias with a call which executes strictly before it comes into existence.) I'd tried to tackle this same case previously in 689babdf6, but the fix chosen there turned out to be incomplete. As such, this change contains a full revert of the first fix attempt. This was noticed when investigating problems which surfaced with D118538, but this is definitely an existing bug. This time around, I managed to reduce a couple of additional cases, including one which was being actively miscompiled even without the new scheduling change. (See test diffs) Compile time wise, we only spend extra time when seeing a stacksave (rare), and even then we walk the block at most once per schedule window extension. Likely a non-issue.	2022-03-20 13:58:45 -07:00
Kazu Hirata	bce1bf0ee2	[Transform] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)	2022-03-20 10:41:22 -07:00
Philip Reames	6253b77da9	[SLP] Respect control dependence within a block during scheduling This fixes an active miscompile visible in the test changes. The basic problem is that the scheduling dependency graph didn't have any edges for control dependence within a single basic block. The result is that we could (and in some rare cases did) perform reorderings within a block which could introduce new undefined behavior along paths which didn't previously contain any. Impact wise, we have two major cases where control is not guaranteed to reach a later instruction in the block: may throw calls, and calls containing infinite loops. * The former case was mostly covered by the memory dependencies, and triggering it requires a function which can throw, but not write to memory. In theory, such a case is possible, but not likely in practice. * The latter case is likely more of an issue in practice. After this code was first written, we changed the IR semantics to allow well defined infinite loops without satisfying mustprogress. Even for C/C++ - which do imply mustprogress - recent changes to how we treat atomics (e.g. an atomic read does not always imply a write) could expose this issue. I'm a bit shocked we don't seem to have a bug report which hit this in real code actually. Compile time wise, this results in a single extra scan of the scheduling window in the common case. Since we stop scanning at the next instruction which isn't guaranteed to execute, no matter what order we traverse instructions in, we scan the block once. The exception to this is that when we extend the scheduling window downwards, we invalidate all dependencies, and thus rescan. So the potentially expensive case is when we have a call in a big schedule window which is frequently extended. We could optimize this case (by caching the last instruction not guaranteed to transfer execution and scanning only the extended window, starting there), but I decided to leave the complexity until it mattered.
That same case is already degenerate with memory dependences which is more expensive than the control dependence scan. We could also consider combining the memory dependence and control dependence sets to reduce memory usage, but since it complicates the code slightly and makes debugging a bit harder, I went with the simplest scheme for now. This was noticed while trying to understand the failures reported against D118538, but is not otherwise related to that change.	2022-03-19 13:36:24 -07:00
Florian Hahn	1a820ff039	[LV] Remove unnecessary uses of Loop* (NFC). Update functions that previously took a loop pointer but only to get the pre-header. Instead, pass the block directly. This removes the requirement for the loop object to be created up-front.	2022-03-19 20:18:47 +00:00
Philip Reames	1093949cff	[SLP] Add comment clarifying assumption that tripped me up [NFC] I keep thinking this assumption is probably exploitable for a bug in the existing implementation, but all of my attempts at writing a test case have failed. So for the moment, just document this very subtle assumption.	2022-03-18 11:40:19 -07:00
Kazu Hirata	3e0f7c7881	[Vectorize] Fix an 'unused function' warning This patch fixes: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3917:13: error: unused function 'needToScheduleSingleInstruction' [-Werror,-Wunused-function]	2022-03-18 11:24:57 -07:00
Kazu Hirata	b3d8c0d069	[Vectorize] Fix an 'unused variable' warning This patch fixes: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8148:18: error: unused variable 'SDTE' [-Werror,-Wunused-variable]	2022-03-18 11:24:54 -07:00
Philip Reames	8f108c32bc	Revert "[SLP] Optionally preserve MemorySSA" This reverts commit 1cfa986d68e2f04854ef30c432b8aa28e13a9706. See https://github.com/llvm/llvm-project/issues/54256 for why I'm discontinuing the project. Separately, it turns out that while this patch does correctly preserve MSSA, it's correct only at the end of the pass; not between vectorization attempts. Even if we decide to resurrect this, we'll need to fix that before reapplying.	2022-03-18 10:45:59 -07:00
Vasileios Porpodas	9136145eb0	Revert "[SLP] Fix lookahead operand reordering for splat loads." due to build failures This reverts commit 5efa78985bf5cbba1c4346ba41a16435fc516446.	2022-03-17 18:22:04 -07:00
Vasileios Porpodas	5efa78985b	[SLP] Fix lookahead operand reordering for splat loads. Splat loads are inexpensive in X86. For a 2-lane vector we need just one instruction: `movddup (%reg), xmm0`. Using the standard Splat score leads to worse code. This patch adds a new score dedicated for splat loads. Please note that a splat is usually three IR instructions: - It is usually a load and 2 inserts: %ld = load double, double* %gep %ins1 = insertelement <2 x double> poison, double %ld, i32 0 %ins2 = insertelement <2 x double> %ins1, double %ld, i32 1 - But it can also be a load, an insert and a shuffle: %ld = load double, double* %gep %ins = insertelement <2 x double> poison, double %ld, i32 0 %shf = shufflevector <2 x double> %ins, <2 x double> poison, <2 x i32> zeroinitializer Because of this some of the lit tests contain more IR instructions. Differential Revision: https://reviews.llvm.org/D121354	2022-03-17 18:05:54 -07:00
Alexey Bataev	d65cc85977	[SLP]Do not schedule instructions with constants/argument/phi operands and external users. No need to schedule entry nodes where all instructions are not memory read/write instructions and their operands are either constants, or arguments, or phis, or instructions from other blocks, or their users are phis or from the other blocks. The resulting vector instructions can be placed at the beginning of the basic block without scheduling (if operands do not need to be scheduled) or at the end of the block (if users are outside of the block). It may save some compile time and scheduling resources. Differential Revision: https://reviews.llvm.org/D121121	2022-03-17 11:03:45 -07:00
Florian Hahn	151c144350	[LV] Use usesScalars in widenPHIInstruction. This uses the existing VPlan helpers to check whether there are scalar uses of a phi recipe. It removes one of the few remaining dependencies on the cost model from VPlan code generation. Depends on D121612. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121613	2022-03-17 13:16:32 +00:00
Florian Hahn	a6e70e4056	[VPlan] VPInterleaveRecipe only requires the first lane of the address. VPInterleaveRecipe only uses the first lane of the address. Add onlyFirstLaneUsed implementation. This is needed for a follow-up patch. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121612	2022-03-17 11:56:43 +00:00
Nikita Popov	1dbeb64493	[SLP] Avoid unnecessary getIncomingValueForBlock() call (NFC) This code just wants to check all incoming values; we don't care what the incoming block is here.	2022-03-17 12:23:46 +01:00
Alexey Bataev	150ea76543	Revert "[SLP]Do not schedule instructions with constants/argument/phi operands and external users." This reverts commit 1eeb2bfe727323332800e8d390f2f8c63c953779 to fix a bug reported in https://reviews.llvm.org/D121121	2022-03-16 13:54:59 -07:00
Malhar Jajoo	a36d269658	[VPlan] Avoid collecting scalars for SVE This patch ensures scalars (except for uniforms) are no longer collected (prior to LVP planning phase) for scalable vectorization. This is to avoid the chances of generating scalarized instructions later (during LVP execute phase) as they are not supported for scalable vectorization. Relevant test has also been added. Differential Revision: https://reviews.llvm.org/D121452	2022-03-16 16:33:34 +00:00
Alexey Bataev	1eeb2bfe72	[SLP]Do not schedule instructions with constants/argument/phi operands and external users. No need to schedule entry nodes where all instructions are not memory read/write instructions and their operands are either constants, or arguments, or phis, or instructions from other blocks, or their users are phis or from the other blocks. The resulting vector instructions can be placed at the beginning of the basic block without scheduling (if operands do not need to be scheduled) or at the end of the block (if users are outside of the block). It may save some compile time and scheduling resources. Differential Revision: https://reviews.llvm.org/D121121	2022-03-16 06:05:43 -07:00
Philip Reames	1cfa986d68	[SLP] Optionally preserve MemorySSA This initial patch adds code to preserve MemorySSA through a run of SLP vectorizer. The eventual plan is to use MemorySSA to accelerate SLP's memory dependence checking, but we're a ways from that. In particular, this patch is correct, but really slow. It's being landed so that we can work incrementally in tree, not because it's expected to be useful to anyone just yet. The broader effort is being tracked in https://github.com/llvm/llvm-project/issues/54256. It's worth noting explicitly that this may not work out, and if not, we will be reverting all of the MSSA support in SLP at some point in the next few weeks. Differential Revision: https://reviews.llvm.org/D117926	2022-03-15 16:36:15 -07:00
Florian Hahn	ca1b2fc9fb	[LV] Remove LoopVectorBody from InnerLoopVectorizer. (NFCI) Update places still referencing LoopVectorBody to use the vector loop to get the vector loop header. This is needed to move vector loop code-generation to VPlan completely, which in turn is needed to model pre-header & exit blocks in VPlan as well.	2022-03-15 08:22:31 +00:00
Florian Hahn	4a0481e981	[LV] Check for users of truncated IVs, add more detailed comment. Add missing outside user check for truncated IVs. Also hoist the code in the helper with additional explanations. Fixes #54370.	2022-03-14 19:39:30 +00:00
Nikita Popov	8361c5da30	[SLPVectorizer] Handle external load/store pointer uses with opaque pointers In this case we may not generate a bitcast, so the new load/store becomes the external user.	2022-03-14 16:55:09 +01:00
Florian Hahn	d621ae30e2	[LV] Remove dead Loop argument from emitMinimumVector... (NFC) The argument is not used, remove it.	2022-03-14 15:47:40 +00:00
Florian Hahn	3ee2d908a9	[LV] Remove dead Loop argument from emitSCEVChecks. (NFC) The argument is not used, remove it.	2022-03-14 13:00:03 +00:00
Florian Hahn	8896c36624	[LV] Do not set insert point in completeLoopSkeleton. (NFCI) The insertion point for the builder used during VPlan code generation is set during code generation. Setting the insert point here is dead code and can be removed.	2022-03-14 12:21:26 +00:00
Florian Hahn	1c0fc1f074	[VPlan] Ensure each iv user is only visited once in transform. If a recipe has multiple uses of an IV, we crash. It causes a crash when building llvm-test-suite. Exposed by 95f76bff1c40bc1c2f.	2022-03-13 21:42:17 +00:00
Florian Hahn	95f76bff1c	[LV] Create & use VPScalarIVSteps for all scalar users. This patch is a follow-up to D115953. It updates optimizeInductions to also introduce new VPScalarIVStepsRecipes if an IV has both vector and scalar uses. It updates all uses that only need scalar values to use the newly created recipe for the scalar steps. This completes untangling of VPWidenIntOrFpInductionRecipe code-generation. Now the recipe only creates the widened vector values, as it says on the tin. The code to generate IR has been moved directly to VPWidenIntOrFpInductionRecipe::execute. Note that the recipe has been updated to hold a reference to ScalarEvolution, which is needed to expand the step, until we can place the corresponding SCEV expansion in the pre-header. Depends on D120827. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D120828	2022-03-13 17:15:24 +00:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Florian Hahn	d3e1094473	[VPlan] Implement VPCanonicalIVPHIRecipe::onlyFirstLaneUsed. The recipe only uses the first lane of its operands. Suggested & split off D120827.	2022-03-11 18:07:26 +00:00
Florian Hahn	ecea477df3	[VPlan] Helper to check if a recipe uses scalar values of op. This patch adds a helper to check if a recipe only uses scalars of a given operand. This is similar to onlyFirstLaneUsed, which was introduced earlier. By default, usesScalars falls back on onlyFirstLaneUsed. Will be used by D120828. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D120827	2022-03-11 13:41:08 +00:00
Florian Hahn	a12403cfea	[LV] Do not consider instrs dead if used by phi that's not in plan. Single value phis won't be modeled in VPlan. If the phi only gets used outside the loop, the current code misses the fact that the incoming value is not dead. Update the code to also look through such phis to check for outside users. Fixes #54266	2022-03-09 16:04:44 +00:00
Philip Reames	a2e9c68fcd	[SLP] Extract a helper for buildvector [nfc]	2022-03-07 19:11:40 -08:00
Philip Reames	8ab3befa3f	[SLP] Fix spelling in a lambda name [NFC]	2022-03-07 18:52:57 -08:00


3535 Commits