llvm-project

Author	SHA1	Message	Date
Craig Topper	6006d43e2d	LLVM_FALLTHROUGH => [[fallthrough]]. NFC Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D150996	2023-05-24 12:40:10 -07:00
Florian Hahn	299f0ff60e	[VPlan] Print IR flags for VPRecipeWithIRFlags. Now that IR flags are modeled as part of VPRecipeWithIRFlags, include the flags when printing recipes. Depends on D150027. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150029	2023-05-23 20:36:16 +01:00
Alexey Bataev	ae5ff3ca0c	[SLP]Fix PR62665: compiler crash when trying to access non-existing mask element. Need to check at first if the SubMask element is PoisonMaskElem to avoid compiler crash.	2023-05-22 13:43:25 -07:00
Luke Lau	c27a0b21c5	[SLP][RISCV] Account for offset folding in getPointersChainCost For a GEP in a pointer chain, if: 1) a pointer chain is unit-strided 2) the base pointer wasn't folded and is sitting in a register somewhere 3) the distance between the GEP and the base pointer is small enough and can be folded into the addressing mode of the using load/store Then we can exclude that GEP from the total cost of the pointer chain, as it will likely be folded away. In order to check if 3) holds, we need to know the type of memory access being made by the users of the pointer chain. For that, we need to pass along a new argument to getPointersChainCost. (Using the source pointer type of the GEP isn't accurate, see https://reviews.llvm.org/D149889 for more details). Also note that 2) is currently an assumption, and could be modelled more accurately. This prevents some unprofitable cases from being SLP vectorized on RISC-V by making the scalar costs cheaper and closer to the actual codegen. For now the getPointersChainCost hook is duplicated for RISC-V to prevent disturbing other targets, but could be merged back in and shared with other targets in a following patch. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D149654	2023-05-22 13:55:30 +01:00
Florian Hahn	8eaf7a75fe	[VPlan] Add missing ifdef after 96686796f606. Fixes build with debug printing disabled.	2023-05-22 10:44:17 +01:00
Florian Hahn	96686796f6	[VPlan] Move live-out printing to VPLiveOut::print (NFC). Preparation for D150398. This brings live-out printing in line with how printing for recipes is handled.	2023-05-22 09:53:53 +01:00
Vasileios Porpodas	806dea46be	[SLP] Cleanup: Remove `tryToVectorizePair()`, most probably NFC `tryToVectorizePair()` adds a level of indirection over `tryToVectorizeList()`. I am not really sure why it is needed, it looks redundant. I replaced all calls to `tryToVectorizePair()` with calls to `tryToVectorizeList()` and I am not seeing any failures. Differential Revision: https://reviews.llvm.org/D151004	2023-05-19 20:25:20 -07:00
Vasileios Porpodas	338fc76200	[SLP][NFC] Cleanup: Remove KeyNodes set. I don't see a good reason for having the `KeyNodes` set. This patch removes the set. Differential Revision: https://reviews.llvm.org/D150918	2023-05-19 10:30:02 -07:00
Florian Hahn	55903151a2	[VPlan] Use isUniformAfterVec in VPReplicateRecipe::execute. I was unable to find a case where this actually changes generated code, but it enables the bug fix in D144434. It also brings codegen in line with the handling of stores to uniform addresses in the cost model (D134460). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D144491	2023-05-19 18:15:21 +01:00
Alexey Bataev	9a7248f561	[SLP]Fix crash for scalarized vectors. Need to remove insertion of the nodes to the InVector in case of scalarized vectors too to avoid compiler crashes.	2023-05-17 06:32:22 -07:00
Luke Lau	d088b8af93	[SLP] Rename IsUniformStride to IsUnitStride. NFCI IsUniformStride is only used when the stride is a unit-stride, i.e. in a plain wide vector load. This tightens the condition and renames it to isUnitStride. It removes the old unused getUniformStrided() variant, as isUnitStride should now imply that the stride is known. Reviewed By: vdmitrie, ABataev Differential Revision: https://reviews.llvm.org/D150662	2023-05-17 13:21:33 +01:00
Alexey Bataev	6c7acc6409	[SLP][NFC]Add missing finalize params in the CostEstimator, NFC. Prepare functions for generalization of codegen/cost estimation. Differential Revision: https://reviews.llvm.org/D150121	2023-05-15 11:17:37 -07:00
Vasileios Porpodas	ddb2188afc	[SLP][NFC] Cleanup: Separate vectorization of Inserts and CmpInsts. This deprecates `vectorizeSimpleInstructions()` and replaces it with separate functions that vectorize CmpInsts and Inserts. Differential Revision: https://reviews.llvm.org/D149993	2023-05-15 10:12:34 -07:00
Florian Hahn	701f7230cd	[VPlan] Use VPRecipeWithIRFlags for VPReplicateRecipe, retire poison map Update VPReplicateRecipe to use VPRecipeWithIRFlags for IR flag handling. Retire separate MayGeneratePoisonRecipes map. Depends on D149082. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150027	2023-05-15 11:49:20 +01:00
Florian Hahn	f40a7901d1	[LV] Move selecting vectorization factor logic to LVP (NFC). Split off from D143938. This moves the planning logic to select the vectorization factor to LoopVectorizationPlanner as a step towards only computing costs for individual VFs in LoopVectorizationCostModel and do planning in LVP. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150197	2023-05-13 12:28:14 +01:00
Florian Hahn	7472f1da96	[VPlan] Change LoopVectorizationPlanner::TTI to be const reference (NFC)	2023-05-13 12:27:57 +01:00
Florian Hahn	0418d0242b	[LV] Move getVScaleForTuning out of LoopVectorizationCostModel (NFC). Split off refactoring from D150197 to reduce diff.	2023-05-13 10:17:13 +01:00
Philip Reames	592199c8fe	[LV] Use interface routines instead of internal variables This makes a (possible) change to the internal representation easier in the future, and makes the code easier to read now.	2023-05-12 16:27:12 -07:00
Florian Hahn	bf279a0f8e	[VPlan] Remove dangling comment and newlines (NFC). Apply missed cleanups.	2023-05-11 22:06:56 +01:00
Florian Hahn	3d4eed0133	[LV] Reuse SCEV expansion results for epilogue vectorization. When generating code for the epilogue vector loop, we need to re-use the expansion results for induction steps generated for the main vector loop, as the pre-header of the epilogue vector loop may not dominate the vector preheader of the epilogue. This fixes a reported crash. Note that this is a workaround which should be removed soon once induction resume value creation is handled in VPlan directly.	2023-05-11 22:00:07 +01:00
Philip Reames	7fbfcc653f	[LV/LAA] Use PSE to identify stride multiplies which simplify [mostly nfc] LV/LAA will speculate that (some) strided access patterns have unit stride, and insert runtime checks if required. LV cost models a multiply by such a stride as free. We did this by keeping around the StrideSet structure, just to check if one of the operands were one of the strides we speculated. We can instead just ask PredicatedScalarEvolution if either of the operands are one (after predicates are applied). We get mostly the same result - PSE can prove it in more cases in theory - and simpler code.	2023-05-11 11:16:04 -07:00
Philip Reames	e41dce4d49	[LAA/LV] Simplify stride speculation logic [NFC] (try 2) The original commit wasn't quite NFC, and this was caught by an arguably overly strong assert. Specifically, I'd failed to strip off the integer cast off the SCEV before saving it in the map. The result - other than a failed assert - is that we'd speculate on the casted unknown, not the unknown. The only case I can think of where that might change behavior would be a sext(i1 load). I doubt that case is interesting in practice, but it's good to be strictly NFC on this change regardless. Original commit message follows.. The existing code makes it hard to tell that collectStridedAccess is really about identifying some loop invariant SCEV which is profitable to speculate is equal to one. The odd dual usage structure of Value and SCEV confuses this point. We could choose to loosen the profitability analysis if desired. I'm not proposing doing so at this time as it exposes too many cases where the speculation is unprofitable. Differential Revision: https://reviews.llvm.org/D147750	2023-05-11 10:19:23 -07:00
Philip Reames	dc0d00c5fc	Revert "[LAA/LV] Simplify stride speculation logic [NFC]" This reverts commit d5b840131223f2ffef4e48ca769ad1eb7bb1869a. Running this through broader testing after rebasing is revealing a crash. Reverting while I investigate.	2023-05-11 09:26:35 -07:00
Florian Hahn	236a0e82df	[LV] Use VPValue to get expanded value for SCEV step expressions. Update skeleton creation logic to use SCEV expansion results from expanding the pre-header. This avoids another set of SCEV expansions that may happen after the CFG has been modified. Fixes #58811. Depends on D147964. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147965	2023-05-11 16:49:19 +01:00
Philip Reames	d5b8401312	[LAA/LV] Simplify stride speculation logic [NFC] The existing code makes it hard to tell that collectStridedAccess is really about identifying some loop invariant SCEV which is profitable to speculate is equal to one. The odd dual usage structure of Value and SCEV confuses this point. We could choose to loosen the profitability analysis if desired. I'm not proposing doing so at this time as it exposes too many cases where the speculation is unprofitable. Differential Revision: https://reviews.llvm.org/D147750	2023-05-11 08:32:56 -07:00
Hongtao Yu	9272d0f079	[PseudoProbe] Clean up dwarf discriminator and avoid duplicating factor. A pseudo probe is created with dwarf line information shared with its nearest instruction. If the instruction comes with a dwarf discriminator, it will be shared with the probe as well. This can confuse the later FS-AFDO discriminator assignment pass. To fix this, I'm cleaning up the discriminator fields for probes when they are inserted. I also notice another possibility to change the discriminator field of pseudo probes in the pipeline before the FS discriminator assignment pass. That is the loop unroller, which assigns duplication factor to instruction being vectorized. I'm disabling that for pseudo probe intrinsics specifically, also for callsites with probes. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D148569	2023-05-10 11:26:23 -07:00
Vasileios Porpodas	dda2a5d457	[SLP][NFC] Rename a couple of variables and replace an if-else with an std::min - Rename `LimitForRegisterSize` to `MaxVFOnly` to make the meaning of the limit less ambiguous - Rename `OpsWidth` to `ActualVF`, which makes it clear that this is the VF we are using for vectorization. - Replace the if-else code for the initialization of OpsWidth with an std::min. Differential Revision: https://reviews.llvm.org/D150241	2023-05-10 09:37:58 -07:00
Florian Hahn	c096e91735	[VPlan] Address missed suggestions from D149082. This addresses 2 comments missed from D149082. It sets inbounds directly when creating the GEP and fixes the order in the enum.	2023-05-09 15:17:20 +01:00
Florian Hahn	5f3343985b	[VPlan] Use VPRecipeWithIRFlags for VPWidenGEPRecipe (NFCI). Extend VPRecipeWithIRFlags to also include InBounds and use for VPWidenGEPRecipe. The last remaining recipe that needs updating for MayGeneratePoisonRecipes is VPReplicateRecipe. Depends on D149081. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149082	2023-05-09 12:33:28 +01:00
Florian Hahn	127b00b25c	[VPlan] Record IR flags on VPWidenRecipe directly (NFC). This patch introduces a VPRecipeWithIRFlags class to record various IR flags for a recipe. This allows de-coupling of IR flags from the underlying instructions. The main benefit is that it allows dropping of IR flags from recipes directly, without the need to go through State::MayGeneratePoisonRecipes. The plan is to remove MayGeneratePoisonRecipes once all relevant recipes are transitioned. It also allows dropping IR flags during VPlan-to-VPlan transforms, which will be used in a follow-up patch to implement truncateToMinimalBitwidths as VPlan-to-VPlan transform. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149079	2023-05-08 17:28:50 +01:00
Florian Hahn	823d35fd3b	[VPlan] Use RecipeBuilder to look up member when fixing IG (NFC). Recipes for interleave group members are recorded directly in the RecipeBuilder. Use it directly instead of going indirectly through VPlan's Value->VPValue mapping.	2023-05-07 18:02:27 +01:00
Florian Hahn	7b7be685d4	[VPlan] Use operands directly in VPInstructionsToVPRecipes (NFC). Now that def-use chains are modeled directly in VPlan, we can simply use the operands of the recipe we are replacing. There is no need to use the operands of the underlying instruction to look up a VPValue.	2023-05-06 12:36:00 +01:00
Florian Hahn	01fa764c9a	[VPlan] Assert instead of check if VF is vector when widening GEPs(NFC) VPWidenGEPRecipe should not be generated for scalar VFs. Replace check with an assert.	2023-05-06 09:25:56 +01:00
Kazu Hirata	2b60bd5141	[Vectorize] Use Densemap::contains (NFC)	2023-05-06 00:02:54 -07:00
Alexey Bataev	2672c6e4dc	[SLP][NFC]Add processBuildVector member function, NFC. Introduce processBuildVector as a next step to generalize code for cost estimation and code emission for gather/buildvector nodes. Differential Revision: https://reviews.llvm.org/D149973	2023-05-05 11:00:53 -07:00
Florian Hahn	8bd02e5aef	[VPlan] Assert instead checking if VF is vec when widening calls (NFC) VPWidenCallRecipe should not be generated for scalar VFs. Replace check with an assert.	2023-05-05 18:21:57 +01:00
Vasileios Porpodas	7749f6e976	[SLP][NFC] Cleanup: Outline the code that vectorizes CmpInsts into a separate function. Differential Revision: https://reviews.llvm.org/D149919	2023-05-05 09:56:41 -07:00
Alexey Bataev	ca3f4236e4	[SLP][NFC]Add/use gather and createFreeeze member functions in ShuffleInstructionBuilder, NFC.	2023-05-05 09:12:54 -07:00
Florian Hahn	e3afe0b89d	[VPlan] Add VPWidenCastRecipe, split off from VPWidenRecipe (NFCI). To generate cast instructions, the result type is needed. To allow creating widened casts without underlying instruction, introduce a new VPWidenCastRecipe that also holds the result type. This functionality will be used in a follow-up patch to implement truncateToMinimalBitwidths as VPlan-to-VPlan transform. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149081	2023-05-05 13:20:16 +01:00
Florian Hahn	29712ccda6	[VPlan] Assert instead of check if VF is vector when widening casts. VPWidenRecipes should not be generated for scalar VFs. Replace check with an assert. Suggested in preparation for D149081.	2023-05-05 09:02:33 +01:00
Alexey Bataev	d726f99d43	[SLP][NFC]Do not try to revectorize instructions with constant operands, NFC. The pass should not try to revectorize instructions with constant operands, which were not folded by the IRBuilder. It prevents the non-terminating loop in the SLP vectorizer for non foldable constant operations.	2023-05-04 13:52:42 -07:00
Florian Hahn	1b05e74982	[VPlan] Reorder cases in switch (NFC). Reorder cases to make sure they are ordered properly in preparation for D149081.	2023-05-04 21:40:22 +01:00
Florian Hahn	c2bef381fa	[VPlan] Remove setEntry to avoid leaks when replacing entry. Update the HCFG builder to directly connect the created CFG to the existing Plan's entry. This allows removing `setEntry`, which can cause leaks when the existing entry is replaced. Should fix https://lab.llvm.org/buildbot/#/builders/5/builds/33455/steps/13/logs/stdio	2023-05-04 19:12:02 +01:00
Alexey Bataev	c0e5e7db9a	[SLP]Fix a crash trying finding insert point for GEP nodes with non-gep insts. If the vectorizable GEP node is built, which should not be scheduled, and at least one node is a non-gep instruction, need to insert the vectorized instructions before the last instruction in the list, not before the first one, otherwise the instructions may be emitted in the wrong order.	2023-05-04 09:43:37 -07:00
Florian Hahn	147a56149c	[VPlan] Clean up preheader block after b85a402dd899fc. Fix a leak introduced in b85a402dd899fc and flagged by LSan https://lab.llvm.org/buildbot#builders/5/builds/33452	2023-05-04 16:29:57 +01:00
Florian Hahn	b85a402dd8	[VPlan] Introduce new entry block to VPlan for early SCEV expansion. This patch adds a new preheader block to the VPlan to place SCEV expansions like the trip count. This preheader block is disconnected at the moment, as the bypass blocks of the skeleton are not yet modeled in VPlan. The preheader block is executed before skeleton creation, so the SCEV expansion results can be used during skeleton creation. At the moment, the trip count expression and induction steps are expanded in the new preheader. The remainder of SCEV expansions will be moved gradually in the future. D147965 will update skeleton creation to use the steps expanded in the pre-header to fix #58811. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147964	2023-05-04 14:00:13 +01:00
Florian Hahn	79692750d2	[LV] Use VPValue for SCEV expansion in fixupIVUsers. The step is already expanded in the VPlan. Use this expansion instead. This is a step towards modeling fixing up IV users in VPlan. It also fixes a crash caused by SCEV-expanding the Step expression in fixupIVUsers, where the IR is in an incomplete state. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147963	2023-05-04 09:25:59 +01:00
Vasileios Porpodas	ce46e1aa76	[NFC][SLP] Cleanup: Simplify traversal loop in SLPVectorizerPass::vectorizeHorReduction(). This includes a couple of changes: 1. Moves the code that changes the root node out of the `TryToReduce` lambda and out of the traversal loop. 2. Since that code moved, there isn't much left in `TryToReduce` so the code was inlined. 3. The phi node variable `P` was also being used as a flag that turns on/off the exploration of operands as new seeds. This patch uses a new variable `TryOperandsAsNewSeeds` for this. 4. Simplifies the code executed when vectorization fails. The logic of the code should be identical to the original, but I may be missing something not caught by tests. Differential Revision: https://reviews.llvm.org/D149627	2023-05-03 12:48:01 -07:00
Alexey Bataev	d62734800c	[SLP][NFC]Add ShuffleCostBuilder and generalize BaseShuffleAnalysis::createShuffle function, NFC. Added basic implementation of ShuffleCostBuilder class in ShuffleCostEstimator and generalized BaseShuffleAnalysis::createShuffle function to support emission of Value */InstructionCost for the vectorization/cost estimation. Differential Revision: https://reviews.llvm.org/D149171	2023-05-03 12:30:54 -07:00
Florian Hahn	b9efffa7e9	[VPlan] Add assignSlot(const VPBasicBlock *) (NFC). Factor out utility to simplify D147964 as suggested.	2023-05-03 19:51:09 +01:00

3791 Commits