llvm-project

Author	SHA1	Message	Date
Florian Hahn	654bb4e9f2	[LV] Don't consider branches leaving loop in collectValuesToIgnore. Branches exiting the loop will remain regardless, so don't consider them in collectValuesToIgnore. This fixes another divergence between legacy and VPlan-based cost model. Fixes https://github.com/llvm/llvm-project/issues/106780.	2024-09-01 20:35:36 +01:00
Florian Hahn	f0e34f3818	[VPlan] Don't skip optimizable truncs in planContainsAdditionalSimps. An optimizable cast can also be removed by VPlan simplifications. Remove the restriction from planContainsAdditionalSimplifications, as this causes it to miss relevant simplifications, triggering false positives for the cost decision verification. Also adds debug output for printing additional cost-precomputations. Fixes https://github.com/llvm/llvm-project/issues/106641.	2024-08-30 11:29:30 +01:00
Florian Hahn	c4906588ce	[VPlan] Use skipCostComputation when pre-computing induction costs. This ensures we skip any instructions identified to be ignored by the legacy cost model as well. Fixes a divergence between legacy and VPlan-based cost model. Fixes https://github.com/llvm/llvm-project/issues/106417.	2024-08-29 21:20:00 +01:00
Maciej Gabka	95d2d1cba0	Move stepvector intrinsic out of experimental namespace (#98043 ) This patch moves the stepvector intrinsic out of the experimental namespace. The intrinsic has existed in LLVM for several years and is widely used.	2024-08-28 12:48:20 +01:00
Mel Chen	dfde1a7232	[LV][NFC] Update and clean up the test case LoopVectorize/RISCV/inloop-reduction.ll. (#102907 )	2024-08-28 17:46:58 +08:00
Florian Hahn	cb4efe1d07	[VPlan] Don't trigger VF assertion if VPlan has extra simplifications. There are cases where VPlans contain some simplifications that are very hard to accurately account for up-front in the legacy cost model. Those cases are caused by un-simplified inputs, which trigger the assert ensuring both the legacy and VPlan-based cost model agree on the VF. To avoid false positives due to missed simplifications in general, only trigger the assert if the chosen VPlan doesn't contain any additional simplifications. Fixes https://github.com/llvm/llvm-project/issues/104714. Fixes https://github.com/llvm/llvm-project/issues/105713.	2024-08-22 21:38:06 +01:00
Florian Hahn	4e04286d61	[VPlan] Only use selectVectorizationFactor for cross-check (NFCI). (#103033 ) Use getBestVF to select VF up-front and only use selectVectorizationFactor to get the legacy VF to check the vectorization decision matches the VPlan-based cost model. PR: https://github.com/llvm/llvm-project/pull/103033	2024-08-21 13:09:01 +02:00
Florian Hahn	99741ac285	[VPlan] Introduce explicit ExtractFromEnd recipes for live-outs. (#100658 ) Introduce explicit ExtractFromEnd recipes to extract the final values for live-outs instead of implicitly extracting in VPLiveOut::fixPhi. This is a follow-up to the recent changes of modeling extracts for recurrences and consolidates live-out extract creation for fixed-order recurrences at a single place: addLiveOutsForFirstOrderRecurrences. It is also in preparation of replacing VPLiveOut with VPIRInstructions wrapping the original scalar phis. PR: https://github.com/llvm/llvm-project/pull/100658	2024-08-21 10:06:44 +02:00
Florian Hahn	e9e3a183d6	[LV] Don't cost branches and conditions to empty blocks. Update the legacy cost model to skip branches with successor blocks that are empty or only contain dead instructions, together with their conditions. Such branches and conditions won't result in any generated code and will be cleaned up by VPlan transforms. This fixes a difference between the legacy and VPlan-based cost model. When running LV in its usual pipeline position, such dead blocks should already have been cleaned up, but they might be generated manually or by fuzzers. Fixes https://github.com/llvm/llvm-project/issues/100591.	2024-08-18 12:51:17 +01:00
Florian Hahn	c7a44ec031	[VPlan] Check successors in VPlan to check if scalar epi required (NFC) Now that the branches to the scalar epilogue are modeled in VPlan directly, check the VPlan to see if a scalar epilogue is required. Preparation for https://github.com/llvm/llvm-project/pull/100658.	2024-08-12 15:33:52 +01:00
Craig Topper	59728193a6	[RISCV] Disable fixed length vectors with Zve32* without Zvl64b. (#102405 ) Fixed length vectors use scalable vector containers. With Zve32* and not Zvl64b, vscale is 0.5 due to RVVBitsPerBlock being 64. To support this correctly we need to lower RVVBitsPerBlock to 32 and change our type mapping. But we need RVVBitsPerBlock to always be >= ELEN. This means we need two different mappings depending on ELEN. That is a non-trivial amount of work so disable fixed length vectors without Zvl64b for now. We had almost no tests for Zve32x without Zvl64b which is probably why we never realized that it was broken. Fixes #102352.	2024-08-08 09:17:43 -07:00
Philip Reames	d3fd28a134	[RISCV][TTI] Properly model odd vector sized LD/ST operations (#100436 ) The motivation for this change is the costing of a LD or ST with nearly power of 2 vectors (e.g. <3 x i32> or <7 x i32>) on V. There's an experimental option in SLP to allow emitting these if the cost model says they're profitable. This really helps with e.g. RGB vectors. Our actual lowering for these depends on whether a wider container type is known available. If so, we use a vle or vse on the wider type with a restricted VL. If not, we split until a legal type is found, and then apply the vle/vse on the sub-pieces. This change is intentionally restricted to only the case where promotion (widening w/VL predication) is involved. We appear to have at least one bug in our splitting lowering (see discussion on review), and to avoid exposing this more widely, I chose to not adjust costs for the splitting case. The current splitting costing assumes scalarization (which is not true of the actual lowering), but that has the effect of biasing vectorization away from such cases strongly. For the widening case, the true cost scales with the next largest legal type. The default implementation assumes that such a type is scalarized. Changing that brings our cost in line with our actual lowering decision. Note that since scalarization is not possible for scalable types, the prior costing falsely returned Invalid for that case.	2024-07-26 12:52:20 -07:00
Florian Hahn	67a55e01e3	[VPlan] Replace getBestPlan by getBestVF use also for epilogue vec. (#98821 ) Replace getBestPlan by getBestVF which simply finds the best VF out of the VFs for the available VPlans. Then use getBestPlan to retrieve the corresponding VPlan. This allows using getBestVF & getBestPlan for epilogue vectorization as well. As the same plan may be used to vectorize both the main and epilogue loop, restricting the VF of the best plan would cause issues. PR: https://github.com/llvm/llvm-project/pull/98821	2024-07-26 14:06:46 +01:00
Alexey Bataev	7432ad6af5	[LV][VP][NFC]Add tests for safe store/load forwarding/dependence distance. Reviewers: fhahn Reviewed By: fhahn Pull Request: https://github.com/llvm/llvm-project/pull/100635	2024-07-25 19:55:37 -04:00
Philip Reames	ea202f9f2e	[LV,RISCV] Regenerate a test to reduce spurious deltas in upcoming change	2024-07-25 12:22:59 -07:00
Florian Hahn	b72689a5cb	[LV] Ignore live-out users in cost model if scalar epilogue is required. Follow-up to ba8126b6fef79. If a scalar epilogue is required, users outside the loop won't use live-outs from the vector loop but from the scalar epilogue. Ignore them if that is the case. This fixes another case where the VPlan-based cost-model more accurately computes cost. Fixes https://github.com/llvm/llvm-project/issues/100464.	2024-07-25 11:16:18 +01:00
Florian Hahn	ba8126b6fe	[LV] Mark dead instructions in loop as free. Update collectValuesToIgnore to also ignore dead instructions in the loop. Such instructions will be removed by VPlan-based DCE and won't be considered by the VPlan-based cost model. This closes a gap between the legacy and VPlan-based cost model. In practice with the default pipelines, there shouldn't be any dead instructions in loops reaching LoopVectorize, but it is easy to generate such cases by hand or automatically via fuzzers. Fixes https://github.com/llvm/llvm-project/issues/99701.	2024-07-24 09:31:32 +01:00
Luke Lau	58854facb3	[RISCV] Don't cost vector arithmetic fp ops as cheaper than scalar (#99594 ) I was comparing some SPEC CPU 2017 benchmarks across rva22u64 and rva22u64_v, and noticed that in a few cases rva22u64_v was considerably slower. One of them was 519.lbm_r, which has a large loop that was being unprofitably vectorized. It has an if/else in the loop which requires large amounts of predication when vectorized, but despite the loop vectorizer taking this into account the vector cost came out as cheaper than the scalar. It looks like the reason for this is that we cost scalar floating point ops as 2, but their vector equivalents as 1 (for LMUL 1). This comes from how we use BasicTTIImpl for scalars which treats floats as twice as expensive as integers. This patch doubles the cost of vector floating point arithmetic ops so that they're at least as expensive as their scalar counterparts, which gives a 13% speedup on 519.lbm_r at -O3 on the spacemit-x60. Fixes #62576 (the last point there about scalar fsub/fmul)	2024-07-22 13:56:10 +08:00
Mel Chen	4eb30cfb34	[LV][EVL] Support in-loop reduction using tail folding with EVL. (#90184 ) Following from #87816, add VPReductionEVLRecipe to describe vector predication reduction. Address one of TODOs from #76172.	2024-07-16 16:15:24 +08:00
Mel Chen	a00754bb2a	[LV] Fix the cost of min/max reductions. (#98453 ) This patch updates the function `getReductionPatternCost` to handle the cost of min/max reductions by `TTI.getMinMaxReductionCost`.	2024-07-12 13:47:33 +08:00
Florian Hahn	9a5a8731e7	[VPlan] Introduce ResumePhi VPInstruction, use to create phi for FOR. (#94760 ) This patch introduces a new ResumePhi VPInstruction which creates a phi in a leaf block of a VPlan. The first use is to create the phi node for fixed-order recurrence resume values in the scalar preheader. The VPInstruction takes 2 operands: 1) the incoming value from the middle-block and a default value to be used for all other incoming blocks. In follow-up changes, it will also be used to create phis for reduction and induction resume values. Depends on https://github.com/llvm/llvm-project/pull/92651 PR: https://github.com/llvm/llvm-project/pull/94760	2024-07-11 16:08:04 +01:00
Florian Hahn	67f4968a57	[LV] Skip cost for ZExt/SExts that will be removed by truncating ops. If an extend is truncated, it will be removed if the result type is <= the source type, as there is nothing to extend. Return a cost of 0. This was caught by the first step to perform cost-modeling based on VPlan (b841e2e), as the legacy cost model would query the cost of an invalid extend, while the extend has been folded away by VPlan transforms. Fixes https://github.com/llvm/llvm-project/issues/98413.	2024-07-11 11:40:14 +01:00
Florian Hahn	b841e2eca3	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92. A number of crashes have been fixed by separate fixes, including https://github.com/llvm/llvm-project/pull/96622. This version of the PR also pre-computes the costs for branches (except the latch) instead of computing their costs as part of costing of replicate regions, as there may not be a direct correspondence between original branches and number of replicate regions. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-07-10 14:22:21 +01:00
Florian Hahn	0577cdaa32	[LV] Split checking if tail-folding is possible, collecting masked ops. (#77612 ) Introduce new canFoldTail helper which only checks if tail-folding is possible, but without modifying MaskedOps. Just because tail-folding is possible doesn't mean the tail will be folded; that's up to the cost-model to decide. Separating the check if tail-folding is possible and preparing for tail-folding makes sure that MaskedOps is only populated when tail-folding is actually selected. PR: https://github.com/llvm/llvm-project/pull/77612	2024-07-08 16:34:42 +01:00
Florian Hahn	99d6c6d936	[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651 ) This patch moves branch condition creation to enter the scalar epilogue loop to VPlan. Modeling the branch in the middle block also requires modeling the successor blocks. This is done using the recently introduced VPIRBasicBlock. Note that the middle.block is still created as part of the skeleton and then patched in during VPlan execution. Unfortunately the skeleton needs to create the middle.block early on, as it is also used for induction resume value creation and is also needed to properly update the dominator tree during skeleton creation. After this patch lands, I plan to move induction resume value and phi node creation in the scalar preheader to VPlan. Once that is done, we should be able to create the middle.block in VPlan directly. This is a re-worked version based on the earlier https://reviews.llvm.org/D150398 and the main change is the use of VPIRBasicBlock. Depends on https://github.com/llvm/llvm-project/pull/92525 PR: https://github.com/llvm/llvm-project/pull/92651	2024-07-05 10:08:42 +01:00
Florian Hahn	2b3b405b09	[LV] Don't vectorize first-order recurrence with VF <vscale x 1 x ..> The assertion added as part of https://github.com/llvm/llvm-project/pull/93395 surfaced cases where first-order recurrences are vectorized with <vscale x 1 x ..>. If vscale is 1, then we are unable to extract the penultimate value (second to last lane). Previously this case got mis-compiled, trying to extract from an invalid lane (-1) https://llvm.godbolt.org/z/3adzYYcf9. Fixes https://github.com/llvm/llvm-project/issues/97452.	2024-07-04 11:44:51 +01:00
Kolya Panchenko	49e5cd2acc	[LV][NFC] Marked functions as const. Added LLVM_DEBUG. (#96681 )	2024-06-26 17:38:18 -04:00
Florian Hahn	8681bb8bed	[LV] Add additional test coverage for cost modeling. Add missing tests uncovered by https://github.com/llvm/llvm-project/pull/92555. Includes test for https://github.com/llvm/llvm-project/issues/96294 and https://github.com/llvm/llvm-project/issues/96328	2024-06-26 10:18:01 +01:00
Florian Hahn	abf5969f76	[VPlan] Don't compute costs if there are no vector VPlans. In some cases, no vector VPlans can be constructed due to failing VPlan legality checks (e.g. unable to perform sinking for first order recurrences or plans being incompatible with EVL). There's no need to compute costs in those cases, so check directly if there are no vector plans.	2024-06-24 08:38:31 +01:00
Florian Hahn	f0c674f680	[LV] Add test showing cost is computed when there are no vector plans. Add test showing unnecessary cost computations, as no vector VPlans are generated.	2024-06-24 08:08:56 +01:00
Florian Hahn	f1f3c34b47	Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )"" This reverts commit 242cc200ccb24e22eaf54aed7b0b0c84cfc54c0b and eea150c84053035163f307b46549a2997a343ce9, as it is causing a build bot failure and there have been a number of crashes reported at https://github.com/llvm/llvm-project/pull/92555	2024-06-21 19:54:21 +01:00
Florian Hahn	242cc200cc	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92. Extra tests for crashes discovered when building Chromium have been added in fb86cb7ec157689e, 3be7312f81ad2. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-20 17:32:52 +01:00
Florian Hahn	3808ba78de	[VPlan] Model middle block via VPIRBasicBlock. (#95816 ) Use VPIRBasicBlock to wrap the middle block and implement patching up branches in predecessors in VPIRBasicBlock::execute. The IR middle block is only created after skeleton creation. Initially a regular VPBasicBlock is created, which will later be replaced by a VPIRBasicBlock once the middle IR basic block has been created. Note that this slightly changes the order of instructions created in the middle block; code generated by recipe execution in the middle block will now be inserted before the terminator (and in between the compare used by the terminator). The original order will be restored in https://github.com/llvm/llvm-project/pull/92651. PR: https://github.com/llvm/llvm-project/pull/95816	2024-06-20 13:42:20 +01:00
Arthur Eubanks	6f538f6a2d	Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )"" This reverts commit 90fd99c0795711e1cf762a02b29b0a702f86a264. This reverts commit 43e6f46936e177e47de6627a74b047ba27561b44. Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.	2024-06-14 17:47:08 +00:00
Florian Hahn	90fd99c079	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 46080abe9b136821eda2a1a27d8a13ceac349f8c. Extra tests have been added in 52d29eb287. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-14 12:33:48 +01:00
Jay Foad	d4a0154902	[llvm-project] Fix typo "seperate" (#95373 )	2024-06-13 20:20:27 +01:00
Arthur Eubanks	46080abe9b	Revert "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 00798354c553d48d27006a2b06a904bd6013e31b. Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.	2024-06-13 16:37:21 +00:00
Florian Hahn	00798354c5	[VPlan] First step towards VPlan cost modeling. (#92555 ) This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-13 14:26:18 +01:00
Florian Hahn	c46a6e6c92	[LV] Remove unnecessary getRuntimeVF call when computing vector TC. As Step is VF * UF, there is no need to compute it again, which may require multiple instructions for scalable VFs.	2024-06-12 14:35:37 +01:00
Florian Hahn	f38d84ce32	[VPlan] Use ir-bb prefix for VPIRBasicBlock. Follow-up to adjust the names and tests after https://github.com/llvm/llvm-project/pull/93398.	2024-05-30 17:43:40 -07:00
Ramkumar Ramachandra	43100766f2	LV: generalize profitability criterion over TC (#93300 ) Generalize LoopVectorizationPlanner::isMoreProfitable smoothly across the fixed-vector and scalable-vector cases, taking the trip-count into account, and fixing logical pitfalls that arise from a lack of generality.	2024-05-30 10:54:32 +01:00
Shih-Po Hung	0338c55ea5	[LV, VPlan] Check if plan is compatible to EVL transform (#92092 ) The transform updates all users of inductions to work based on EVL, instead of the VF directly. At the moment, widened inductions cannot be updated, so bail out if the plan contains any. This patch introduces a check before applying EVL transform. If any recipes in loop rely on RuntimeVF, the plan is discarded.	2024-05-25 08:22:49 +08:00
Ramkumar Ramachandra	bb0d29a72d	[LV] fix logical error in trunc cost (#91136 ) In LoopVectorizationCostModel::getInstructionCost(), when the condition canTruncateToMinimalBitwidth() is satisfied, for a trunc, the source type is computed as the smallest type of the source vector and the destination vector, and the destination type is computed as the largest type of the instruction and destination type. This is clearly a logical error, as the original source vector type could be smaller than the original destination vector type, and the trunc semantics are broken because we're attempting to widen. Fixes #47665.	2024-05-24 18:01:58 +01:00
Shih-Po Hung	b008a2d12a	[LV][NFC] precommit test for EVL transform (#92203 ) A precommit test case to show vector loops generated from EVL transform - This is a precommit test for https://github.com/llvm/llvm-project/pull/92092	2024-05-24 23:21:59 +08:00
Ramkumar Ramachandra	dc148c9fb8	[LV] add test for #47665 , #88802 (#91135 )	2024-05-24 10:50:43 +01:00
Florian Hahn	b050048d35	[VPlan] Simplify (X && Y) || (X && !Y) -> X. (#89386 ) Simplify a common pattern generated for masks when folding the tail. PR: https://github.com/llvm/llvm-project/pull/89386	2024-05-19 15:45:23 +00:00
Craig Topper	487b43cdc9	[RISCV] Pass subvector type to isLegalInterleavedAccessType in getInterleavedMemoryOpCost. (#91825 ) isLegalInterleavedAccessType expects the subvector type, but getInterleavedMemoryOpCost is called with the full vector type. So we need to divide by Factor.	2024-05-15 21:47:29 -07:00
Florian Hahn	d187005cad	[VPlan] Update VPBlendRecipe codegen for first-lane only. Update VPBlendRecipe::execute to support generating code for first-lane only. This fixes a crash in the newly added test @test_not_first_lane_only_wide_compare_incoming_order_swapped.	2024-05-15 11:00:15 +01:00
Mel Chen	3f1fef3699	[RISCV] Support interleaved accesses for scalable vector. (#90583 ) The support for interleaved accesses for scalable vector with a factor of 2 is enabled in vectorizer. Therefore, the patch removed the restriction for scalable vector with a factor of 2.	2024-05-03 21:56:31 +08:00
Florian Hahn	bccb7ed8ac	Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )" This reverts the revert commit c6e01627acf859. This patch includes a fix for any-of reductions and epilogue vectorization. Extra test coverage for the issue that caused the revert has been added in bce3bfced5fe0b019 and an assertion has been added in c7209cbb8be7a3c65813. -------------------------------- Original commit message: Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-05-03 14:40:49 +01:00

214 Commits