llvm-project

Author	SHA1	Message	Date
Florian Hahn	b841e2eca3	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92. A number of crashes have been fixed by separate fixes, including https://github.com/llvm/llvm-project/pull/96622. This version of the PR also pre-computes the costs for branches (except the latch) instead of computing their costs as part of costing of replicate regions, as there may not be a direct correspondence between original branches and number of replicate regions. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-07-10 14:22:21 +01:00
Florian Hahn	0577cdaa32	[LV] Split checking if tail-folding is possible, collecting masked ops. (#77612 ) Introduce new canFoldTail helper which only checks if tail-folding is possible, but without modifying MaskedOps. Just because tail-folding is possible doesn't mean the tail will be folded; that's up to the cost-model to decide. Separating the check if tail-folding is possible and preparing for tail-folding makes sure that MaskedOps is only populated when tail-folding is actually selected. PR: https://github.com/llvm/llvm-project/pull/77612	2024-07-08 16:34:42 +01:00
Florian Hahn	99d6c6d936	[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651 ) This patch moves branch condition creation to enter the scalar epilogue loop to VPlan. Modeling the branch in the middle block also requires modeling the successor blocks. This is done using the recently introduced VPIRBasicBlock. Note that the middle.block is still created as part of the skeleton and then patched in during VPlan execution. Unfortunately the skeleton needs to create the middle.block early on, as it is also used for induction resume value creation and is also needed to properly update the dominator tree during skeleton creation. After this patch lands, I plan to move induction resume value and phi node creation in the scalar preheader to VPlan. Once that is done, we should be able to create the middle.block in VPlan directly. This is a re-worked version based on the earlier https://reviews.llvm.org/D150398 and the main change is the use of VPIRBasicBlock. Depends on https://github.com/llvm/llvm-project/pull/92525 PR: https://github.com/llvm/llvm-project/pull/92651	2024-07-05 10:08:42 +01:00
Florian Hahn	2b3b405b09	[LV] Don't vectorize first-order recurrence with VF <vscale x 1 x ..> The assertion added as part of https://github.com/llvm/llvm-project/pull/93395 surfaced cases where first-order recurrences are vectorized with <vscale x 1 x ..>. If vscale is 1, then we are unable to extract the penultimate value (second to last lane). Previously this case got mis-compiled, trying to extract from an invalid lane (-1) https://llvm.godbolt.org/z/3adzYYcf9. Fixes https://github.com/llvm/llvm-project/issues/97452.	2024-07-04 11:44:51 +01:00
Kolya Panchenko	49e5cd2acc	[LV][NFC] Marked functions as const. Added LLVM_DEBUG. (#96681 )	2024-06-26 17:38:18 -04:00
Florian Hahn	8681bb8bed	[LV] Add additional test coverage for cost modeling. Add missing tests uncovered by https://github.com/llvm/llvm-project/pull/92555. Includes test for https://github.com/llvm/llvm-project/issues/96294 and https://github.com/llvm/llvm-project/issues/96328	2024-06-26 10:18:01 +01:00
Florian Hahn	abf5969f76	[VPlan] Don't compute costs if there are no vector VPlans. In some cases, no vector VPlans can be constructed due to failing VPlan legality checks (e.g. unable to perform sinking for first order recurrences or plans being incompatible with EVL). There's no need to compute costs in those cases, so check directly if there are no vector plans.	2024-06-24 08:38:31 +01:00
Florian Hahn	f0c674f680	[LV] Add test showing cost is computed when there are no vector plans. Add test showing unnecessary cost computations, as no vector VPlans are generated.	2024-06-24 08:08:56 +01:00
Florian Hahn	f1f3c34b47	Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )"" This reverts commit 242cc200ccb24e22eaf54aed7b0b0c84cfc54c0b and eea150c84053035163f307b46549a2997a343ce9, as it is causing a build bot failure and there have been a number of crashes reported at https://github.com/llvm/llvm-project/pull/92555	2024-06-21 19:54:21 +01:00
Florian Hahn	242cc200cc	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92. Extra tests for crashes discovered when building Chromium have been added in fb86cb7ec157689e, 3be7312f81ad2. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-20 17:32:52 +01:00
Florian Hahn	3808ba78de	[VPlan] Model middle block via VPIRBasicBlock. (#95816 ) Use VPIRBasicBlock to wrap the middle block and implement patching up branches in predecessors in VPIRBasicBlock::execute. The IR middle block is only created after skeleton creation. Initially a regular VPBasicBlock is created, which will later be replaced by a VPIRBasicBlock once the middle IR basic block has been created. Note that this slightly changes the order of instructions created in the middle block; code generated by recipe execution in the middle block will now be inserted before the terminator (and in between the compare used by the terminator). The original order will be restored in https://github.com/llvm/llvm-project/pull/92651. PR: https://github.com/llvm/llvm-project/pull/95816	2024-06-20 13:42:20 +01:00
Arthur Eubanks	6f538f6a2d	Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )"" This reverts commit 90fd99c0795711e1cf762a02b29b0a702f86a264. This reverts commit 43e6f46936e177e47de6627a74b047ba27561b44. Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.	2024-06-14 17:47:08 +00:00
Florian Hahn	90fd99c079	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 46080abe9b136821eda2a1a27d8a13ceac349f8c. Extra tests have been added in 52d29eb287. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-14 12:33:48 +01:00
Jay Foad	d4a0154902	[llvm-project] Fix typo "seperate" (#95373 )	2024-06-13 20:20:27 +01:00
Arthur Eubanks	46080abe9b	Revert "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 00798354c553d48d27006a2b06a904bd6013e31b. Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.	2024-06-13 16:37:21 +00:00
Florian Hahn	00798354c5	[VPlan] First step towards VPlan cost modeling. (#92555 ) This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-13 14:26:18 +01:00
Florian Hahn	c46a6e6c92	[LV] Remove unnecessary getRuntimeVF call when computing vector TC. As Step is VF * UF, there is no need to compute it again, which may require multiple instructions for scalable VFs.	2024-06-12 14:35:37 +01:00
Florian Hahn	f38d84ce32	[VPlan] Use ir-bb prefix for VPIRBasicBlock. Follow-up to adjust the names and tests after https://github.com/llvm/llvm-project/pull/93398.	2024-05-30 17:43:40 -07:00
Ramkumar Ramachandra	43100766f2	LV: generalize profitability criterion over TC (#93300 ) Generalize LoopVectorizationPlanner::isMoreProfitable smoothly across the fixed-vector and scalable-vector cases, taking the trip-count into account, and fixing logical pitfalls that arise from a lack of generality.	2024-05-30 10:54:32 +01:00
Shih-Po Hung	0338c55ea5	[LV, VPlan] Check if plan is compatible to EVL transform (#92092 ) The transform updates all users of inductions to work based on EVL, instead of the VF directly. At the moment, widened inductions cannot be updated, so bail out if the plan contains any. This patch introduces a check before applying EVL transform. If any recipes in loop rely on RuntimeVF, the plan is discarded.	2024-05-25 08:22:49 +08:00
Ramkumar Ramachandra	bb0d29a72d	[LV] fix logical error in trunc cost (#91136 ) In LoopVectorizationCostModel::getInstructionCost(), when the condition canTruncateToMinimalBitwidth() is satisfied, for a trunc, the source type is computed as the smallest type of the source vector and the destination vector, and the destination type is computed as the largest type of the instruction and destination type. This is clearly a logical error, as the original source vector type could be smaller than the original destination vector type, and the trunc semantics are broken because we're attempting to widen. Fixes #47665.	2024-05-24 18:01:58 +01:00
Shih-Po Hung	b008a2d12a	[LV][NFC] precommit test for EVL transform (#92203 ) A precommit test case to show vector loops generated from EVL transform - This is a precommit test for https://github.com/llvm/llvm-project/pull/92092	2024-05-24 23:21:59 +08:00
Ramkumar Ramachandra	dc148c9fb8	[LV] add test for #47665 , #88802 (#91135 )	2024-05-24 10:50:43 +01:00
Florian Hahn	b050048d35	[VPlan] Simplify (X && Y) || (X && !Y) -> X. (#89386 ) Simplify a common pattern generated for masks when folding the tail. PR: https://github.com/llvm/llvm-project/pull/89386	2024-05-19 15:45:23 +00:00
Craig Topper	487b43cdc9	[RISCV] Pass subvector type to isLegalInterleavedAccessType in getInterleavedMemoryOpCost. (#91825 ) isLegalInterleavedAccessType expects the subvector type, but getInterleavedMemoryOpCost is called with the full vector type. So we need to divide by Factor.	2024-05-15 21:47:29 -07:00
Florian Hahn	d187005cad	[VPlan] Update VPBlendRecipe codegen for first-lane only. Update VPBlendRecipe::execute to support generating code for first-lane only. This fixes a crash in the newly added test @test_not_first_lane_only_wide_compare_incoming_order_swapped.	2024-05-15 11:00:15 +01:00
Mel Chen	3f1fef3699	[RISCV] Support interleaved accesses for scalable vector. (#90583 ) The support for interleaved accesses for scalable vector with a factor of 2 is enabled in vectorizer. Therefore, the patch removed the restriction for scalable vector with a factor of 2.	2024-05-03 21:56:31 +08:00
Florian Hahn	bccb7ed8ac	Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )" This reverts the revert commit c6e01627acf859. This patch includes a fix for any-of reductions and epilogue vectorization. Extra test coverage for the issue that caused the revert has been added in bce3bfced5fe0b019 and an assertion has been added in c7209cbb8be7a3c65813. -------------------------------- Original commit message: Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-05-03 14:40:49 +01:00
Alexey Bataev	1d43cdc9f5	[LV][EVL]Support reversed loads/stores. Support for predicated vector reverse intrinsic was added some time ago. Adds support for predicated reversed loads/stores in the loop vectorizer. Reviewers: fhahn Reviewed By: fhahn Pull Request: https://github.com/llvm/llvm-project/pull/88025	2024-05-03 07:28:56 -04:00
Maciej Gabka	bfc0317153	Move several vector intrinsics out of experimental namespace (#88748 ) This patch moves the following intrinsics out of the experimental namespace: * vector.interleave2/deinterleave2 * vector.reverse * vector.splice. All these intrinsics have existed in LLVM for more than a year now and are widely used, so they should not be considered experimental.	2024-04-29 10:16:45 +01:00
Florian Hahn	e2a72fa583	[VPlan] Introduce recipes for VP loads and stores. (#87816 ) Introduce new subclasses of VPWidenMemoryRecipe for VP (vector-predicated) loads and stores to address multiple TODOs from https://github.com/llvm/llvm-project/pull/76172 Note that the introduction of the new recipes also improves code-gen for VP gather/scatters by removing the redundant header mask. With the new approach, it is not sufficient to look at users of the widened canonical IV to find all uses of the header mask. In some cases, a widened IV is used instead of separately widening the canonical IV. To handle that, first collect all VPValues representing header masks (by looking at users of both the canonical IV and widened inductions that are canonical) and then checking all users (recursively) of those header masks. Depends on https://github.com/llvm/llvm-project/pull/87411. PR: https://github.com/llvm/llvm-project/pull/87816	2024-04-19 09:44:23 +01:00
Arthur Eubanks	c6e01627ac	Revert "Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )"" This reverts commit c6e38b928c56f562aea68a8e90f02dbdf0eada85. Causes miscompiles, see comments on #78304.	2024-04-16 20:40:21 +00:00
Shih-Po Hung	f3a8112d98	[RISCV][TTI] Scale the cost of ICmp with LMUL (#88235 ) Use the Val type to estimate the instruction cost for ICmp.	2024-04-16 09:37:32 +08:00
Florian Hahn	c836983671	[VPlan] Remove unused first mask op from VPBlendRecipe. (#87770 ) VPBlendRecipe does not use the first mask operand. Removing it allows VPlan-based DCE to remove unused mask computations. This also fixes #87410, where unused Not VPInstructions are considered having only their first lane demanded, but some of their operands providing a vector value due to other users. Fixes https://github.com/llvm/llvm-project/issues/87410 PR: https://github.com/llvm/llvm-project/pull/87770	2024-04-09 11:14:05 +01:00
Florian Hahn	c6e38b928c	Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )" This reverts the revert commit 589c7abb03448. This patch includes a fix for any-of reductions and epilogue vectorization. Extra test coverage for the issue that caused the revert has been added in 399ff08e29d. -------------------------------- Original commit message: Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-04-05 13:45:13 +01:00
Alexey Bataev	413a66f339	[LV, VP] VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172 ) This patch introduces generating VP intrinsics in the Loop Vectorizer. Currently the Loop Vectorizer supports vector predication in a very limited capacity via tail-folding and masked load/store/gather/scatter intrinsics. However, this does not let architectures with active vector length predication support take advantage of their capabilities. Architectures with general masked predication support also can only take advantage of predication on memory operations. By having a way for the Loop Vectorizer to generate Vector Predication intrinsics, which (will) provide a target-independent way to model predicated vector instructions, these architectures can make better use of their predication capabilities. Our first approach (implemented in this patch) builds on top of the existing tail-folding mechanism in the LV (just adds a new tail-folding mode using EVL), but instead of generating masked intrinsics for memory operations it generates VP intrinsics for load/store instructions. The patch adds a new VPlanTransforms to replace the wide header predicate compare with EVL and updates codegen for load/stores to use VP store/load with EVL. Another important part of this approach is how the Explicit Vector Length is computed. (VP intrinsics define this vector length parameter as Explicit Vector Length (EVL)). We use an experimental intrinsic `get_vector_length`, that can be lowered to architecture specific instruction(s) to compute EVL. Also, added a new recipe to emit instructions for computing EVL. Using VPlan in this way will eventually help build and compare VPlans corresponding to different strategies and alternatives. Differential Revision: https://reviews.llvm.org/D99750	2024-04-04 18:30:17 -04:00
Florian Hahn	89271b4676	[LV] Add test depending on target to RISCV subdirectory.	2024-04-02 22:02:25 +01:00
Kirill Stoimenov	589c7abb03	Revert "[LV] Improve AnyOf reduction codegen. (#78304 )" Broke sanitizer bots: https://lab.llvm.org/buildbot/#/builders/74/builds/26697 This reverts commit 95fef1dfefd5467206e74c089d29806fcd82889b.	2024-03-14 14:57:01 +00:00
Florian Hahn	95fef1dfef	[LV] Improve AnyOf reduction codegen. (#78304 ) Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-03-14 11:22:06 +00:00
Shih-Po Hung	6ee9c8afbc	[RISCV][CostModel] Updates reduction and shuffle cost (#77342 ) - Make `andi` cost 1 in SK_Broadcast - Query the cost of VID_V, VRSUB_VX/VRSUB_VI which would scale with LMUL	2024-02-29 15:41:19 +08:00
Florian Hahn	15d9d0fa8f	[VPlan] Also print final VPlan directly before codegen/execute. (#82269 ) Some optimizations are applied after UF and VF have been chosen. This patch adds an extra print of the final VPlan just before codegen/execution. In the future, there will be additional transforms that are applied later (interleaving for example). PR: https://github.com/llvm/llvm-project/pull/82269	2024-02-28 13:19:43 +00:00
Philip Reames	f67ef1a8d9	[RISCV][LV] Add additional small trip count loop coverage	2024-02-22 08:30:25 -08:00
Philip Reames	9eb5f94f9b	[RISCV][AArch64] Add vscale_range attribute to tests per architecture minimums Spent a bunch of time tracing down an odd issue "in SCEV" which turned out to be the fact that SCEV doesn't have access to TTI. As a result, the only way for it to get range facts on vscales (to avoid collapsing ranges of element counts and type sizes to trivial ranges on multiplies) is to look at the vscale_range attribute. Since vscale_range is set by clang by default, manually setting it in the tests shouldn't interfere with the test intent.	2024-02-22 08:11:24 -08:00
Philip Reames	1aafe7605b	[test] Regen a test for naming changes	2024-02-06 18:06:24 -08:00
Florian Hahn	51afb10174	[LV] Create block in mask up-front if needed. (#76635 ) At the moment, block and edge masks are created on demand, which means that they are inserted at the point where they are demanded and then cached. It is possible that the mask for a block is looked up later at a point that's not dominated by the point where the mask has been inserted. To avoid this, create masks up front on entry to the corresponding basic block and leave it to VPlan simplification to remove unneeded masks. Note that we need to create masks for all blocks, if any of the blocks in the loop needs predication, as computing the mask of a block depends on the masks of its predecessor. Needed for #76090. https://github.com/llvm/llvm-project/pull/76635	2024-01-09 10:50:08 +00:00
Florian Hahn	f18536d642	[VPlan] Model address separately. (#72164 ) Move vector pointer generation to a separate VPVectorPointerRecipe. This further untangles address computation from the memory recipes and is also needed to enable explicit unrolling in VPlan. https://github.com/llvm/llvm-project/pull/72164	2024-01-01 19:51:15 +00:00
Florian Hahn	a5891fa4d2	[VPlan] Initial modeling of VF * UF as VPValue. (#74761 ) This patch starts initial modeling of VF * UF in VPlan. Initially, introduce a dedicated VFxUF VPValue, which is then populated during VPlan::prepareToExecute. Initially, the VF * UF applies only to the main vector loop region. Once we extend the scope of VPlan in the future, we may want to associate different VFxUFs with different vector loop regions (e.g. the epilogue vector loop) This allows explicitly parameterizing recipes that rely on the VF * UF, like the canonical induction increment. At the moment, this mainly helps to avoid generating some duplicated calls to vscale with scalable vectors. It should also allow using EVL as induction increments explicitly in D99750. Referring to VF * UF is also needed in other places that we plan to migrate to VPlan, like the minimum trip count check during skeleton creation. The first version creates the value for VF * UF directly in prepareToExecute to limit the scope of the patch. A follow-on patch will model VF * UF computation explicitly in VPlan using recipes. Moved from Phabricator (https://reviews.llvm.org/D157322)	2023-12-08 18:30:30 +00:00
Florian Hahn	5ea6a3fc6d	[VPlan] Compute scalable VF in preheader for induction increment. (#74762 ) UF * VF is loop invariant and can be computed directly in the preheader. This prepares the code for #74761 and reduces the test changes.	2023-12-08 12:18:31 +00:00
Alex Richardson	e39f6c1844	[opt] Infer DataLayout from triple if not specified There are many tests that specify a target triple/CPU flags but no DataLayout which can lead to IR being generated that has unusual behaviour. This commit attempts to use the default DataLayout based on the relevant flags if there is no explicit override on the command line or in the IR file. One thing that is currently not possible is to differentiate a missing datalayout from an explicit `target datalayout = ""` in the IR file, since the current APIs don't allow detecting this case. If it is considered useful to support this case (instead of passing "-data-layout=" on the command line), I can change IR parsers to track whether they have seen such a directive and change the callback type. Differential Revision: https://reviews.llvm.org/D141060	2023-10-26 12:07:37 -07:00
Florian Hahn	38f8b7cbe4	[LV] Replace value numbers with patterns in tests (NFC). Replace some hardcoded value numbers in CHECK-LINES to use patterns, to make the tests more robust wrt renumbering.	2023-10-16 19:53:44 +01:00


192 Commits