llvm-project

Author	SHA1	Message	Date
Florian Hahn	d187005cad	[VPlan] Update VPBlendRecipe codegen for for first-lane only. Update VPBlendRecipe::execute to support generating code for first-lane only. This fixes a crash in the newly added test @test_not_first_lane_only_wide_compare_incoming_order_swapped.	2024-05-15 11:00:15 +01:00
Mel Chen	3f1fef3699	[RISCV] Support interleaved accesses for scalable vector. (#90583 ) The support for interleaved accesses for scalable vector with a factor of 2 is enabled in vectorizer. Therefore, the patch removed the restriction for scalable vector with a factor of 2.	2024-05-03 21:56:31 +08:00
Florian Hahn	bccb7ed8ac	Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )" This reverts the revert commit c6e01627acf859. This patch includes a fix for any-of reductions and epilogue vectorization. Extra test coverage for the issue that caused the revert has been added in bce3bfced5fe0b019 and an assertion has been added in c7209cbb8be7a3c65813. -------------------------------- Original commit message: Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-05-03 14:40:49 +01:00
Alexey Bataev	1d43cdc9f5	[LV][EVL]Support reversed loads/stores. Support for predicated vector reverse intrinsic was added some time ago. Adds support for predicated reversed loads/stores in the loop vectorizer. Reviewers: fhahn Reviewed By: fhahn Pull Request: https://github.com/llvm/llvm-project/pull/88025	2024-05-03 07:28:56 -04:00
Maciej Gabka	bfc0317153	Move several vector intrinsics out of experimental namespace (#88748 ) This patch is moving out following intrinsics: * vector.interleave2/deinterleave2 * vector.reverse * vector.splice from the experimental namespace. All these intrinsics exist in LLVM for more than a year now, and are widely used, so should not be considered as experimental.	2024-04-29 10:16:45 +01:00
Florian Hahn	e2a72fa583	[VPlan] Introduce recipes for VP loads and stores. (#87816 ) Introduce new subclasses of VPWidenMemoryRecipe for VP (vector-predicated) loads and stores to address multiple TODOs from https://github.com/llvm/llvm-project/pull/76172 Note that the introduction of the new recipes also improves code-gen for VP gather/scatters by removing the redundant header mask. With the new approach, it is not sufficient to look at users of the widened canonical IV to find all uses of the header mask. In some cases, a widened IV is used instead of separately widening the canonical IV. To handle that, first collect all VPValues representing header masks (by looking at users of both the canonical IV and widened inductions that are canonical) and then checking all users (recursively) of those header masks. Depends on https://github.com/llvm/llvm-project/pull/87411. PR: https://github.com/llvm/llvm-project/pull/87816	2024-04-19 09:44:23 +01:00
Arthur Eubanks	c6e01627ac	Revert "Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )"" This reverts commit c6e38b928c56f562aea68a8e90f02dbdf0eada85. Causes miscompiles, see comments on #78304.	2024-04-16 20:40:21 +00:00
Shih-Po Hung	f3a8112d98	[RISCV][TTI] Scale the cost of ICmp with LMUL (#88235 ) Use the Val type to estimate the instruction cost for ICmp.	2024-04-16 09:37:32 +08:00
Florian Hahn	c836983671	[VPlan] Remove unused first mask op from VPBlendRecipe. (#87770 ) VPBlendRecipe does not use the first mask operand. Removing it allows VPlan-based DCE to remove unused mask computations. This also fixes #87410, where unused Not VPInstructions are considered having only their first lane demanded, but some of their operands providing a vector value due to other users. Fixes https://github.com/llvm/llvm-project/issues/87410 PR: https://github.com/llvm/llvm-project/pull/87770	2024-04-09 11:14:05 +01:00
Florian Hahn	c6e38b928c	Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )" This reverts the revert commit 589c7abb03448. This patch includes a fix for any-of reductions and epilogue vectorization. Extra test coverage for the issue that caused the revert has been added in 399ff08e29d. -------------------------------- Original commit message: Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-04-05 13:45:13 +01:00
Alexey Bataev	413a66f339	[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172 ) This patch introduces generating VP intrinsics in the Loop Vectorizer. Currently the Loop Vectorizer supports vector predication in a very limited capacity via tail-folding and masked load/store/gather/scatter intrinsics. However, this does not let architectures with active vector length predication support take advantage of their capabilities. Architectures with general masked predication support also can only take advantage of predication on memory operations. By having a way for the Loop Vectorizer to generate Vector Predication intrinsics, which (will) provide a target-independent way to model predicated vector instructions. These architectures can make better use of their predication capabilities. Our first approach (implemented in this patch) builds on top of the existing tail-folding mechanism in the LV (just adds a new tail-folding mode using EVL), but instead of generating masked intrinsics for memory operations it generates VP intrinsics for loads/stores instructions. The patch adds a new VPlanTransforms to replace the wide header predicate compare with EVL and updates codegen for load/stores to use VP store/load with EVL. Other important part of this approach is how the Explicit Vector Length is computed. (VP intrinsics define this vector length parameter as Explicit Vector Length (EVL)). We use an experimental intrinsic `get_vector_length`, that can be lowered to architecture specific instruction(s) to compute EVL. Also, added a new recipe to emit instructions for computing EVL. Using VPlan in this way will eventually help build and compare VPlans corresponding to different strategies and alternatives. Differential Revision: https://reviews.llvm.org/D99750	2024-04-04 18:30:17 -04:00
Florian Hahn	89271b4676	[LV] Add test depending on target to RISCV subdirectory.	2024-04-02 22:02:25 +01:00
Kirill Stoimenov	589c7abb03	Revert "[LV] Improve AnyOf reduction codegen. (#78304 )" Broke sanitizer bots: https://lab.llvm.org/buildbot/#/builders/74/builds/26697 This reverts commit 95fef1dfefd5467206e74c089d29806fcd82889b.	2024-03-14 14:57:01 +00:00
Florian Hahn	95fef1dfef	[LV] Improve AnyOf reduction codegen. (#78304 ) Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-03-14 11:22:06 +00:00
Shih-Po Hung	6ee9c8afbc	[RISCV][CostModel] Updates reduction and shuffle cost (#77342 ) - Make `andi` cost 1 in SK_Broadcast - Query the cost of VID_V, VRSUB_VX/VRSUB_VI which would scale with LMUL	2024-02-29 15:41:19 +08:00
Florian Hahn	15d9d0fa8f	[VPlan] Also print final VPlan directly before codegen/execute. (#82269 ) Some optimizations are apply after UF and VF have been chosen. This patch adds an extra print of the final VPlan just before codegen/execution. In the future, there will be additional transforms that are applied later (interleaving for example). PR: https://github.com/llvm/llvm-project/pull/82269	2024-02-28 13:19:43 +00:00
Philip Reames	f67ef1a8d9	[RISCV][LV] Add additional small trip count loop coverage	2024-02-22 08:30:25 -08:00
Philip Reames	9eb5f94f9b	[RISCV][AArch64] Add vscale_range attribute to tests per architecture minimums Spent a bunch of time tracing down an odd issue "in SCEV" which turned out to be the fact that SCEV doesn't have access to TTI. As a result, the only way for it to get range facts on vscales (to avoid collapsing ranges of element counts and type sizes to trivial ranges on multiplies) is to look at the vscale_range attribute. Since vscale_range is set by clang by default, manually setting it in the tests shouldn't interfere with the test intent.	2024-02-22 08:11:24 -08:00
Philip Reames	1aafe7605b	[test] Regen a test for naming changes	2024-02-06 18:06:24 -08:00
Florian Hahn	51afb10174	[LV] Create block in mask up-front if needed. (#76635 ) At the moment, block and edge masks are created on demand, which means that they are inserted at the point where they are demanded and then cached. It is possible that the mask for a block is looked up later at a point that's not dominated by the point where the mask has been inserted. To avoid this, create masks up front on entry to the corresponding basic block and leave it to VPlan simplification to remove unneeded masks. Note that we need to create masks for all blocks, if any of the blocks in the loop needs predication, as computing the mask of a block depends on the masks of its predecessor. Needed for #76090. https://github.com/llvm/llvm-project/pull/76635	2024-01-09 10:50:08 +00:00
Florian Hahn	f18536d642	[VPlan] Model address separately. (#72164 ) Move vector pointer generation to a separate VPVectorPointerRecipe. This untangles address computation from the memory recipes future and is also needed to enable explicit unrolling in VPlan. https://github.com/llvm/llvm-project/pull/72164	2024-01-01 19:51:15 +00:00
Florian Hahn	a5891fa4d2	[VPlan] Initial modeling of VF * UF as VPValue. (#74761 ) This patch starts initial modeling of VF * UF in VPlan. Initially, introduce a dedicated VFxUF VPValue, which is then populated during VPlan::prepareToExecute. Initially, the VF * UF applies only to the main vector loop region. Once we extend the scope of VPlan in the future, we may want to associate different VFxUFs with different vector loop regions (e.g. the epilogue vector loop) This allows explicitly parameterizing recipes that rely on the VF * UF, like the canonical induction increment. At the moment, this mainly helps to avoid generating some duplicated calls to vscale with scalable vectors. It should also allow using EVL as induction increments explicitly in D99750. Referring to VF * UF is also needed in other places that we plan to migrate to VPlan, like the minimum trip count check during skeleton creation. The first version creates the value for VF * UF directly in prepareToExecute to limit the scope of the patch. A follow-on patch will model VF * UF computation explicitly in VPlan using recipes. Moved from Phabricator (https://reviews.llvm.org/D157322)	2023-12-08 18:30:30 +00:00
Florian Hahn	5ea6a3fc6d	[VPlan] Compute scalable VF in preheader for induction increment. (#74762 ) UF * VF is loop invariant and can be computed directly in the preheader. This prepares the code for #74761 and reduces the test changes.	2023-12-08 12:18:31 +00:00
Alex Richardson	e39f6c1844	[opt] Infer DataLayout from triple if not specified There are many tests that specify a target triple/CPU flags but no DataLayout which can lead to IR being generated that has unusual behaviour. This commit attempts to use the default DataLayout based on the relevant flags if there is no explicit override on the command line or in the IR file. One thing that is not currently possible to differentiate from a missing datalayout `target datalayout = ""` in the IR file since the current APIs don't allow detecting this case. If it is considered useful to support this case (instead of passing "-data-layout=" on the command line), I can change IR parsers to track whether they have seen such a directive and change the callback type. Differential Revision: https://reviews.llvm.org/D141060	2023-10-26 12:07:37 -07:00
Florian Hahn	38f8b7cbe4	[LV] Replace value numbers with patterns in tests (NFC). Replace some hardcoded value numbers in CHECK-LINES to use patterns, to make the tests more robust wrt renumbering.	2023-10-16 19:53:44 +01:00
Alexey Bataev	e22818d5c9	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-05 06:17:07 -07:00
Rin	d3e4702c0f	[AArch64] [LoopVectorize] Use either fixed-width or scalable VF when tail-folding (#67543 ) Since the getMaximisedVFForTarget function is called twice, once for fixed-width and once for scalable, it adds no value to always return a fixed-width VF. Instead, when we are tail-folding, we can use either fixed-width or scalable vectors.	2023-10-05 10:24:30 +01:00
Arthur Eubanks	07389535a7	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit b186f1f68be11630355afb0c08b80374a6d31782. Causes crashes, see https://reviews.llvm.org/D158449.	2023-10-04 14:37:16 -07:00
Alex Richardson	e86d6a43f0	Regenerate test checks for tests affected by D141060	2023-10-04 10:51:35 -07:00
Alexey Bataev	b186f1f68b	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-04 07:53:30 -07:00
Alexey Bataev	1129dec778	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix a crash reported in https://reviews.llvm.org/D158449.	2023-10-03 13:02:16 -07:00
Alexey Bataev	6f43d28f34	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-03 10:26:11 -07:00
Alexey Bataev	ebcb5d59fc	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.	2023-09-29 15:03:46 -07:00
Alexey Bataev	9f5960e004	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-29 13:16:03 -07:00
Alexey Bataev	3204f88a8b	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.	2023-09-28 11:57:32 -07:00
Alexey Bataev	c88c281cf1	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-28 11:03:21 -07:00
Sergey Kachkov	0a5d52a757	[RISCV][CostModel] Add getCFInstrCost RISC-V implementation (#65599 ) This patch implements getCFInstrCost TTI hook that mostly affects LoopVectorizer decisions. It sets zero cost for PHI nodes and zero throughput cost for branches (assuming that branches are likely to be predicted). The implementation is similar to X86/AArch64/PowerPC targets and reduces loop cost by excluding induction PHIs/loop latch branches, which in turn leads to selecting smaller vectorization factor.	2023-09-25 12:26:01 +03:00
Florian Hahn	f108c6cdc1	[VPlan] Fold (MUL A, 1) -> A as VPlan2VPlan transform. Add first VPlan-based recipe simplification to fold (MUL A, 1) -> A. Among other things, this enables additional simplifications after applying versioned strides, as follow up to D147783. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D159200	2023-09-18 21:45:34 +01:00
Florian Hahn	96e83d3705	[LV] Use IRBuilder to create and optimize middle-block compare. Split off from D150398 to avoid builder-related diff changes there. Using IRBuilder to create ICmps simplifies the result if both operands are constants. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158332	2023-08-29 11:42:18 +01:00
Florian Hahn	af635a5547	[VPlan] Model wrap flags directly, remove NUW opcodes (NFC) Model wrap flags directly using VPRecipeWithIRFlags and clean up the duplicated NUW opcodes. D157144 will build on this and also model FMFs for VPInstruction. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D157194	2023-08-08 12:12:30 +01:00
Florian Hahn	93c5bae00e	[VPlan] Use printOperands for VPInstruction. Use the printOperands for printing VPInstruction's operands to be more in line with other recipes and ensure consistent printing after D15719. Also removes some stray spaces in print output.	2023-08-08 11:31:21 +01:00
Florian Hahn	707359ecf5	Recommit "[LV] Re-use existing broadcast value for live-ins." This reverts commit 245ec675a4e41f7ec24dfc998720bffdc46a6c53. Recommits eea9258648ce with a fix to only erase the instruction from the first part if it is defined outside the loop. This fixes a use-after-free error reported.	2023-08-01 15:54:02 +01:00
Martin Storsjö	245ec675a4	Revert "[LV] Re-use existing broadcast value for live-ins." This reverts commit eea9258648ce73507f6f85c395de978af659d498. That commit triggered crashes in the following testcase: $ cat reduced.c typedef struct { int a[8] } b; typedef struct { b c; short d } e; void f() { int g; char h; e i = f; short j = i->d; int a = i->c->a[0]; for (;;) for (; g < a; g++) { h = j * i->d >> 8; h++; } } $ clang -target aarch64-linux-gnu -w -c -O2 reduced.c	2023-07-25 10:35:41 +03:00
Florian Hahn	eea9258648	[LV] Re-use existing broadcast value for live-ins. When requesting a vector value for a live-in, we can re-use the broadcast of the live-in of part 0 for parts > 0.	2023-07-24 11:50:47 +01:00
Philip Reames	7cc6b80d9a	[RISCV][CostModel] Model vrgather.vv as being quadradic in LMUL vrgather.vv across multiple vector registers (i.e. LMUL > 1) requires all to all data movement. This includes two conceptual sets of changes: For permutes, we were modeling these as being linear in LMUL. For reverse, we were modeling them as being fixed cost in LMUL. Both were wrong, and have been adjusted to O(LMUL^2). Noticed via code inspection while looking at something else. Its worth asking whether we should be lowering reverse to something other than a vrgather at high LMULs. That shuffle is quite expensive. (Future work) Differential Revision: https://reviews.llvm.org/D152019	2023-07-18 11:52:34 -07:00
Luke Lau	b9af086292	[RISCV] Update loop vectorizer interleaved access test output 02bb33c3ce7a83d47244ae16c8b4c625aba187a2 changed it so it no longer unrolls the loop.	2023-07-07 15:38:04 +01:00
Florian Hahn	d209084720	[VPlan] Replace versioned stride with constant during VPlan opts. After constructing the initial VPlan, replace VPValues for versioned strides with their constant counterparts. Differential Revision: https://reviews.llvm.org/D147783	2023-06-13 08:26:55 +01:00
Nikita Popov	9cf67f6ea0	[LoopVectorize] Convert most tests to opaque pointers (NFC) The unsized-pointee-crash.ll and zero-sized-pointee-crash.ll tests have been removed, because these issues are not relevant for opaque pointers.	2023-06-12 13:10:22 +02:00
Florian Hahn	299f0ff60e	[VPlan] Print IR flags for VPRecipeWithIRFlags. Now that IR flags are modeled as part of VPRecipeWithIRFlags, include the flags when printing recipes. Depends on D150027. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150029	2023-05-23 20:36:16 +01:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00

1 2 3 4

167 Commits