llvm-project

Author	SHA1	Message	Date
Nikita Popov	d9dcd52d16	[LoopVectorize] Convert test to opaque pointers (NFC)	2023-04-25 15:28:01 +02:00
David Green	1869a9c225	[LV] Use the known trip count when costing non-tail folded VFs Now that we store the ScalarCost in the VectorizationFactor it is possible to use it to get a slightly more accurate cost in isMoreProfitable between two vector factors. This extends the logic added in D101726 to non-tail-folded cases, using the costs of `VecCost * (TripCount / VF) + ScalarCost * (TripCount % VF)` to compare VFs where the TripCount is known and we are not folding the tail. This shouldn't alter very much as small trip counts are usually not vectorized, but does seem to help in the testcase where 4 * VF4 is chosen as profitable compared to 2 * VF8 + 4 * scalar. Differential Revision: https://reviews.llvm.org/D147720	2023-04-24 22:02:30 +01:00
Jay Foad	593e25ffae	[Vectorize] Fix vectorization, scalarization and folding of llvm.is.fpclass llvm.is.fpclass is different from other vectorizable intrinsics in that it is overloaded on an argument type, not on the return type. Differential Revision: https://reviews.llvm.org/D148905	2023-04-24 13:42:08 +01:00
Jay Foad	3237497d01	[Vectorize] Pre-commit tests for D148905 Differential Revision: https://reviews.llvm.org/D149050	2023-04-24 13:42:08 +01:00
Nikita Popov	d003c01c30	[LV][IndVars] Move test to correct directory and regenerate (NFC) For some reason, an IndVarSimplify test was in the LoopVectorize directory.	2023-04-21 18:03:41 +02:00
Florian Hahn	6b8d19d2b5	Recommit "[VPlan] Switch to checking sinking legality for recurrences in VPlan." This reverts the revert commit 3d8ed8b5192a59104bfbd5bf7ac84d035ee0a4a5. The new version of the patch adds a set to avoid duplicating work in isFixedOrderRecurrence, which was previously done through the removed SinkAfter map. Original commit message: Building on D142885 and D142589, retire the SinkAfter map from the recurrence handling code. It is replaced by checking whether it is possible to sink all users of a recurrence directly in VPlan. This results in simpler code overall and allows handling additional cases (see the improvements in @test_crash). Depends on D142885. Depends on D142589. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142886	2023-04-20 09:31:16 +01:00
Nikita Popov	2ec1d0f427	[InstCombine] Don't reassociate GEPs for loop invariance Since D146813, LICM will reassociate GEPs to expose hoisting opportunities itself. Don't perform this transform in InstCombine, where it is fragile because it depends on an optional LoopInfo analysis.	2023-04-18 12:17:07 +02:00
Nikita Popov	39ca70392b	[LV] Regenerate test checks (NFC)	2023-04-18 11:52:36 +02:00
Graham Hunter	d8c49d2ac9	[LV][AArch64] Autogenerate checks for scalable-strict-fadd.ll (NFC) Precommit for D145163.	2023-04-18 10:25:05 +01:00
Manoj Gupta	3d8ed8b519	Revert "[VPlan] Switch to checking sinking legality for recurrences in VPlan." This reverts commit 7fc0b3049df532fce726d1ff6869a9f6e3183780. Causes a clang hang when building xz utils, github issue #62187.	2023-04-17 12:19:36 -07:00
Florian Hahn	9679b63500	[LV] Precommit test for D147963. Reduced test case for #58811.	2023-04-17 16:19:13 +01:00
Florian Hahn	f555fd5d83	[LV] Regenerate check lines for pr33706.ll This avoids conflicts when regenerating check lines.	2023-04-17 13:49:49 +01:00
Florian Hahn	b0da998494	[LV] Extend recurrence test coverage for sinking memory instructions. Extra coverage for D143604, D143605.	2023-04-17 13:08:15 +01:00
Florian Hahn	02369b75fd	[VPlan] Mark recurrence recipes as not having side-effects. Add support for FirstOrderRecurrenceSplice and VPFirstOrderRecurrencePHI recipes to mayHaveSideEffects. They both don't have side-effects.	2023-04-17 12:30:52 +01:00
Florian Hahn	a0d667b89b	[LV] Add users to recurrence tests to make sure they are not removable. This ensures VPlan-based DCE won't be able to remove the unused recurrences. It also adds a dedicated new test (@unused_recurrence) where an unused recurrence can be removed.	2023-04-17 11:56:56 +01:00
David Sherwood	69ee653313	[LoopVectorize] Take vscale into account when deciding to create epilogues In LoopVectorizationCostModel::isEpilogueVectorizationProfitable we check to see if the chosen main vector loop VF >= 16. If so, we decide to create a vector epilogue loop. However, this doesn't take VScaleForTuning into account because we could be targeting a CPU where vscale > 1, and hence the runtime VF would be a multiple of the known minimum value. This patch multiplies scalable VFs by VScaleForTuning and several tests have been updated that now produce vector epilogues. Differential Revision: https://reviews.llvm.org/D147522	2023-04-17 10:49:40 +00:00
Florian Hahn	83ab5708d1	[LV] Don't sink scalar instructions that may read from memory. The current sinking code doesn't prevent us from sinking a load past an aliasing store. Skip sinking instructions that may read from memory to avoid a mis-compile. See @minimal_bit_widths_with_aliasing_store for an example where 2 loads are sunk past aliasing stores before this fix. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147259	2023-04-17 09:30:25 +01:00
Florian Hahn	7fc0b3049d	[VPlan] Switch to checking sinking legality for recurrences in VPlan. Building on D142885 and D142589, retire the SinkAfter map from the recurrence handling code. It is replaced by checking whether it is possible to sink all users of a recurrence directly in VPlan. This results in simpler code overall and allows handling additional cases (see the improvements in @test_crash). Depends on D142885. Depends on D142589. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142886	2023-04-13 22:00:52 +01:00
Florian Hahn	cdb48eefd0	[LV] Add exit users for recurrences in test, fix names. Add users of exit values for recurrences to make sure exit value generation will be checked in a follow-up change. Also adjusts/fixes naming in the test.	2023-04-13 20:23:56 +01:00
Simon Pilgrim	aa754f7e0f	[IR] llvm::createMinMaxOp - create integer min/max intrinsics instead of icmp/sel Based off D148215, when expanding a min/max reduction we should be creating min/max intrinsics directly instead of relying on instcombine to fold them back together. This patch handles integer min/max cases. Hopefully we can add floating point support soon (at least for fastmath/nnan cases) - but we're missing some of the plumbing to pass the correct FMF to the intrinsic at the moment. Differential Revision: https://reviews.llvm.org/D148221	2023-04-13 16:40:43 +01:00
Craig Topper	4b47d875a1	[LV] Optimize trip count SCEV. To calculate the trip count we need to add 1 to the backedge taken count. If we need to widen the backedge count, it's better to do the add before the widening if we can guarantee it won't overflow. The code here is based on similar code I found in LoopIdiomRecognize. This is the vectorizer version of this InstCombine patch D142783. Looking at the IR diffs, this does look like it gets more cases than the InstCombine patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D147355	2023-04-12 16:17:58 -07:00
Sjoerd Meijer	d827865e9f	Recommit "[AArch64][TTI] Cost model FADD/FSUB/FNEG" Fixed two test cases that relied on Asserts, and added a fallthrough annotation to the switch case.	2023-04-11 12:48:15 +01:00
Florian Hahn	f9d0b35d22	[LV] Re-use already computed runtime VF in fixFixedOrderRecurrence. This was suggested as independent cleanup in D147472. This removes a redundant runtime VF computation when using scalable vectors.	2023-04-10 21:25:12 +01:00
Florian Hahn	35af27c30a	[VPlan] Only create extracts for recurrence exits if there are live-outs. Move the code to collect live-out earlier and only generate extracts for exit values if there are any live-outs that use them. Depends on D147472. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147567	2023-04-10 21:08:34 +01:00
Florian Hahn	b0e118bd77	[LV] Update tests checking VPlans to use patterns for VPValues. This makes the tests more robust to changes in value numbering for VPValues.	2023-04-09 20:32:09 +01:00
Florian Hahn	620e011a25	[VPlan] Don't add live-outs if scalar epilogue is required. Instead of clearing live outs when a scalar epilogue is required late, don't add live outs during VPlan construction if a scalar epilogue is required. This enables more VPlan-based DCE (if the live out would be the only user in the plan) and is a step towards removing an access of the cost model in fixedVectorizedLoop (which is after VPlan execution). Depends on D147468. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147471	2023-04-09 09:18:24 +01:00
David Sherwood	9278dd7b2b	[LoopVectorize] Fix zext/sext cost calculations when types are shrunk In getInstructionCost, if we know a zext/sext is going to be shrunk we should only be changing the destination type, and leave the source type unchanged. For example, we may change a zext from `zext <16 x i8> %a to <16 x i32>` to `zext <16 x i8> %a to <16 x i16>`. However, we were previously calculating the cost for doing `zext <16 x i16> %a to <16 x i16>`, which is incorrect. Differential Revision: https://reviews.llvm.org/D147152	2023-04-06 08:52:25 +00:00
Nikita Popov	53f7f85703	[LoopVectorize] Convert some tests to opaque pointers (NFC)	2023-04-06 09:38:47 +02:00
Philip Reames	c416f6700f	[IVDescriptors] Add pointer InductionDescriptors with non-constant strides (try 2) (JFYI - This has been heavily reframed since original attempt at landing.) This change updates the InductionDescriptor logic to allow matching a pointer IV with a non-constant stride, but also updates the LoopVectorizer to bailout on such descriptors by default. This preserves the default vectorizer behavior. In review, it was pointed out that there's multiple unfortunate performance implications which need to be addressed before this can be enabled. Having a flag allows us to exercise the behavior, and write test cases for logic which is otherwise unreachable (or hard to reach). This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues. Differential Revision: https://reviews.llvm.org/D147336	2023-04-05 09:32:35 -07:00
Philip Reames	37646a2c28	[RISCV] Account for LMUL in memory op costs Generally, the cost of a memory op will scale with the number of vector registers accessed. Machines might exist which have a narrower memory access width than vector register width, but machines with a wider memory access width than vector register width seem unlikely. I noticed this because we were preferring wide loads + deinterleaves on examples where the cost of a short gather (actually a strided load) would be better. Touching 8 vector registers instead of doing a 4 element gather is not a good tradeoff. Differential Revision: https://reviews.llvm.org/D147470	2023-04-05 07:58:56 -07:00
Graham Hunter	185863f7de	[LV] Use available masked vector function variants when required LLVM has the ability to vectorize using function variants that require a mask by creating an all-true mask, and to vectorize a conditional call via scalarization. Now we want to join the two parts together and use a masked variant when a mask is required. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D136251	2023-04-05 11:18:38 +01:00
Florian Hahn	b5925a22c9	[LV] Add uses of recurrences in exit blocks in some tests. This preserves the spirit of the tests even if a follow-up change only generates exit values for recurrences if they are actually used.	2023-04-04 21:19:29 +01:00
Luke Lau	971a4501f7	[RISCV] Model vlseg/vsseg in interleaved memory ops If the legalized type is a legal interleaved access type (i.e. there's a supported vlseg/vsseg instruction for it), the interleaved access pass will pick any interleaved memory op (wide load + shuffles) and lower it into a vlseg/vsseg intrinsic. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D146522	2023-04-04 15:05:14 +01:00
David Sherwood	af2ed59f2b	[NFC][LoopVectorize] Add zext/sext cost tests when there is type shrinkage Differential Revision: https://reviews.llvm.org/D147151	2023-04-03 13:12:11 +00:00
Florian Hahn	0d61ffd350	[Loads] Support SCEVAddExpr as start for pointer AddRec. Extend handling to support `%base + offset` as start for AddRecs in isDereferenceableAndAlignedInLoop. This is done by adjusting AccessSize by the offset and effectively checking if the full object starting from %base to %base + offset + access-size is dereferenceable. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D147260	2023-04-02 12:33:44 +01:00
Florian Hahn	fdaebbeff4	[LV] Improve test added in 74dee4791a2. Adjust test so it triggers a case missed in the original version of D147260.	2023-03-31 21:50:39 +01:00
Florian Hahn	74dee4791a	[LV] Add test with predicated load where EltSize % Align != 0.	2023-03-31 21:15:24 +01:00
Philip Reames	78ae870f11	[tests] Rerun autogen to reduce a diff [nfc]	2023-03-31 12:47:08 -07:00
Philip Reames	a512ce5e12	[LV] Add tests for non-constant stride pointer inductions Reduced from the case which triggered the revert of 498aa534f472, and then generalized to cover both expansion paths.	2023-03-31 09:10:59 -07:00
David Green	965a090f02	Revert "[IVDescriptors] Add pointer InductionDescriptors with non-constant strides" Multiple errors have been reported on https://reviews.llvm.org/rG498aa534f472d28db893aa9a8627d0b46e17f312 Reverting until the correctness issues can be resolved. We are also seeing a lot of performance differences from the patch. Some are looking good, but some are looking pretty bad.	2023-03-31 11:08:50 +01:00
Florian Hahn	b060ca7042	[LV] Regenerate check lines for test to reduce diff in follow-up patch.	2023-03-30 20:17:12 +01:00
Philip Reames	498aa534f4	[IVDescriptors] Add pointer InductionDescriptors with non-constant strides This matches the handling for integer IVs. I left the non-opaque cases alone, mostly because they're largely irrelevant today. This doesn't actually make much difference in vectorization right now as we immediately fail on aliasing checks (which also bail on non-constant strides). Slightly surprisingly, it's the cases which do need runtime checks which work after this patch as they don't use the same dependency analysis path. This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues.	2023-03-30 11:56:00 -07:00
Philip Reames	1c5bb25d62	[RISCV][LV] Add test coverage for strided access patterns [nfc]	2023-03-30 10:02:27 -07:00
Florian Hahn	4173ed1382	[LV] Add test cases for global struct dereferencability. Currently LLVM fails to determine that conditional loads in @accesses_to_struct_dereferenceable are dereferenceable unconditionally.	2023-03-29 17:47:41 +01:00
Paul Osmialowski	6b6f312cce	[TLI][AArch64] Extend SLEEF vectorized functions mapping with VLA functions This commit extends D134719 "[AArch64] Enable libm vectorized functions via SLEEF" with the mappings for the scalable functions. It also introduces all the necessary changes needed to support masked interfaces. Reviewed By: danielkiss, sdesmalen Differential Revision: https://reviews.llvm.org/D146839	2023-03-29 13:07:09 +01:00
Paul Osmialowski	f8f1909d36	Revert "[TLI][AArch64] Extend SLEEF vectorized functions mapping with VLA functions" Reverting it so I could land it with Arcanist. This reverts commit 59dcf927ee43e995374907b6846b657f68d7ea49.	2023-03-29 12:54:22 +01:00
Paul Osmialowski	59dcf927ee	[TLI][AArch64] Extend SLEEF vectorized functions mapping with VLA functions This commit extends D134719 "[AArch64] Enable libm vectorized functions via SLEEF" with the mappings for the scalable functions. It also introduces all the necessary changes needed to support masked interfaces. Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>	2023-03-29 11:07:35 +01:00
Graham Hunter	fba2a7c695	[LV][AArch64] Precommit interleaved access tests Precommit for D145163	2023-03-29 10:26:14 +01:00
Philip Reames	64f69e453e	[RISCV] Cost model for general case of single vector permute The cost model was not accounting for the fact that we can generate vrgather + an index expression. Two cases to call out. 1) I did not model the difference between vrgather and vrgatherei16. The result is the constant pool cost can be slightly understated on RV32. I don't think we care, but if someone disagrees, this would be easy to add. 2) Our current codegen for i8 vectors longer than 256 (which is the limit of what this costs) has some room for improvement. Differential Revision: https://reviews.llvm.org/D147000	2023-03-28 07:34:11 -07:00
David Sherwood	636efd2e35	[SVE][LoopVectorize] Add option to disable tail-folding for reverse loops If we use tail-folding for reverse loops that contain loads and stores then we will need to reverse the loop predicate. This patch adds a new 'reverse' sve-tail-folding option and ensures they are not considered 'simple'. I did this by adding a function called containsDecreasingPointers to AArch64TargetTransformInfo.cpp that searches all instructions in the loop for loads or stores with negative strides. Differential Revision: https://reviews.llvm.org/D146128	2023-03-27 14:10:15 +00:00
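
The cost comparison described in commit 1869a9c225 above can be sketched in a few lines of C++. This is only an illustration of the formula quoted in that commit message, `VecCost * (TripCount / VF) + ScalarCost * (TripCount % VF)`, and not the code in LoopVectorize. The function name `costForKnownTripCount` and all cost numbers are hypothetical; real costs come from the target's cost model.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): total cost of a loop with a known
// trip count at vectorization factor VF when the tail is NOT folded.
// VecCost is the assumed cost of one vector iteration, ScalarCost the
// assumed cost of one scalar (remainder) iteration.
constexpr std::uint64_t costForKnownTripCount(std::uint64_t VecCost,
                                              std::uint64_t ScalarCost,
                                              std::uint64_t TripCount,
                                              std::uint64_t VF) {
  // Full vector iterations, plus the leftover iterations run as scalar code.
  return VecCost * (TripCount / VF) + ScalarCost * (TripCount % VF);
}

// Made-up numbers: trip count 20, scalar iteration cost 4,
// VF4 vector iteration cost 6, VF8 vector iteration cost 10.
// VF4: 6 * 5 + 4 * 0 = 30
// VF8: 10 * 2 + 4 * 4 = 36
static_assert(costForKnownTripCount(6, 4, 20, 4) == 30, "VF4 total cost");
static_assert(costForKnownTripCount(10, 4, 20, 8) == 36, "VF8 total cost");
```

With these assumed numbers VF8 is cheaper per lane (10 / 8 versus 6 / 4), yet VF4 has the lower total cost because VF8 leaves four iterations to the scalar remainder. That is the kind of case the commit message says the extended comparison is meant to catch.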

2388 Commits