llvm-project

Author	SHA1	Message	Date
Florian Hahn	e775efcec4	[LV] Apply loop guards when checking recur during hoisting RT checks. Apply loop guards when checking if the recurrence is non-negative in cases where runtime checks are hoisted out of an inner loop.	2024-06-04 20:37:46 +01:00
Florian Hahn	164597616c	[LV] Add test for RT check hoisting where loop guards simplify check. Add a test case with a missed simplification when hoisting runtime checks due to not applying loop guards.	2024-06-04 09:32:22 +01:00
Ramkumar Ramachandra	59cb55d384	VPlan: add missing case for LogicalAnd; fix crash (#93553 ) VPTypeAnalysis::inferScalarTypeForRecipe is missing the case for VPInstruction::LogicalAnd, due to which the test vplan-incomplete-cases.ll crashes. Add this missing case, and move the test in vplan-infer-not-or-type.ll to vplan-incomplete-cases.ll, showing correct codegen for trip-counts 2 and 3.	2024-06-04 08:58:16 +01:00
Florian Hahn	07b330132c	[VPlan] Model FOR extract of exit value in VPlan. (#93395 ) This patch introduces a new ExtractFromEnd VPInstruction opcode to extract the value of a FOR for users outside the loop (i.e. in the scalar loop's exits). This moves the first part of fixing first order recurrences to VPlan, and removes some additional code to patch up live-outs, which is now handled automatically. The majority of test changes is due to changes in the order of which the extracts are generated now. As we are now using VPTransformState to generate the extracts, we may be able to re-use existing extracts in the loop body in some cases. For scalable vectors, in some cases we now have to compute the runtime VF twice, as each extract is now independent, but those should be trivial to clean up for later passes (and in line with other places in the code that also liberally re-compute runtime VFs). PR: https://github.com/llvm/llvm-project/pull/93395	2024-06-03 20:20:30 +01:00
Florian Hahn	f7e63e8b46	[LV] Operands feeding pointers of interleave member pointers are free. For interleave groups we only create a pointer for the start of the interleave group, not all original loads/stores. Mark single-use ops feeding interleave group mem ops as free when vectorizing.	2024-06-01 13:59:29 +01:00
Florian Hahn	4c6367b3e5	[LV] Add test with strided interleave groups and maximizing bandwidth.	2024-06-01 12:26:00 +01:00
Florian Hahn	f38d84ce32	[VPlan] Use ir-bb prefix for VPIRBasicBlock. Follow-up to adjust the names and tests after https://github.com/llvm/llvm-project/pull/93398.	2024-05-30 17:43:40 -07:00
Ramkumar Ramachandra	43100766f2	LV: generalize profitability criterion over TC (#93300 ) Generalize LoopVectorizationPlanner::isMoreProfitable smoothly across the fixed-vector and scalable-vector cases, taking the trip-count into account, and fixing logical pitfalls that arise from a lack of generality.	2024-05-30 10:54:32 +01:00
Florian Hahn	8b037862b6	[VPlan] Preserve DT (and SCEV) in VPlan-native path (#93287 ) As a follow-up to b2f65e80, use the DTU to also update and preserve the DT in the native path. This should also allow preserving SCEV in the native path. PR: https://github.com/llvm/llvm-project/pull/93287	2024-05-27 17:03:53 -07:00
Florian Hahn	bb4c8f9219	[SCEV] Don't add predicates already implied by UnionPredicate. (#93397 ) Update SCEVUnionPredicate::add to only add predicates from another union predicate if they aren't already implied by the union predicate we add them to. Note that there exists logic elsewhere to avoid adding predicates if they are already implied, but this logic misses cases when only some predicates of a union predicate are implied by the current set of predicates. PR: https://github.com/llvm/llvm-project/pull/93397	2024-05-26 18:31:36 -07:00
Florian Hahn	686600b521	[LV] Add test showing missed removal of implied predicate. Tests for https://github.com/llvm/llvm-project/pull/93397	2024-05-26 17:23:14 -07:00
Florian Hahn	ac17fbc076	[VPlan] Add test for printing FOR with live-out. Add additional test coverage for printing VPlans with a first-order recurrence with its result used outside the loop.	2024-05-25 21:25:57 -07:00
Shih-Po Hung	0338c55ea5	[LV, VPlan] Check if plan is compatible to EVL transform (#92092 ) The transform updates all users of inductions to work based on EVL, instead of the VF directly. At the moment, widened inductions cannot be updated, so bail out if the plan contains any. This patch introduces a check before applying EVL transform. If any recipes in loop rely on RuntimeVF, the plan is discarded.	2024-05-25 08:22:49 +08:00
Ramkumar Ramachandra	bb0d29a72d	[LV] fix logical error in trunc cost (#91136 ) In LoopVectorizationCostModel::getInstructionCost(), when the condition canTruncateToMinimalBitwidth() is satisfied, for a trunc, the source type is computed as the smallest type of the source vector and the destination vector, and the destination type is computed as the largest type of the instruction and destination type. This is clearly a logical error, as the original source vector type could be smaller than the original destination vector type, and the trunc semantics are broken because we're attempting to widen. Fixes #47665.	2024-05-24 18:01:58 +01:00
Shih-Po Hung	b008a2d12a	[LV][NFC] precommit test for EVL transform (#92203 ) A precommit test case to show vector loops generated from EVL transform - This is a precommit test for https://github.com/llvm/llvm-project/pull/92092	2024-05-24 23:21:59 +08:00
Ramkumar Ramachandra	dc148c9fb8	[LV] add test for #47665 , #88802 (#91135 )	2024-05-24 10:50:43 +01:00
Freddy Ye	4def1ce101	Reland "[X86] Remove knl/knm specific ISAs supports (#92883 )" (#93136 ) This reverts commit aa4069ea96e5eb62bc8c7895b9d920f129611b3a.	2024-05-24 13:46:34 +08:00
David Green	46541a3636	[ARM] Add an extra MVE low-trip-count loop. NFC This makes use of half floats, which makes the masked stores expensive.	2024-05-23 21:50:47 +01:00
Freddy Ye	aa4069ea96	Revert "[X86] Remove knl/knm specific ISAs supports (#92883 )" (#93123 ) This reverts commit 282d2ab58f56c89510f810a43d4569824a90c538.	2024-05-23 10:25:23 +08:00
Freddy Ye	282d2ab58f	[X86] Remove knl/knm specific ISAs supports (#92883 ) Cont. patch after https://github.com/llvm/llvm-project/pull/75580	2024-05-23 09:46:44 +08:00
Simon Pilgrim	0873b4ca29	[LoopVectorize] optimal-epilog-vectorization-profitability.ll - fix LABLE -> LABEL typo Typo identified in #91854	2024-05-22 11:07:24 +01:00
Florian Hahn	a56e6dfd2e	[LV] Add test for header mask and invariant compare cost-modeling. Additional test coverage for the VPlan-based cost model work.	2024-05-22 09:57:35 +01:00
Sander de Smalen	1015f51dd9	[AArch64] NFC: Rename -force-streaming-compatible-sve to -force-streaming-compatible (#92774 ) The behaviour of the flag should be equivalent to __arm_streaming_compatible. At the moment, the name suggests that '-force-streaming-compatible-sve' on its own (i.e. without specifying `+sve`) enables the compiler to use the streaming-compatible subset of SVE instructions, but the semantics merely are that the function can be called with either PSTATE.SM=0 or PSTATE.SM=1.	2024-05-22 07:58:54 +01:00
Florian Hahn	352dc7d4bb	[LV] Propagate PredicatedBBsAfterVectorization to predecessors. This fixes some cases where predicated BBs were missed previously, leading to under-estimating the cost of those blocks.	2024-05-21 10:27:32 +01:00
hev	1e86e92428	[LoongArch] Enable interleaved vectorization (#92629 ) This PR enables interleaved vectorization for LoongArch, with a default interleaving factor of `2`.	2024-05-21 15:31:02 +08:00
Florian Hahn	82c5d350d2	[VPlan] Add commutative binary OR matcher, use in transform. (#92539 ) Split off from https://github.com/llvm/llvm-project/pull/89386, this extends the binary matcher to support matching commutative operations. This is used for a new m_c_BinaryOr matcher, used in simplifyRecipe. PR: https://github.com/llvm/llvm-project/pull/92539	2024-05-20 13:03:48 +01:00
Nikita Popov	8e8d2595da	[ConstantFolding] Canonicalize constexpr GEPs to i8 (#89872 ) This patch canonicalizes constant expression GEPs to use i8 source element type, aka ptradd. This is the ConstantFolding equivalent of the InstCombine canonicalization introduced in #68882. I believe all our optimizations working on constant expression GEPs (like GlobalOpt etc) have already been switched to work on offsets, so I don't expect any significant fallout from this change. This is part of: https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699	2024-05-20 11:47:30 +02:00
Florian Hahn	b050048d35	[VPlan] Simplify (X && Y) || (X && !Y) -> X. (#89386 ) Simplify a common pattern generated for masks when folding the tail. PR: https://github.com/llvm/llvm-project/pull/89386	2024-05-19 15:45:23 +00:00
Florian Hahn	1e7d047c71	[VPlan] Mark LoopInfo preserved in native-path as well (NFC). LoopInfo is updated during VPlan execution now, so it will also be updated correctly in the native path.	2024-05-17 12:18:01 +01:00
Craig Topper	487b43cdc9	[RISCV] Pass subvector type to isLegalInterleavedAccessType in getInterleavedMemoryOpCost. (#91825 ) isLegalInterleavedAccessType expects the subvector type, but getInterleavedMemoryOpCost is called with the full vector type. So we need to divide by Factor.	2024-05-15 21:47:29 -07:00
Pietro Ghiglio	83d9aa2768	[VPlan] Add scalar inferencing support for addrspace cast (#92107 ) Fixes https://github.com/llvm/llvm-project/issues/91434 PR: https://github.com/llvm/llvm-project/pull/92107	2024-05-15 14:03:21 +01:00
Florian Hahn	b0a1ae2cca	[LV] Add additional variants of tests with udiv/urem/sdiv/srem in TC. Add additional tests with udiv/urem/sdiv/srem in trip counts, where the divisor is constant. For https://github.com/llvm/llvm-project/pull/92177.	2024-05-15 11:17:23 +01:00
Florian Hahn	d187005cad	[VPlan] Update VPBlendRecipe codegen for first-lane only. Update VPBlendRecipe::execute to support generating code for first-lane only. This fixes a crash in the newly added test @test_not_first_lane_only_wide_compare_incoming_order_swapped.	2024-05-15 11:00:15 +01:00
Florian Hahn	cf5db39907	[LV] Add tests with trip counts containing UDIV expressions. Add test cases for https://github.com/llvm/llvm-project/issues/89958.	2024-05-14 20:28:27 +01:00
Florian Hahn	67d840b60f	[VPlan] Relax over-aggressive assertion in VPTransformState::get(). There are cases where a vector value has some users that demand the single scalar value only (NeedsScalar), while other users demand the vector value (see attached test cases). In those cases, the NeedsScalar users should only demand the first lane. Fixes https://github.com/llvm/llvm-project/issues/91883.	2024-05-14 19:10:49 +01:00
Florian Hahn	632317e9ab	[VPlan] Add non-poison propagating LogicalAnd VPInstruction opcode. (#91897 ) Add a new opcode to model non-poison propagating logical AND operations used when generating edge masks. This follows the similar decision to model Not as dedicated opcode as well, to improve clarity. This also helps to simplify the matchers for https://github.com/llvm/llvm-project/pull/89386. PR: https://github.com/llvm/llvm-project/pull/91897	2024-05-14 09:42:49 +01:00
Fangrui Song	ef9090fcb5	[test] Fix check prefixes	2024-05-13 14:01:00 -07:00
Simon Pilgrim	079fdef7d2	[TTI] getCommonMaskedMemoryOpCost - use the target getMemoryOpCost/getCFInstrCost implementations. We were using the default implementations instead of the CRTP versions.	2024-05-11 12:50:26 +01:00
Florian Hahn	082c81ae4a	[LV] Properly extend versioned constant strides. We only version unknown strides to 1. If the original type is i1, then the sign of the extension matters. Properly extend the stride value before replacing it. Fixes https://github.com/llvm/llvm-project/issues/91369.	2024-05-07 21:31:42 +01:00
Florian Hahn	c76ccf0f1e	[LV] Add test case for #91369 . Add tests for https://github.com/llvm/llvm-project/issues/91369.	2024-05-07 20:41:55 +01:00
Florian Hahn	b54a78d69b	[LV,LAA] Don't vectorize loops with load and store to invar address. Code checking stores to invariant addresses and reductions made an incorrect assumption that the case of both a load & store to the same invariant address does not need to be handled. In some cases when vectorizing with runtime checks, there may be dependences with a load and store to the same address, storing a reduction value. Update LAA to separately track if there was a store-store and a load-store dependence with an invariant address. Bail out early if there was a load-store dependence with invariant address. If there was a store-store one, still apply the logic checking if they all store a reduction.	2024-05-04 20:53:54 +01:00
Florian Hahn	401ecb4ccc	[LV] Add test showing miscompile with store reductions and RT checks. Add a new test showing how a loop gets vectorized incorrectly with an invariant store reduction where the same location is also read, when vectorizing with runtime checks.	2024-05-03 18:54:00 +01:00
Mel Chen	3f1fef3699	[RISCV] Support interleaved accesses for scalable vector. (#90583 ) The support for interleaved accesses for scalable vector with a factor of 2 is enabled in vectorizer. Therefore, the patch removed the restriction for scalable vector with a factor of 2.	2024-05-03 21:56:31 +08:00
Florian Hahn	bccb7ed8ac	Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )" This reverts the revert commit c6e01627acf859. This patch includes a fix for any-of reductions and epilogue vectorization. Extra test coverage for the issue that caused the revert has been added in bce3bfced5fe0b019 and an assertion has been added in c7209cbb8be7a3c65813. -------------------------------- Original commit message: Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-05-03 14:40:49 +01:00
Alexey Bataev	1d43cdc9f5	[LV][EVL]Support reversed loads/stores. Support for predicated vector reverse intrinsic was added some time ago. Adds support for predicated reversed loads/stores in the loop vectorizer. Reviewers: fhahn Reviewed By: fhahn Pull Request: https://github.com/llvm/llvm-project/pull/88025	2024-05-03 07:28:56 -04:00
Florian Hahn	bce3bfced5	[LV] Add another epilogue test with an AnyOfReduction of i1. Additional test case from https://github.com/llvm/llvm-project/pull/78304.	2024-05-02 21:00:40 +01:00
Florian Hahn	9c3f5fe88f	[LV] Don't consider the latch block as ScalarPredicatedBB. The conditional branch from the loop latch will be replaced by a single branch controlling the loop, so there is no extra overhead from scalarization. This improves the cost estimates in some cases.	2024-04-29 19:15:46 +01:00
David Green	d486a4c29a	[ARM] Ensure extra uses are not dead in tail-folding-counting-down.ll. NFC This might help keep the test valid if vplan is removing dead instructions.	2024-04-29 15:47:24 +01:00
Maciej Gabka	bfc0317153	Move several vector intrinsics out of experimental namespace (#88748 ) This patch is moving out following intrinsics: * vector.interleave2/deinterleave2 * vector.reverse * vector.splice from the experimental namespace. All these intrinsics exist in LLVM for more than a year now, and are widely used, so should not be considered as experimental.	2024-04-29 10:16:45 +01:00
Florian Hahn	b6a8f5486b	[LV] Consider all exit branch conditions uniform. If we vectorize a loop with multiple exits, all exiting branches should be considered uniform, as the resulting loop will be controlled by the canonical IV only. Previously we were overestimating the cost of values contributing to the other exits.	2024-04-28 13:15:55 +01:00

2458 Commits