llvm-project

Author	SHA1	Message	Date
Alexey Bataev	641939baa9	[SLP]Remove CreateShuffle lambda and reuse ShuffleBuilder functions. After merging main part of the gather/buildvector code, CreateShuffle lambda can be removed and ShuffleBuilder add functions can be used instead. Also, part of the code from CreateShuffle migrated to createShuffle of the BaseShuffleAnalysis::createShuffle function for better code emission. Differential Revision: https://reviews.llvm.org/D145988	2023-03-14 10:15:41 -07:00
Alexey Bataev	874c49f554	[SLP]Fix PR61395: need to adjust vector factor after emitting shuffle operation for combined entries. The vector factor after combining of the shuffle entries is defined by the size of the mask, not by the vector factors of the original entries. So, need to adjust it to emit correct code.	2023-03-14 06:27:08 -07:00
Sjoerd Meijer	775451b66a	[AArch64] Cost-model vector splat LD1Rs to avoid unprofitable SLP vectorisation This slightly increases the costs of InsertElement instructions that are part of a vector splat sequence, i.e. a load, InsertElement and a shuffle (load + dup). The resulting LD1R is a high latency instruction, and this slight increase in costs avoids SLP vectorisation for a couple of cases where this isn't profitable. Fixes: https://github.com/llvm/llvm-project/issues/61047 Differential Revision: https://reviews.llvm.org/D145578	2023-03-13 14:52:09 +00:00
Alexey Bataev	93a9be0cea	[SLP]Initial support for reshuffling of non-starting buildvector/gather nodes. Previously only the very first gather/buildvector node might be probed for reshuffling of other nodes. But the compiler may do the same for other gather/buildvector nodes too, just need to check the dependency and postpone the emission of the dependent nodes, if the origin nodes were not emitted yet. Part of D110978 Differential Revision: https://reviews.llvm.org/D144958	2023-03-10 13:19:43 -08:00
Alexey Bataev	395c11f7b8	[SLP][NFC]Add a test with phi nodes in one tree node with different order of incoming basic blocks, NFC.	2023-03-10 12:19:08 -08:00
Alexey Bataev	d84e971f48	[SLP][NFC]Add a test with multilevel dependency between buildvector nodes, NFC.	2023-03-10 10:33:03 -08:00
Alexey Bataev	151d3b607e	[SLP][NFC]Update/simplify test to avoid dead code elimination.	2023-03-10 08:12:53 -08:00
Hans Wennborg	3b3a4c270b	Revert "[SLP]Initial support for reshuffling of non-starting buildvector/gather nodes." This caused verifier errors: Instruction does not dominate all uses! %8 = insertelement <2 x i64> %7, i64 %pgocount1330, i64 1 %15 = shufflevector <2 x i64> %8, <2 x i64> poison, <2 x i32> <i32 1, i32 1> in function ?NearestInclusiveAncestorAssignedToSlot@SlotScopedTraversal@blink@@SAPAVElement@2@ABV32@@Z (or register allocator crash when the verifier was disabled). See comment on the code review. > Previously only the very first gather/buildvector node might be probed for reshuffling of other nodes. > But the compiler may do the same for other gather/buildvector nodes too, just need to check the > dependency and postpone the emission of the dependent nodes, if the origin nodes were not emitted yet. > > Part of D110978 > > Differential Revision: https://reviews.llvm.org/D144958 This reverts commit a611b3f3059e4c3b9e7b914091c3edaef099fd5d. It also reverts 7a4061ae372b3262703ffeea3b64db89187db611 which depended on the above.	2023-03-10 14:40:12 +01:00
Ben Shi	013235a200	[RISCV][NFC] Add tests for SLP vectorization of math functions RISCV has "vfabs.v" and "vfsqrt.v" so math functions abs and sqrt can be SLP vectorized. But others exp/log/sin/asin/sinh/asinh/... can not. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D145562	2023-03-10 07:34:21 +08:00
Alexey Bataev	7a4061ae37	[SLP][NFC]Update/simplify test to avoid dead code elimination.	2023-03-08 13:49:25 -08:00
Alexey Bataev	a611b3f305	[SLP]Initial support for reshuffling of non-starting buildvector/gather nodes. Previously only the very first gather/buildvector node might be probed for reshuffling of other nodes. But the compiler may do the same for other gather/buildvector nodes too, just need to check the dependency and postpone the emission of the dependent nodes, if the origin nodes were not emitted yet. Part of D110978 Differential Revision: https://reviews.llvm.org/D144958	2023-03-07 12:45:40 -08:00
Alexey Bataev	c411965820	[SLP]Fix PR61224: Compiler hits infinite loop. IRBuilder in many cases is able to fold constant code automatically, but in some cases (for some intrinsics) it cannot do it. Need to perform manual calculation, if constant provided in these corner cases, to avoid infinite loop.	2023-03-06 13:46:41 -08:00
Alexey Bataev	6b9be26207	[SLP][NFC]Update the test to avoid dead code elimination, NFC.	2023-03-03 06:10:15 -08:00
Alexey Bataev	931bba2bc3	[SLP][NFC]Add a test with reused scalars in 3 tree nodes with different VF, NFC.	2023-03-02 10:50:03 -08:00
Alexey Bataev	4e4ad3ab0e	[SLP][NFC]Update the test to simplify and avoid dead instruction removal, NFC.	2023-03-01 06:35:56 -08:00
Alexey Bataev	1d6b5b66bb	[SLP]Fix PR61050: Assertion `I->use_empty() && "trying to erase instruction with users." When gathering the counter for the reused scalars, need to use reduced value, not the original reduced value. Same values counter is gathered for reduced values, not original ones.	2023-02-28 07:51:34 -08:00
Vasileios Porpodas	a700fb3d9b	[SLP] Fixes crash in BoUpSLP::isGatherShuffledEntry() Crash caused by: 708eb1b96d9a36f9c0182b7d53c492059778fa35 Differential Revision: https://reviews.llvm.org/D144895	2023-02-27 12:29:25 -08:00
Alexey Bataev	007177bdde	[SLP]Fix PR61018: Assertion `Mask[I] == UndefMaskElem && "Multiple uses of scalars."' failed. Need to check for the reused indices when checking if 2 insertelement instructions are from the same buildvector. If the indices are reused, better not to match buildvectors and consider them as different, otherwise need to track the order of insertelement operations.	2023-02-27 10:09:48 -08:00
Alexey Bataev	5f53e85f8a	[SLP]Fix a crash when trying to find reduced ops for the reduced value. Need to use original reduced value, not the one the compiler gets after reduction, it may be replaced by the extractelement instruction already.	2023-02-27 07:32:36 -08:00
Alexey Bataev	f1c8b72c13	[SLP]Improve handling gathers/buildvectors with undefs. If there is just one non-undef scalar in the buildvector/gather node, we try to put it to be the very first element, which is profitable in most cases. Do the preliminary estimation, if this is more profitable during graph rotation and do same for all elements, including extractelements. Differential Revision: https://reviews.llvm.org/D144689	2023-02-24 13:17:40 -08:00
Jonas Paulsson	1387a13e1d	[SLP] Check with target before vectorizing GEP Indices. The target hook prefersVectorizedAddressing() already exists to check with target if address computations should be vectorized, so it seems like this should be used in SLPVectorizer as well. Reviewed By: ABataev, RKSimon Differential Revision: https://reviews.llvm.org/D144128	2023-02-23 15:31:34 +01:00
Alexey Bataev	cbcdd747e8	[SLP]Do not swap not counted extractelements. No need to swap extractelements, which were not excluded from the list during cost analysis. It leads to incorrect cost calculation and makes vector code look more profitable than it actually is.	2023-02-21 13:16:51 -08:00
Alexey Bataev	677ea15e35	[NFC][SLP]Add a test for optimistic vectorization, NFC.	2023-02-21 11:02:32 -08:00
Alexey Bataev	5f928a223e	[SLP]Properly define incoming block for user PHI nodes. MainOp of the PHI vectorizable entries contains the proper order of incoming blocks, not the last instruction in the block.	2023-02-21 08:01:24 -08:00
Simon Pilgrim	2ca266dc1a	[SLP][X86] minimum-sizes.ll - add AVX512 test coverage As noticed on D144128, we need better AVX512 coverage for GEP vectorization	2023-02-20 23:31:56 +00:00
Simon Pilgrim	d9bceeedbf	[SLP][X86] load-merge.ll - add AVX512 test coverage As noticed on D144128, we need better AVX512 coverage for GEP vectorization	2023-02-20 23:21:33 +00:00
Ricardo Jesus	287267c23a	[AArch64] Add SLP test for abs (NFC) Differential Revision: https://reviews.llvm.org/D144376	2023-02-20 14:50:06 +00:00
Alexey Bataev	708eb1b96d	[SLP]Add shuffling of extractelements to avoid extra costs/data movement. If the scalar must be extracted and then used in the gather node, instead we can emit shuffle instruction to avoid those extra extractelements and vector-to-scalar and back data movement. Part of D110978 Differential Revision: https://reviews.llvm.org/D141940	2023-02-20 06:14:42 -08:00
Florian Hahn	f61c9b7569	[SLP] Fix infinite loop in isUndefVector. This fixes an infinite loop if isa<T>(II->getOperand(1)) is true. Update Base at the top of the loop, before the continue. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D144292	2023-02-19 21:42:24 +00:00
Alexey Bataev	e03d254bbd	[SLP]Do not reduce repeated values, use scalar red ops instead. Metric: size..text (results, results0, diff). SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-980605-1.test: 445.00, 461.00, 3.6%. SingleSource/Benchmarks/Adobe-C++/loop_unroll.test: 428477.00, 428445.00, -0.0%. External/SPEC/CFP2006/447.dealII/447.dealII.test: 618849.00, 618785.00, -0.0%. For all tests some extra code was optimized, GCC-C-execute has some more inlining after. Differential Revision: https://reviews.llvm.org/D132261	2023-02-17 07:19:35 -08:00
Alexey Bataev	9bdcf8778a	[SLP]Improve isGatherShuffledEntry by looking deeper through the reused scalars. The compiler may produce better results if it does not look for constants, uses an extra analysis of phi nodes, looks through all tree nodes without skipping the cases, where the very first set of nodes is empty. Also, it tries to reshuffle the nodes if it is profitable for sure, i.e. at least 2 scalars are used for single node permutation and at least 3 scalars are used for the permutation of 2 nodes. Part of D110978 Differential Revision: https://reviews.llvm.org/D141512	2023-01-19 13:46:25 -08:00
Valery N Dmitriev	d1fbe2ba6d	[SLP] Remove unused check label from test - NFC	2023-01-13 16:00:43 -08:00
Valery N Dmitriev	fd7273359a	[SLP] Do not ignore ordering for root node when it has in-tree uses. When rooted with PHIs, a vectorization tree may have another node with PHIs which have roots as their operands. We cannot ignore ordering information for root in such a case. Differential Revision: https://reviews.llvm.org/D141309	2023-01-10 10:12:51 -08:00
Alexey Bataev	7439e1b2de	[SLP]Fix incorrect reordering of clustered scalars. The new mask represents the order, not the mask itself. At first, need to treat as the order, convert to mask and only after that reorder gathered scalars to build correct clustered order. Differential Revision: https://reviews.llvm.org/D141161	2023-01-06 16:04:09 -08:00
Alexey Bataev	9b5f62685a	[SLP]Fix cost of the broadcast buildvector/gather. Need to include the cost of the initial insertelement to the cost of the broadcasts. Also, need to adjust the cost of the gather/buildvector if the element is inserted into poison/undef vector. Differential Revision: https://reviews.llvm.org/D140498	2023-01-06 09:25:05 -08:00
Valery N Dmitriev	6d677c0b3d	[SLP] Unify GEP cost modeling for load, store and GEP nodes. Make a separate routine for GEPs cost calculation and make the approach uniform across load, store and GEP tree nodes. Additional issue fixed is GEP cost savings were applied twice for ScatterVectorize nodes (aka gather load) making them look unrealistically profitable for vectorization. Differential Revision: https://reviews.llvm.org/D140789	2023-01-05 10:11:36 -08:00
Nikita Popov	b061159e79	[SLPVectorizer] Convert test to opaque pointers (NFC)	2023-01-05 12:32:44 +01:00
Alexey Bataev	a1b18946f9	[SLP]Fix incorrect shuffle results because of missing shuffle mask analysis. Missed the analysis of the shuffle mask when trying to analyze the operands of the shuffle instruction during peeking through shuffle instructions.	2023-01-04 13:10:40 -08:00
Alexey Bataev	352b660c1b	[SLP][NFC]Add a pass.	2023-01-04 10:30:48 -08:00
Alexey Bataev	53a858f7fc	[SLP][NFC]Add a test for incorrect skipping of shuffle instruction at peek-through-shuffles, NFC.	2023-01-04 10:17:03 -08:00
Nikita Popov	51ba34708d	[SLPVectorizer] Convert test to opaque pointers (NFC)	2023-01-04 16:39:51 +01:00
Nikita Popov	8383da1583	[SLPVectorizer] Name instructions in test (NFC)	2023-01-04 16:35:45 +01:00
Nikita Popov	a34ae06c20	[SLPVectorizer] Convert some tests to opaque pointers (NFC)	2023-01-04 16:34:39 +01:00
Dinar Temirbulatov	55c600819f	[SLP][AArch64] Incorrectly estimated intrinsic as a function call. We incorrectly treat an intrinsic as a function call, which costs us the opportunity to vectorize. On AArch64 Cortex-A53 we think that llvm.fmuladd.f64 is a function call, which is wrong. Differential Revision: https://reviews.llvm.org/D140392	2023-01-03 19:45:24 +00:00
Alexey Bataev	26fec4e845	[SLP]Fix crash on casting non-instruction extractelement. Need to check if the extractelement operation is an extraction before trying to move it around the buildblocks to avoid crash on cast.	2023-01-03 09:45:57 -08:00
Dinar Temirbulatov	3c205efe8b	[SLP][AArch64] Add fmuladd test coverage	2023-01-03 11:28:18 +00:00
Valery N Dmitriev	6bb4b2d002	[NFC] Test case intended to cover SLP cost for chain with masked gather loads. SLP produces two gather loads (one feeds another). For the first set of scalar loads GEP indices are all constant. The result of the second load is then fed into reduction (as a seed). Differential Revision: https://reviews.llvm.org/D140785	2022-12-30 12:27:34 -08:00
Alexey Bataev	5dccea5a68	[SLP]Do not emit many extractelements, reuse the single one emitted. We do not need to emit many extractelements for each particular use, we can reuse the only one, just need to adjust it to make it dominate all uses. Differential Revision: https://reviews.llvm.org/D140580	2022-12-30 06:38:06 -08:00
Alexey Bataev	ac01ae71f0	[SLP]Use ShuffleInstructionBuilder for vector shrinking. We can use ShuffleInstructionBuilder now for shrinking shuffle emission. It allows to remove extra shuffle from the emitted code and reuse original vector. Part of D110978 Differential Revision: https://reviews.llvm.org/D140499	2022-12-28 06:09:04 -08:00
Alexey Bataev	a9b052e2ef	[SLP]Fix PR59693: Do not crash trying to set insert point for buildvector of extractvalues. No need to get the last instruction only for vectorized extractvalues, for gathered(buildvector sequence) still need to get the insertion point.	2022-12-27 06:01:38 -08:00

1327 Commits