llvm-project

Author	SHA1	Message	Date
Graham Hunter	b070629c10	[LV] Increase max VF if vectorized function variants exist (#66639) If there are function calls in the candidate loop and we have vectorized variants available, try some wider VFs in case the conservative initial maximum based on the widest types in the loop won't actually allow us to make use of those function variants.	2023-11-13 10:27:10 +00:00
Florian Hahn	34c2dcd5ac	[VPlan] Move initial skeleton construction to createInitialVPlan. (NFC) This patch moves creating the middle VPBBs and an initial empty vector loop region for the top-level loop to createInitialVPlan. This consolidates code to create the initial VPlan skeleton and enables adding other bits outside the main region during initial VPlan construction. In particular, D150398 will add the exit check & branch to the middle block. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158333	2023-11-12 13:00:44 +00:00
Michael Maitland	acef83c142	[VectorCombine] Fix crash in scalarizeVPIntrinsic (#72039) When getSplatOp returns nullptr, the intrinsic cannot be scalarized. This patch includes a test case that fixes a crash from trying to scalarize the VPIntrinsic when getSplatOp returns nullptr. This fixes https://github.com/llvm/llvm-project/issues/72034.	2023-11-11 19:54:15 -05:00
Florian Hahn	ed6f4994d8	[VPlan] Handle conditional ordered reductions with scalar VFs. VPReductionRecipe::execute was not handling predicates for ordered reduction with scalar VFs, which was causing a crash. This patch adds dedicated handling for scalar VFs when dealing with the condition. The other operands are already handled in a similar fashion below. Fixes #70988.	2023-11-11 12:55:40 +00:00
Tom Stellard	2400c54c37	[Vectorize] Remove Transforms/Vectorize.h (#71294) The only thing in this file is a declaration for createLoadStoreVectorizerPass(), and this function is already declared in LoadStoreVectorizer.h.	2023-11-06 14:04:22 -08:00
Simon Pilgrim	3ca4fe80d4	[Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC. startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)	2023-11-06 16:50:18 +00:00
Florian Hahn	a002271972	[VPlan] Add VPValue::replaceUsesWithIf (NFCI). Add replaceUsesWithIf helper and use it in a few places.	2023-11-06 16:08:22 +00:00
Alexey Bataev	ac254fc055	[SLP]Improve tryToGatherExtractElements by using per-register analysis. Currently tryToGatherExtractElements function analyzes the whole vector, regardless of the number of actual registers used in this vector. It may prevent some optimizations, because per-register analysis may allow to simplify the final code by reusing more already emitted vectors and better shuffles. Differential Revision: https://reviews.llvm.org/D148855	2023-11-06 07:29:27 -08:00
Hans Wennborg	046c57e705	Revert "[SLP]Improve tryToGatherExtractElements by using per-register analysis." This causes asserts: llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:10082: Value llvm::slpvectorizer::BoUpSLP::ShuffleInstructionBuilder::adjustExtracts( const TreeEntry , MutableArrayRef<int>, unsigned int, bool &): Assertion `Part == 0 && "Expected firs part."' failed. See comment on the code review. > Currently tryToGatherExtractElements function analyzes the whole vector, > regardless of the number of actual registers used in this vector. It may > prevent some optimizations, because per-register analysis may allow to > simplify the final code by reusing more already emitted vectors and > better shuffles. > > Differential Revision: https://reviews.llvm.org/D148855 This reverts commit 9dfdbd788707edc8c39eb2bff16004aba1f3586b.	2023-11-06 13:56:42 +01:00
Alexey Bataev	9dfdbd7887	[SLP]Improve tryToGatherExtractElements by using per-register analysis. Currently tryToGatherExtractElements function analyzes the whole vector, regardless of the number of actual registers used in this vector. It may prevent some optimizations, because per-register analysis may allow to simplify the final code by reusing more already emitted vectors and better shuffles. Differential Revision: https://reviews.llvm.org/D148855	2023-11-03 10:43:58 -07:00
Florian Hahn	fd82b5b287	[LV] Support recipes without underlying instr in collectPoisonGenRec. Support recipes without underlying instruction in collectPoisonGeneratingRecipes by directly trying to dyn_cast_or_null the underlying value. Fixes https://github.com/llvm/llvm-project/issues/70590.	2023-11-03 10:21:14 +00:00
Martin Storsjö	66152f4eed	Revert "[SLP]Improve tryToGatherExtractElements by using per-register analysis." This reverts commit 3e6d7c6d983dd5896e3a03857584654eb1360fda. That commit caused miscompilation of ffmpeg's libavcodec/vp9dsp_8bpp.o on aarch64; the file still compiles correctly, but no longer produces the right result - see https://reviews.llvm.org/D148855#4655968 for details.	2023-11-03 00:08:17 +02:00
Alexey Bataev	3026c13612	[SLP][NFC]Remove commented out code, NFC.	2023-11-02 11:06:56 -07:00
Alexey Bataev	495ed8d8c8	[SLP]Fix PR70507: freeze poisonous insts to avoid poison propagation. If the reduction instruction is not bool logical op, but reduced within bool logical op reduction list, need to freeze to avoid poison propagation.	2023-11-02 10:37:38 -07:00
Alexey Bataev	3e6d7c6d98	[SLP]Improve tryToGatherExtractElements by using per-register analysis. Currently tryToGatherExtractElements function analyzes the whole vector, regardless of the number of actual registers used in this vector. It may prevent some optimizations, because per-register analysis may allow to simplify the final code by reusing more already emitted vectors and better shuffles. Differential Revision: https://reviews.llvm.org/D148855	2023-11-01 10:42:35 -07:00
Alexey Bataev	6e8d957a22	Revert "[SLP]Improve tryToGatherExtractElements by using per-register analysis." This reverts commit 0a34aaedd8ec2dc2375076976c1327fdbfd7877f to fix fails reported in https://lab.llvm.org/buildbot/#/builders/265/builds/40	2023-11-01 08:52:31 -07:00
Alexey Bataev	c28b7eb496	[SLP]Fix handling of -slp-vectorize-hor-store for values with many uses.	2023-11-01 08:41:54 -07:00
Alexey Bataev	0a34aaedd8	[SLP]Improve tryToGatherExtractElements by using per-register analysis. Currently tryToGatherExtractElements function analyzes the whole vector, regardless of the number of actual registers used in this vector. It may prevent some optimizations, because per-register analysis may allow to simplify the final code by reusing more already emitted vectors and better shuffles. Differential Revision: https://reviews.llvm.org/D148855	2023-11-01 07:44:49 -07:00
Nikita Popov	79bd1d8a95	[SLPVectorizer] Avoid use of ConstantExpr::getIntegerCast() (NFC) We are working on a ConstantInt here, so folding will always succeed. This just avoids use of the ConstantExpr API.	2023-11-01 12:25:55 +01:00
Alexey Bataev	4c997e1536	[SLP]Fix PR70507: emit freeze whenever required for bool logical ops in the middle of reduction ops. Need to emit freeze instruction not only in the case, where the root is bool logical op, but also if we reduce several scalars, but unable to say precisely, if the root is bool logical op.	2023-10-31 12:23:12 -07:00
Nikita Popov	6a06155c53	[VectorCombine] Discard ScalarizationResults if transform aborted Fixes https://github.com/llvm/llvm-project/issues/69820.	2023-10-31 11:24:30 +01:00
Alexey Bataev	9da19e4340	[SLP]Fix PR70507: correctly handle bool logical ops in reductions. If the very first reduction operation is not bool logical op, but some others are, still need to emit the bool logic op for all the extra reduction operations to avoid incorrect poison propagation.	2023-10-30 14:09:08 -07:00
Alexey Bataev	af15c46777	[SLP]Do not crash if number of vector registers does not fit the vector type. Need to check, if the number of vector registers, returned by TTI, is not greater than total number of mask elements and not zero, before trying to perform any operations. TTI still may return non-valid number of registers.	2023-10-30 07:30:52 -07:00
Igor Kirillov	70904226e1	[LoopVectorize] Enhance Vectorization decisions for predicate tail-folded loops with low trip counts (#69588) * Avoid using `CM_ScalarEpilogueNotAllowedLowTripLoop` for loops known to be predicate tail-folded, delegating to `areRuntimeChecksProfitable` to decide on the profitability of vectorizing loops with runtime checks. * Update the `areRuntimeChecksProfitable` function to consider the `ScalarEpilogueLowering` setting when assessing vectorization of a loop. With this patch, we can make more informed decisions for loops with low trip counts, especially when leveraging Profile-Guided Optimization (PGO) data.	2023-10-30 13:43:26 +00:00
Florian Hahn	b0b88643a1	[VPlan] Add initial analysis to infer scalar type of VPValues. (#69013) This patch adds initial type inference for VPValues. It infers the scalar type of a VPValue, by bottom-up traversing through defining recipes until root nodes with known types are reached (e.g. live-ins or load recipes). The types are then propagated top down through operations. This is intended as building block for a VPlan-based cost model, which will need access to type information for VPValues/recipes. Initial testing is done by asserting the inferred type matches the type of the result value generated for widen and replicate recipes.	2023-10-27 14:38:28 +01:00
Florian Hahn	cff6652129	[VPlan] Handle VPValues without underlying values in getTypeForVPValue. Fixes a crash after 0c8e5be6fa08. Full type inference will be added in https://github.com/llvm/llvm-project/pull/69013	2023-10-27 13:34:54 +01:00
Alexey Bataev	196d154ab7	[SLP]Improve isGatherShuffledEntry by trying per-register shuffle. Currently when building gather/buildvector node, we try to build nodes shuffles without taking into account separate vector registers. We can improve final codegen and the whole vectorization process by including this info into the analysis and the vector code emission, which allows emitting better vectorized code. Differential Revision: https://reviews.llvm.org/D149742	2023-10-26 08:51:37 -07:00
Alexey Bataev	c65ec9d919	Revert "[SLP]Improve isGatherShuffledEntry by trying per-register shuffle." This reverts commit 560bad013ebcb8d2c2c1722e35270b9a70ab40ce to fix a bug reported in https://lab.llvm.org/buildbot/#/builders/5/builds/37763.	2023-10-26 08:36:50 -07:00
Alexey Bataev	560bad013e	[SLP]Improve isGatherShuffledEntry by trying per-register shuffle. Currently when building gather/buildvector node, we try to build nodes shuffles without taking into account separate vector registers. We can improve final codegen and the whole vectorization process by including this info into the analysis and the vector code emission, which allows emitting better vectorized code. Differential Revision: https://reviews.llvm.org/D149742	2023-10-26 05:57:03 -07:00
Kazu Hirata	f9306f6de3	[ADT] Rename llvm::erase_value to llvm::erase (NFC) (#70156) C++20 comes with std::erase to erase a value from std::vector. This patch renames llvm::erase_value to llvm::erase for consistency with C++20. We could make llvm::erase more similar to std::erase by having it return the number of elements removed, but I'm not doing that for now because nobody seems to care about that in our code base. Since there are only 50 occurrences of erase_value in our code base, this patch replaces all of them with llvm::erase and deprecates llvm::erase_value.	2023-10-24 23:03:13 -07:00
Valery Dmitriev	3324776d9c	[SLP] Improve gather tree nodes matching when users are PHIs. (#70111) This is re-commit of #69392 and also fixes issue #69670 which was uncovered with the prior commit. For delayed gather emission it may be incorrect to use stub instruction as insertion point if it is a PHI operand. For that case insertion point is adjusted to be at the end of block, ensuring that prior dependency vector code is emitted earlier.	2023-10-24 16:39:36 -07:00
Alexey Bataev	a3c68754b0	[SLP][NFC]Remove unused variables, NFC.	2023-10-24 14:36:33 -07:00
Alexey Bataev	d79051f894	[SLP]Fix PR70004: Do not change insert point for reduction gather nodes. No need to change the insert point for reduction gather node, we can use the ReductionRoot as insert point instead to avoid possible crashes.	2023-10-24 09:19:59 -07:00
Alexey Bataev	8d307f59ee	[SLP]Fix PR69246: do not treat resizing mask as identity. If the mask is resizing and the mask size is greater than the length of the vector, being reused from extractelement instructions, the mask for undefs cannot be treated as identity, must be treated as a broadcast.	2023-10-24 08:14:13 -07:00
Nabeel Omer	8e31acf8ca	[VectorCombine] Add special handling for truncating shuffles (#70013) When dealing with a truncating shuffle, we can end up in a situation where the type passed to getShuffleCost is the type of the result of the shuffle, and the mask references an element which is out of bounds of the result vector. If dealing with truncating shuffles, pass the type of the input vectors to `getShuffleCost()` in order to avoid an out-of-bounds assertion.	2023-10-24 15:03:43 +01:00
Alexey Bataev	254558ac53	[SLP]Fix PR69976: Check for multi-node uses during node building. Need to check if there is already a node created for the multi-node instruction before ending up with creating a new node for such instructions.	2023-10-24 07:01:46 -07:00
Hans Wennborg	e2fc68c3db	Typos: 'maxium', 'minium'	2023-10-23 10:42:28 +02:00
Kazu Hirata	3af0ff99b1	[llvm] Stop including llvm/ADT/DepthFirstIterator.h (NFC) Identified with misc-include-cleaner.	2023-10-22 12:15:46 -07:00
Florian Hahn	4f56d47d05	[VPlan] Make ExpandedSCEVs argument const (NFC). The argument is only used in const contexts. Simplifies a follow-up diff.	2023-10-22 12:31:55 +01:00
Florian Hahn	0c8e5be6fa	[VPlan] Simplify redundant trunc (zext A) pairs to A. Add simplification for redundant trunc(zext A) pairs. Generally apply a transform from D149903. Depends on D159200. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D159202	2023-10-22 11:41:38 +01:00
Florian Hahn	ca01f2af78	[LV] Enforce order of reductions with intermediate stores in VPlan (NFC) Reductions with intermediate stores currently need to be fixed in order of their intermediate stores. Instead of doing this at fixup time after code has been generated, sort the reductions in adjustRecipesForReductions. This makes the order explicit in VPlan and will enable removing fixReductions with modeling computing the final reduction result in VPlan, followed by also modeling the intermediate stores explicitly.	2023-10-21 21:26:52 +01:00
Lou Knauer	852bac4439	[VPlan] Support scalable vectors in outer-loop vectorization This patch enables scalable vectors in the VPlan-native path. If a vectorization factor is specified via loop vectorization hints, that factor is used. If no vectorization factor is specified, but the target prefers scalable vectorization, a scalable vectorization factor is selected. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D157484	2023-10-20 23:17:35 +01:00
Douglas Yung	734b016b66	Revert "[SLP] Improve gather tree nodes matching when users are PHIs. (#69392)" This reverts commit c80b50349648dcf7fcbf4ae69c62b3d34bee0c70. This change causes a fatal error in the backend and is filed as issue #69670.	2023-10-20 10:59:07 -07:00
Florian Hahn	2ec7bba77b	Recommit "[VPlan] Insert Trunc/Exts for reductions directly in VPlan." This reverts commit e4ea0997486000b460c4875a00301b73b3c0d6a7. The recommit fixes a reported crash by adding a missing check to make sure the cast recipes are only introduced when vectorizing. Test coverage added in 3cac608fbd0811b2f5c59c6e13148162ccd8543e. Original commit message: Update the code to create Trunc/Ext recipes directly in adjustRecipesForReductions instead of fixing it up later in fixReductions. This explicitly models the required conversions and also makes sure they are generated at the right place (instead of after the exit condition), hence the changes in a few tests.	2023-10-20 14:30:04 +01:00
Luke Lau	c35939b22e	[VectorCombine] Use isSafeToSpeculativelyExecute to guard VP scalarization (#69494) Previously we were just matching against a fixed list of VP intrinsics that we knew couldn't be speculated, but we can reuse the logic in isSafeToSpeculativelyExecuteWithOpcode. This also allows speculation in more cases, e.g. when the divisor is known to be non-zero. Unfortunately we can't reuse the exact same function call for VP intrinsics with functional intrinsics instead of opcodes, because isSafeToSpeculativelyExecute needs an instruction that already exists. So this just copies the logic by peeking into the function attributes of the intrinsic.	2023-10-19 12:45:21 -04:00
Fangrui Song	e4ea099748	Revert "[VPlan] Insert Trunc/Exts for reductions directly in VPlan." This reverts commit fd311126349b8fe1684d62154a9fa5a7bbb0b713. There are two different crash reports on `fd31112634`	2023-10-18 23:25:31 -07:00
Alexey Bataev	4a06332e45	[SLP][NFC]Use MutableArrayRef instead of SmallVectorImpl&, rename function, NFC.	2023-10-18 13:09:20 -07:00
Alexey Bataev	3ef271c3d6	[SLP][NFC]Use MutableArrayRef instead of SmallVectorImpl& in param, NFC.	2023-10-18 09:47:07 -07:00
Valery Dmitriev	c80b503496	[SLP] Improve gather tree nodes matching when users are PHIs. (#69392)	2023-10-18 09:05:11 -07:00
Valery Dmitriev	9aa571f080	[SLP][NFC] Try to cleanup and better document some isGatherShuffledEntry code. (#69384) Outline some often used common code to dedicated variables in order to make code compact. Rename variables to more accurately reflect their purpose. Apply const qualifier where appropriate. Fix and add a bit more explanation to the comments for the existing code.	2023-10-17 14:59:36 -07:00

4068 Commits