llvm-project

Author	SHA1	Message	Date
Alexey Bataev	279b1ea65f	[SLP]Improve gathering of the scalars used in the graph. Currently we emit gathers for scalars being vectorized in the tree as a pair of extractelement/insertelement instructions. Instead we can try to find all required vectors and emit shuffle vector instructions directly, improving the code and reducing compile time. Part of non-power-of-2 vectorization. Differential Revision: https://reviews.llvm.org/D110978	2023-12-01 11:23:57 -08:00
Alexey Bataev	ba52310657	[SLP][NFC] Unify code for cost estimation/codegen for buildvector, NFC. (#73182) This just moves towards reusing same function for both cost estimation/codegen for buildvector.	2023-11-30 10:04:57 -05:00
Alexey Bataev	1f88e62db4	[SLP]Fix/improve minbitwidth mapping to use TreeEntry as a key. Currently, MinBWs map uses Value* as a key and stores mapping for each value to be demoted. It makes it hard to get the actual MinBWs value for the buildvector scalars (constants), since same constant might be used in different nodes with the different MinBWs values/decisions. Also, it consumes extra memory for the vectorized values/instructions from the same nodes. Better to map actual nodes. It fixes the bitwidth data fetching for buildvector scalars and improves memory consumption/analysis time for other instructions.	2023-11-30 06:33:31 -08:00
Alexey Bataev	447da954c7	[SLP][NFC]Use DenseSet instead of SetVector, NFC. For CSEBlocks we can safely use DenseSet, the order should not be preserved for this container.	2023-11-28 11:27:49 -08:00
Alexey Bataev	badec9b7bf	[SLP][NFC]Fix loops variables names, NFC.	2023-11-28 10:30:19 -08:00
Alexey Bataev	c72884225b	[SLP][NFC]Fix naming of variables/functions, NFC.	2023-11-28 09:15:38 -08:00
Alexey Bataev	b6eb740cae	[SLP][NFC]Improve/fix auto declarations, NFC.	2023-11-28 07:39:21 -08:00
Alexey Bataev	45139ab6ca	[SLP][NFC]Improve aliasing support in SLP, NFC. No need to store optional boolean in the map, enough to store boolean directly. Also, we can do preliminary check for instruction and if they are not simple, mark as aliased without storing this result in the map.	2023-11-28 07:24:44 -08:00
Alexey Bataev	e9fdb965f9	[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using Expr container, NFC. Saves the memory and may improve compile time.	2023-11-24 08:05:19 -08:00
Alexander Kornienko	af7a145352	Revert "[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using" This reverts commit 52df67ba76a03ad33132d1d4f4202d5a2313a3cd, which causes spurious clang crashes. See `52df67ba76 (commitcomment-133381701)`	2023-11-24 01:18:46 +01:00
Alexey Bataev	53f912480f	[SLP][NFC]Remove extra unused vars, add TODO, NFC.	2023-11-22 12:26:54 -08:00
Alexey Bataev	12bcd6339d	[SLP]Improve detection of gathered loads, if no other deps are detected. If the gather node includes ordered loads only partially (not the whole node consists of loads) and the other gathered scalar are not loads, and no other dependency from other nodes is found, we still can improve the cost of gather, if take into account the fact that these loads still can be vectorized.	2023-11-22 11:35:51 -08:00
Alexey Bataev	369c0eb55b	[SLP][NFC]Use SmallVector instead of std::vector and remove unused includes, NFC.	2023-11-22 08:11:27 -08:00
Alexey Bataev	f609d4ba1d	[SLP]Fix PR72833: do not crash if only operand is casted but the use instruction. Need to check if only operand is casted, not the user instruction itself, if the types of the operands do not match the actual type.	2023-11-20 08:35:35 -08:00
Alexey Bataev	40e46b6eff	[SLP]Do not emit int bitcast after minbitwidth analysis. No need to emit bitcast op for integer operands if it is detected that after minbitwidth analysis the type is the same.	2023-11-20 06:25:17 -08:00
Alexey Bataev	52df67ba76	[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using Expr container, NFC. Saves the memory and may improve compile time.	2023-11-17 13:45:28 -08:00
Arthur Eubanks	6a126e279d	Revert "[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using" This reverts commit cfd0f41f4effb5d31654dcb28c1a577c152ee23b. Causes crashes, see `cfd0f41f4e`.	2023-11-17 13:23:38 -08:00
Valery Dmitriev	94e86751e5	[NFC][SLP] Remove unnecessary DL argument (#72674)	2023-11-17 10:08:25 -08:00
Alexey Bataev	cfd0f41f4e	[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using Expr container, NFC. Saves the memory and may improve compile time.	2023-11-17 08:05:02 -08:00
Alexey Bataev	72b97630bc	[SLP][NFC]Fix comparison of integers of different signs warning, NFC.	2023-11-16 17:18:28 -08:00
Alexey Bataev	cb678708e6	[SLP][NFC]Add TreeEntry-based add member functions and use them, where possible, NFC.	2023-11-16 16:30:52 -08:00
Alexey Bataev	484a27e412	[SLP][NFC]Make needToDelay constant, NFC.	2023-11-16 16:11:43 -08:00
Alexey Bataev	009002a8cb	[SLP][NFC]Unify matching for perfect diamond match between cost and codegen models, NFC.	2023-11-16 08:11:52 -08:00
Alexey Bataev	206799fcf5	[SLP]Fix PR72524: "Out-of-bounds shuffle mask element" failed. Need to check if we ran into subvector extract pattern before checking for identity vector to avoid compiler crash.	2023-11-16 07:39:32 -08:00
Alexey Bataev	95703642e3	[SLP]Fix PR72202: wrong mask emission for the first found vector operand. Need to copy the submask not to the very first part of the common extractelements vector mask, but to the proper one to avoid wrong code emission.	2023-11-16 07:01:05 -08:00
Alexey Bataev	8ea8dd9a01	[SLP] Fix crash on trying to reshuffle a scalar that was vectorized. If the buildvector node contains extractelement, which vector operand depends on vector node, need to check if the node is ready and use vectorized value instead of the original vector operation.	2023-11-15 11:01:45 -08:00
Alexey Bataev	d202b00826	[SLP][NFC] Make tryToGather[SingleRegister]ExtractElements routines BoUpSLP methods.	2023-11-15 09:47:24 -08:00
Alexey Bataev	b6f51787f6	[SLP]Fix signedness analysis for scalars in graph. Cannot use the sign info for the roots for all scalars in the graph, need to perform the analysis for each particular scalar (tree node).	2023-11-15 07:10:59 -08:00
Alexey Bataev	5adfad254e	[SLP]Emit actual bitwidth for analyzed MinBitwidth nodes, NFCI. SLP includes analysis for the minimum bitwidth, the actual integer operations can be emitted. It allows to reduce register pressure and improve perf. Currently, it includes only cost model and the next transformation relies on InstructionCombiner. Better to do it directly in SLP, it allows to reduce compile time and fix cost model issues.	2023-11-14 11:12:52 -08:00
Alexey Bataev	f2f3050476	Revert "[SLP]Emit actual bitwidth for analyzed MinBitwidth nodes, NFCI." This reverts commit f6ae50f710d02d8553d28192a1f048b2a9e1fc4d to fix a crash revealed in the internal testing.	2023-11-14 09:45:54 -08:00
Alexey Bataev	f6ae50f710	[SLP]Emit actual bitwidth for analyzed MinBitwidth nodes, NFCI. SLP includes analysis for the minimum bitwidth, the actual integer operations can be emitted. It allows to reduce register pressure and improve perf. Currently, it includes only cost model and the next transformation relies on InstructionCombiner. Better to do it directly in SLP, it allows to reduce compile time and fix cost model issues.	2023-11-14 07:57:37 -08:00
Alexey Bataev	d4cec1ce73	[SLP][NFCI]Improve compile time by using SmallBitVector and filtering trees with phis/buildvectors only.	2023-11-14 06:27:17 -08:00
Alexey Bataev	ac254fc055	[SLP]Improve tryToGatherExtractElements by using per-register analysis. Currently tryToGatherExtractElements function analyzes the whole vector, regardless of the number of actual registers used in this vector. It may prevent some optimizations, because per-register analysis may allow to simplify the final code by reusing more already emitted vectors and better shuffles. Differential Revision: https://reviews.llvm.org/D148855	2023-11-06 07:29:27 -08:00
Hans Wennborg	046c57e705	Revert "[SLP]Improve tryToGatherExtractElements by using per-register analysis." This causes asserts: llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:10082: Value llvm::slpvectorizer::BoUpSLP::ShuffleInstructionBuilder::adjustExtracts( const TreeEntry , MutableArrayRef<int>, unsigned int, bool &): Assertion `Part == 0 && "Expected firs part."' failed. See comment on the code review. > Currently tryToGatherExtractElements function analyzes the whole vector, > regrdless number of actual registers, used in this vector. It may > prevent some optimizations, because per-register analysis may allow to > simplify the final code by reusing more already emitted vectors and > better shuffles. > > Differential Revision: https://reviews.llvm.org/D148855 This reverts commit 9dfdbd788707edc8c39eb2bff16004aba1f3586b.	2023-11-06 13:56:42 +01:00
Alexey Bataev	9dfdbd7887	[SLP]Improve tryToGatherExtractElements by using per-register analysis. Currently tryToGatherExtractElements function analyzes the whole vector, regardless of the number of actual registers used in this vector. It may prevent some optimizations, because per-register analysis may allow to simplify the final code by reusing more already emitted vectors and better shuffles. Differential Revision: https://reviews.llvm.org/D148855	2023-11-03 10:43:58 -07:00
Martin Storsjö	66152f4eed	Revert "[SLP]Improve tryToGatherExtractElements by using per-register analysis." This reverts commit 3e6d7c6d983dd5896e3a03857584654eb1360fda. That commit caused miscompilation of ffmpeg's libavcodec/vp9dsp_8bpp.o on aarch64; the file still compiles correctly, but no longer produces the right result - see https://reviews.llvm.org/D148855#4655968 for details.	2023-11-03 00:08:17 +02:00
Alexey Bataev	3026c13612	[SLP][NFC]Remove commented out code, NFC.	2023-11-02 11:06:56 -07:00
Alexey Bataev	495ed8d8c8	[SLP]Fix PR70507: freeze poisonous insts to avoid poison propagation. If the reduction instruction is not bool logical op, but reduced within bool logical op reduction list, need to freeze to avoid poison propagation.	2023-11-02 10:37:38 -07:00
Alexey Bataev	3e6d7c6d98	[SLP]Improve tryToGatherExtractElements by using per-register analysis. Currently tryToGatherExtractElements function analyzes the whole vector, regardless of the number of actual registers used in this vector. It may prevent some optimizations, because per-register analysis may allow to simplify the final code by reusing more already emitted vectors and better shuffles. Differential Revision: https://reviews.llvm.org/D148855	2023-11-01 10:42:35 -07:00
Alexey Bataev	6e8d957a22	Revert "[SLP]Improve tryToGatherExtractElements by using per-register analysis." This reverts commit 0a34aaedd8ec2dc2375076976c1327fdbfd7877f to fix fails reported in https://lab.llvm.org/buildbot/#/builders/265/builds/40	2023-11-01 08:52:31 -07:00
Alexey Bataev	c28b7eb496	[SLP]Fix handling of -slp-vectorize-hor-store for values with many uses.	2023-11-01 08:41:54 -07:00
Alexey Bataev	0a34aaedd8	[SLP]Improve tryToGatherExtractElements by using per-register analysis. Currently tryToGatherExtractElements function analyzes the whole vector, regardless of the number of actual registers used in this vector. It may prevent some optimizations, because per-register analysis may allow to simplify the final code by reusing more already emitted vectors and better shuffles. Differential Revision: https://reviews.llvm.org/D148855	2023-11-01 07:44:49 -07:00
Nikita Popov	79bd1d8a95	[SLPVectorizer] Avoid use of ConstantExpr::getIntegerCast() (NFC) We are working on a ConstantInt here, so folding will always succeed. This just avoids use of the ConstantExpr API.	2023-11-01 12:25:55 +01:00
Alexey Bataev	4c997e1536	[SLP]Fix PR70507: emit freeze whenever required for bool logical ops in the middle of reduction ops. Need to emit freeze instruction not only in the case, where the root is bool logical op, but also if we reduce several scalars, but unable to say precisely, if the root is bool logical op.	2023-10-31 12:23:12 -07:00
Alexey Bataev	9da19e4340	[SLP]Fix PR70507: correctly handle bool logical ops in reductions. If the very first reduction operation is not bool logical op, but some others are, still need to emit the bool logical op for all the extra reduction operations to avoid incorrect poison propagation.	2023-10-30 14:09:08 -07:00
Alexey Bataev	af15c46777	[SLP]Do not crash if number of vector registers does not fit the vector type. Need to check, if the number of vector registers, returned by TTI, is not greater than total number of mask elements and not zero, before trying to perform any operations. TTI still may return non-valid number of registers.	2023-10-30 07:30:52 -07:00
Alexey Bataev	196d154ab7	[SLP]Improve isGatherShuffledEntry by trying per-register shuffle. Currently when building gather/buildvector node, we try to build nodes shuffles without taking into account separate vector registers. We can improve final codegen and the whole vectorization process by including this info into the analysis and the vector code emission, allows to emit better vectorized code. Differential Revision: https://reviews.llvm.org/D149742	2023-10-26 08:51:37 -07:00
Alexey Bataev	c65ec9d919	Revert "[SLP]Improve isGatherShuffledEntry by trying per-register shuffle." This reverts commit 560bad013ebcb8d2c2c1722e35270b9a70ab40ce to fix a bug reported in https://lab.llvm.org/buildbot/#/builders/5/builds/37763.	2023-10-26 08:36:50 -07:00
Alexey Bataev	560bad013e	[SLP]Improve isGatherShuffledEntry by trying per-register shuffle. Currently when building gather/buildvector node, we try to build nodes shuffles without taking into account separate vector registers. We can improve final codegen and the whole vectorization process by including this info into the analysis and the vector code emission, allows to emit better vectorized code. Differential Revision: https://reviews.llvm.org/D149742	2023-10-26 05:57:03 -07:00
Kazu Hirata	f9306f6de3	[ADT] Rename llvm::erase_value to llvm::erase (NFC) (#70156) C++20 comes with std::erase to erase a value from std::vector. This patch renames llvm::erase_value to llvm::erase for consistency with C++20. We could make llvm::erase more similar to std::erase by having it return the number of elements removed, but I'm not doing that for now because nobody seems to care about that in our code base. Since there are only 50 occurrences of erase_value in our code base, this patch replaces all of them with llvm::erase and deprecates llvm::erase_value.	2023-10-24 23:03:13 -07:00

1547 Commits