llvm-project

Author	SHA1	Message	Date
Martin Storsjö	1de3f46938	Revert "[SLP]Do not require external uses for roots and single use for other instructions in computeMinimumValueSizes. (#72679 )" This reverts commit 408dce82016463dcb5026b2ddfc62174970a88e9. This triggered failed asserts with code like this: char a[]; short b; int c, d, e, f; void g() { char h; for (;;) { for (; f; ++f) { h[f] = b[0] * a[e] + b[c] * a[1] >> 7; ++b; } h += d; } } Compiled like this: $ clang -target x86_64-linux-gnu -c repro.c -O2 clang: ../lib/IR/Instructions.cpp:3335: static llvm::CastInst* llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, const llvm::Twine&, llvm::Instruction*): Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.	2024-01-11 12:15:35 +02:00
Alexey Bataev	408dce8201	[SLP]Do not require external uses for roots and single use for other instructions in computeMinimumValueSizes. (#72679 ) After the changes that removed the need for InstCombine support, we can drop some extra requirements for values to be demoted. No need to check for external uses for roots/other instructions; just check that there is no non-vectorized insertelement instruction, which may require widening.	2024-01-10 14:06:29 -05:00
Alexey Bataev	73ce13d79b	[SLP][TTI]Improve detection of the insert-subvector pattern for SLP. (#74749 ) SLP vectorizer passes the type of the subvector and the mask, which size determines the size of the resulting vector. TTI should support this pattern to improve cost estimation of the insert_subvector shuffle pattern.	2024-01-10 10:39:34 -05:00
Alexey Bataev	036e48e2f5	[SLP]Fix PR76850: do the analysis of the submask. Need to limit the transformation of the VecMask by the corresponding part of the mask of SliceSize size to avoid compiler crash during further cost analysis.	2024-01-08 07:51:02 -08:00
Alexey Bataev	79e62315be	[SLP]Use revectorized value for extracts from buildvector, being vectorized. When trying to reuse the extractelement instruction emitted for the insertelement instruction, need to check if this insertelement instruction was vectorized. In this case, need to use the vectorized value, not the original insertelement.	2024-01-04 06:45:26 -08:00
Alexey Bataev	7c963fde16	[SLP]Use revectorized value for extracts from buildvector, being vectorized. If the insertelement instruction is vectorized, and the extractelement instruction from such insertelement is also vectorized as part of the same tree, need to extract from the vectorized value corresponding to the insertelement rather than from the original insertelement instruction.	2024-01-03 10:38:09 -08:00
Alexandros Lamprineas	e512df3ecc	[LV] Fix crash when vectorizing function calls with linear args. (#76274 ) llvm/lib/IR/Type.cpp:694: Assertion `isValidElementType(ElementType) && "Element type of a VectorType must be an integer, floating point, or pointer type."' failed. Stack dump: llvm::FixedVectorType::get(llvm::Type, unsigned int) llvm::VPWidenCallRecipe::execute(llvm::VPTransformState&) llvm::VPBasicBlock::execute(llvm::VPTransformState) llvm::VPRegionBlock::execute(llvm::VPTransformState) llvm::VPlan::execute(llvm::VPTransformState) ... Happens with function calls of void return type.	2024-01-02 18:14:16 +00:00
Enna1	9943d33997	[SLP][NFC] Fix assertion in vectorizeGEPIndices() (#76660 ) The index constraints for the collected getelementptr instructions should be single and non-constant.	2024-01-02 21:32:18 +08:00
Enna1	a51c2f39f5	[SLP] no need to generate extract for in-tree uses for original scalar instruction. (#76077 ) Before `77a609b556`, we always skipped in-tree uses of the vectorized scalars in `buildExternalUses()`; that commit handles the case where the in-tree use is a scalar operand of a vectorized instruction, for which we need to generate an extract. In-tree uses that remain scalar in vectorized instructions fall into 3 cases: - The pointer operand of vectorized LoadInst uses an in-tree scalar - The pointer operand of vectorized StoreInst uses an in-tree scalar - The scalar argument of vector form intrinsic uses an in-tree scalar Generating extracts for in-tree uses for vectorized instructions is implemented in `BoUpSLP::vectorizeTree()`: - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11497-L11506 - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11542-L11551 - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11657-L11667 However, `77a609b556` not only generates extracts for vectorized instructions, but also generates extracts for original scalar instructions. There is no need to generate extracts for the original scalar instructions, as these scalar instructions will be replaced by vector instructions and get erased later. This patch marks that there is no exact user for in-tree scalars that remain scalar in vectorized instructions when building external uses. In this case all uses of this scalar will be automatically replaced by extractelement. It also removes the - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11497-L11506 - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11542-L11551 - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11657-L11667 extracts.	2023-12-30 10:45:26 +08:00
Alexey Bataev	5096501082	[SLP][TTI][X86]Add addsub pattern cost estimation. (#76461 ) SLP/TTI do not know about the cost estimation for addsub pattern, supported by X86. Previously the support for pattern detection was added (see TTI::isLegalAltInstr), but the cost was still not estimated properly.	2023-12-28 05:04:04 -08:00
Douglas Yung	fb981e6b4b	Revert "[SLP][TTI][X86]Add addsub pattern cost estimation. (#76461 )" This reverts commit bc8c4bbd7973ab9527a78a20000aecde9bed652d. Change is failing to build on several bots: - https://lab.llvm.org/buildbot/#/builders/127/builds/60184 - https://lab.llvm.org/buildbot/#/builders/123/builds/23709 - https://lab.llvm.org/buildbot/#/builders/216/builds/32302	2023-12-27 23:52:04 -08:00
Alexey Bataev	bc8c4bbd79	[SLP][TTI][X86]Add addsub pattern cost estimation. (#76461 ) SLP/TTI do not know about the cost estimation for addsub pattern, supported by X86. Previously the support for pattern detection was added (see TTI::isLegalAltInstr), but the cost was still not estimated properly.	2023-12-27 15:57:21 -05:00
Kazu Hirata	03dc806b12	[Transforms] Use {DenseMap,SmallPtrSet}::contains (NFC)	2023-12-22 14:51:22 -08:00
Alexey Bataev	a13148a880	[SLP]Fix PR75995: drop wrapping flags for resized wrapped binops. If decided to resize the instruction, need to drop wrapping flags from the resulting vector instructions to avoid incorrect optimizations/assumptions later. Fixes PR75995.	2023-12-20 06:51:39 -08:00
Arthur Eubanks	71a9292298	Revert "[SLP]Improve findReusedOrderedScalars processing, NFCI." This reverts commit 44dc1e0baae7c4b8a02ba06dcf396d3d452aa873. Causes non-determinism, see #75987.	2023-12-19 16:14:04 -08:00
Alexey Bataev	00edad17c2	[SLP][NFC]Check for equal opcode preliminary to meet weak strict order requirement, NFC. This change does not affect functionality, just fixes the assertions in some standard c++ library implementations.	2023-12-18 14:12:33 -08:00
Alexey Bataev	a7e10e6603	Revert "[SLP][NFC]Check for equal opcode preliminary to meet weak strict order" This reverts commit 58a2c4e2f24ffce3966c3988d1a4ca7b04c52244 to fix the issue detected by https://lab.llvm.org/buildbot/#/builders/233/builds/5424.	2023-12-18 12:35:52 -08:00
Alexey Bataev	58a2c4e2f2	[SLP][NFC]Check for equal opcode preliminary to meet weak strict order requirement, NFC. This change does not affect functionality, just fixes the assertions in some standard c++ library implementations.	2023-12-18 06:42:03 -08:00
Reid Kleckner	3e16152ebc	[SLP] Fix OOB GEP index access for a no-op GEP Issue is covered by existing test llvm/test/Transforms/SLPVectorizer/RISCV/phi-const.ll See issue #75632 for ideas for how we could catch these more easily in the future.	2023-12-15 17:33:06 +00:00
Maurice Heumann	f42b930af9	[SLP] Pessimistically handle unknown vector entries in SLP vectorizer (#75438 ) SLP Vectorizer can discard vector entries at unknown positions. This example shows the behaviour: https://godbolt.org/z/or43EM594 The following instruction inserts an element at an unknown position: ``` %2 = insertelement <3 x i64> poison, i64 %value, i64 %position ``` The position depends on an argument that is unknown at compile time. After running SLP, one can see there is no more instruction present referencing `%position`. This happens as SLP parallelizes the two adds in the example. It then needs to merge the original vector with the new vector. Within `isUndefVector`, the SLP vectorizer constructs a bitmap indicating which elements of the original vector are poison values. It does this by walking the insertElement instructions. If it encounters an insert with a non-constant position, it is ignored. This will result in poison values to be used for all entries, where there are no inserts with constant positions. However, as the position is unknown, the element could be anywhere. Therefore, I think it is only safe to assume none of the entries are poison values and to simply take them all over when constructing the shuffleVector instruction. This fixes #75437	2023-12-14 09:48:23 -05:00
Alexey Bataev	44dc1e0baa	[SLP]Improve findReusedOrderedScalars processing, NFCI. Tries to simplify structural complexity of the findReusedOrderedScalars function.	2023-12-08 14:27:55 -08:00
Alexey Bataev	fb35bb48c6	[SLP][NFC]Build value-to-gather-nodes map during nodes building, NFC.	2023-12-07 13:41:19 -08:00
Alexey Bataev	58785ebd24	[SLP][NFC]Check for ephemeral values beforehand, NFC.	2023-12-07 13:25:15 -08:00
Alexey Bataev	0e1a9e3084	[SLP]Fix PR74607: Fix dependency between buildvector nodes with user nodes, having same last instruction. If the user nodes have the same last instruction, used as the insert point for the buildvector nodes, finding the proper dependency is crucial. Before, it depended on the indices of the buildvectors themselves, but it looks like it should depend on the indices of the user nodes, because they identify the vectorization order and, thus, properly align buildvector nodes in terms of the def-use chain.	2023-12-06 10:15:01 -08:00
Paschalis Mpeis	7b83f69db4	[NFC] Replace CallInst with FunctionType in VFABI, VFShape API (#74569 ) Minor simplification applied to VFShape::getScalarShape, VFShape::get, and VFABI::tryDemangleForVFABI methods. Also, remove unnecessary `static_cast` in `SLPVectorizer.cpp`	2023-12-06 17:14:58 +00:00
Alexey Bataev	279b1ea65f	[SLP]Improve gathering of the scalars used in the graph. Currently we emit gathers for scalars being vectorized in the tree as a pair of extractelement/insertelement instructions. Instead we can try to find all required vectors and emit shuffle vector instructions directly, improving the code and reducing compile time. Part of non-power-of-2 vectorization. Differential Revision: https://reviews.llvm.org/D110978	2023-12-01 11:23:57 -08:00
Alexey Bataev	ba52310657	[SLP][NFC] Unify code for cost estimation/codegen for buildvector, NFC. (#73182 ) This just moves towards reusing same function for both cost estimation/codegen for buildvector.	2023-11-30 10:04:57 -05:00
Alexey Bataev	1f88e62db4	[SLP]Fix/improve minbitwidth mapping to use TreeEntry as a key. Currently, the MinBWs map uses Value* as a key and stores a mapping for each value to be demoted. This makes it hard to get the actual MinBWs value for the buildvector scalars (constants), since the same constant might be used in different nodes with different MinBWs values/decisions. Also, it consumes extra memory for the vectorized values/instructions from the same nodes. Better to map actual nodes. It fixes the bitwidth data fetching for buildvector scalars and improves memory consumption/analysis time for other instructions.	2023-11-30 06:33:31 -08:00
Alexey Bataev	447da954c7	[SLP][NFC]Use DenseSet instead of SetVector, NFC. For CSEBlocks we can safely use DenseSet, the order should not be preserved for this container.	2023-11-28 11:27:49 -08:00
Alexey Bataev	badec9b7bf	[SLP][NFC]Fix loops variables names, NFC.	2023-11-28 10:30:19 -08:00
Alexey Bataev	c72884225b	[SLP][NFC]Fix naming of variables/functions, NFC.	2023-11-28 09:15:38 -08:00
Alexey Bataev	b6eb740cae	[SLP][NFC]Improve/fix auto declarations, NFC.	2023-11-28 07:39:21 -08:00
Alexey Bataev	45139ab6ca	[SLP][NFC]Improve aliasing support in SLP, NFC. No need to store optional boolean in the map, enough to store boolean directly. Also, we can do preliminary check for instruction and if they are not simple, mark as aliased without storing this result in the map.	2023-11-28 07:24:44 -08:00
Alexey Bataev	e9fdb965f9	[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using Expr container, NFC. Saves the memory and may improve compile time.	2023-11-24 08:05:19 -08:00
Alexander Kornienko	af7a145352	Revert "[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using" This reverts commit 52df67ba76a03ad33132d1d4f4202d5a2313a3cd, which causes spurious clang crashes. See `52df67ba76 (commitcomment-133381701)`	2023-11-24 01:18:46 +01:00
Alexey Bataev	53f912480f	[SLP][NFC]Remove extra unused vars, add TODO, NFC.	2023-11-22 12:26:54 -08:00
Alexey Bataev	12bcd6339d	[SLP]Improve detection of gathered loads, if no other deps are detected. If the gather node includes ordered loads only partially (not the whole node consists of loads), the other gathered scalars are not loads, and no other dependency from other nodes is found, we can still improve the cost of the gather if we take into account the fact that these loads can still be vectorized.	2023-11-22 11:35:51 -08:00
Alexey Bataev	369c0eb55b	[SLP][NFC]Use SmallVector instead of std::vector and remove unused includes, NFC.	2023-11-22 08:11:27 -08:00
Alexey Bataev	f609d4ba1d	[SLP]Fix PR72833: do not crash if only operand is casted but not the user instruction. Need to check if only the operand is casted, not the user instruction itself, if the types of the operands do not match the actual type.	2023-11-20 08:35:35 -08:00
Alexey Bataev	40e46b6eff	[SLP]Do not emit int bitcast after minbitwidth analysis. No need to emit a bitcast op for integer operands if it is detected that after minbitwidth analysis the type is the same.	2023-11-20 06:25:17 -08:00
Alexey Bataev	52df67ba76	[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using Expr container, NFC. Saves the memory and may improve compile time.	2023-11-17 13:45:28 -08:00
Arthur Eubanks	6a126e279d	Revert "[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using" This reverts commit cfd0f41f4effb5d31654dcb28c1a577c152ee23b. Causes crashes, see `cfd0f41f4e`.	2023-11-17 13:23:38 -08:00
Valery Dmitriev	94e86751e5	[NFC][SLP] Remove unnecessary DL argument (#72674 )	2023-11-17 10:08:25 -08:00
Alexey Bataev	cfd0f41f4e	[SLP][NFC]Make collectValuesToDemote member of BoUpSLP to avoid using Expr container, NFC. Saves the memory and may improve compile time.	2023-11-17 08:05:02 -08:00
Alexey Bataev	72b97630bc	[SLP][NFC]Fix comparison of integers of different signs warning, NFC.	2023-11-16 17:18:28 -08:00
Alexey Bataev	cb678708e6	[SLP][NFC]Add TreeEntry-based add member functions and use them, where possible, NFC.	2023-11-16 16:30:52 -08:00
Alexey Bataev	484a27e412	[SLP][NFC]Make needToDelay constant, NFC.	2023-11-16 16:11:43 -08:00
Alexey Bataev	009002a8cb	[SLP][NFC]Unify matching for perfect diamond match between cost and codegen models, NFC.	2023-11-16 08:11:52 -08:00
Alexey Bataev	206799fcf5	[SLP]Fix PR72524: "Out-of-bounds shuffle mask element" failed. Need to check if we ran into subvector extract pattern before checking for identity vector to avoid compiler crash.	2023-11-16 07:39:32 -08:00
Alexey Bataev	95703642e3	[SLP]Fix PR72202: wrong mask emission for the first found vector operand. Need to copy the submask not to the very first part of the common extractelements vector mask, but to the proper one to avoid wrong code emission.	2023-11-16 07:01:05 -08:00

1572 Commits