llvm-project

Author	SHA1	Message	Date
Alexey Bataev	662efdee9b	[SLP][NFC]Improve handling of MinBWs container, NFC. Replaced by DenseMap instead of MapVector(the order is not important, just lookup is used) + reduced number of lookups.	2023-07-31 07:26:55 -07:00
Alexey Bataev	85635c7f60	[SLP][NFC]Use ScalarTy consistently in getEntryCost, NFC.	2023-07-31 06:52:56 -07:00
Florian Hahn	822c749aec	[LV] Shrink operands before creating new instr to force eval order. Shrink operands before creating the new instruction to make sure the same evaluation order is used on all platforms. This fixes buildbot failures due to different argument evaluation order on different systems.	2023-07-30 17:16:37 +01:00
Alexey Bataev	48bc5b0a29	[SLP][PR64099]Fix unsound undef to poison transformation when handling insertelement instructions. If the original vector has undef, not poison values, which are not rewritten by later insertelement instructions, need to transform shuffle with the undef vector, not a poison vector, and actual indices, not PoisonMaskElem, otherwise the transformation may produce more poisons output than the input.	2023-07-27 16:09:49 -07:00
Martin Storsjö	245ec675a4	Revert "[LV] Re-use existing broadcast value for live-ins." This reverts commit eea9258648ce73507f6f85c395de978af659d498. That commit triggered crashes in the following testcase: $ cat reduced.c typedef struct { int a[8] } b; typedef struct { b c; short d } e; void f() { int g; char h; e i = f; short j = i->d; int a = i->c->a[0]; for (;;) for (; g < a; g++) { h = j * i->d >> 8; h++; } } $ clang -target aarch64-linux-gnu -w -c -O2 reduced.c	2023-07-25 10:35:41 +03:00
Fangrui Song	e8e7a959c7	[SLP] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D154891	2023-07-24 09:47:50 -07:00
Alexey Bataev	44eca64224	[SLP]Check scalars before trying scheduling. Need to check the scalars if they can be vectorized before trying to schedule them. It may save compile time and improve vectorization on large functions/basic blocks. Differential Revision: https://reviews.llvm.org/D154891	2023-07-24 09:25:19 -07:00
Florian Hahn	eea9258648	[LV] Re-use existing broadcast value for live-ins. When requesting a vector value for a live-in, we can re-use the broadcast of the live-in of part 0 for parts > 0.	2023-07-24 11:50:47 +01:00
Florian Hahn	25d34215bb	[LV] Replace use of getMaxSafeDepDist with isSafeForAnyVector (NFC) Replace the use of getMaxSafeDepDistBytes with the more direct isSafeForAnyVector. This removes the need to define getMaxSafeDepDistBytes.	2023-07-21 22:05:50 +02:00
David Berard	8fa02db8cf	[llvm][SLP] Exit early if inputs to comparator are equal TL;DR: This PR modifies a comparator. The comparator is used in a subsequent call to llvm::stable_sort. Sorting comparators should follow strict weak ordering - in particular, (x < x) should return false. This PR adds a fix to avoid an infinite loop when the inputs to the comparator are equal. Details: Sometimes when two equivalent tensors passed into the comparator, we encounter infinite looping (at `aae2eaae2c/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (L4049)`) Although it seems like this comparator will never be called with two equivalent pointers, some sanitizers, e.g. https://chromium.googlesource.com/chromiumos/third_party/gcc/+/refs/heads/stabilize-zako-5712.88.B/libstdc++-v3/include/bits/stl_algo.h#360, will add checks for (x < x). When this sanitizer is used with the current implementation, it triggers a comparator check for (x < x) which runs into the infinite loop Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D155874	2023-07-21 05:40:55 -07:00
Alexey Bataev	aae2eaae2c	[SLP]Fix a crash when trying to cast scalable vector type to fixed. Need to check for FixedVectorType, not a vector type, since later compiler performs unconditional cast to FixedVectorType and gets the number of elements in this type.	2023-07-19 11:53:49 -07:00
Nuno Lopes	d75fb17963	[VectorCombine] Use poison instead of undef as placeholder [NFC] These vector lanes are never accessed. They are used for shifting a value into the right lane and therefore only 1 value of the whole vector is actually used	2023-07-19 10:29:08 +01:00
Alexey Bataev	4bbf37199c	[SLP][NFC]Improve compile-time by using map {TreeEntry , Instruction } in getLastInstructionInBundle(), NFC. Instead of building EntryToLastInstruction before the vectorization, build it automatically during the calls to getLastInstructionInBundle() function.	2023-07-18 13:24:55 -07:00
Alexey Bataev	83ba148a8a	[SLP]Include cost of the reshuffling for same nodes with resizing. Need to account reshuffling, required for the reused elements in the buildvector nodes, which are copies (perfect match) of other nodes, but include reused elements. Differential Revision: https://reviews.llvm.org/D149966	2023-07-18 06:05:15 -07:00
Florian Hahn	68746a8cea	[LV] Move all VPlan transforms after initial VPlan construction. Reorder VPlan transforms slightly so they are all grouped together, after disabling Value -> VPValue lookup. In terms of codegen impact, this should be NFC modulo a small number of instruction reorderings. Preparation to split up tryToBuildVPlanWithVPRecipes in a follow-up. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D154640	2023-07-18 10:53:30 +01:00
Arthur Eubanks	1cb3fbc713	Revert "[SLP][NFC]Improve compile-time by using map {TreeEntry , Instruction }" This reverts commit 0d21b7cbdeb2f2eb5ef123a15099da0b651b24c0. Causes broken IR, test case provided at https://reviews.llvm.org/rG0d21b7cbdeb2f2eb5ef123a15099da0b651b24c0	2023-07-17 14:54:47 -07:00
Alexey Bataev	d8d4c99685	[SLP][NFC]Improve performance of isGatherShuffledEntry() function, NFC. Transformed if checks to asserts and simplified some more code to improve compile time.	2023-07-17 14:08:56 -07:00
Alexey Bataev	0d21b7cbde	[SLP][NFC]Improve compile-time by using map {TreeEntry , Instruction } in getLastInstructionInBundle(), NFC. Instead of building EntryToLastInstruction before the vectorization, build it automatically during the calls to getLastInstructionInBundle() function.	2023-07-17 11:36:21 -07:00
Alexey Bataev	8ab962e411	[SLP]Relax assertion to check if the input scalars were extended to match the size of base node (PR63668). Need to adjust the check for assert and take into account case where the original scalars are reused and were extended to match the vector factor of the reused SLP node.	2023-07-14 07:19:49 -07:00
Alexey Bataev	bc8abb42bb	Revert "[SLP]Relax assertion to check if the input scalars were extended to" This reverts commit 6fdfc81287ecdc2a7f409d08538ec6ce2bd698da to fix the check in the assert (need to use end, not begin function).	2023-07-14 07:04:06 -07:00
Alexey Bataev	6fdfc81287	[SLP]Relax assertion to check if the input scalars were extended to match the size of base node (PR63668). Need to adjust the check for assert and take into account case where the original scalars are reused and were extended to match the vector factor of the reused SLP node.	2023-07-14 06:48:25 -07:00
Anna Thomas	1159266734	[SLP] Add support for fmaximum/fminimum reduction This patch adds support for vectorized reduction of maximum/minimum intrinsics which are under the appropriate reduction kind. Differential Revision: https://reviews.llvm.org/D154463	2023-07-12 15:22:38 -04:00
Nikita Popov	94abecca6b	[IVDescriptors] Remove typed pointer support (NFC) This also removes the element type from the descriptor, as it is always i8. The meaning of the step is now the same between integers and pointers.	2023-07-12 15:48:29 +02:00
Florian Hahn	9259f41e62	[VPlan] Clear reduction flags directly as VPlanTransform. After D150027, all relevant recipes should model their IR flags directly. Instead of removing the flags after codegen as part of fixReductions, drop poison generating flags directly from the recipes. Depends on D150027. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150028	2023-07-09 21:11:51 +01:00
Florian Hahn	14ec3f4b06	[LV] Skip VFs > # iterations remaining for epilogue vectorization. If a candidate VF for epilogue vectorization is greater than the number of remaining iterations, the epilogue loop would be dead. Skip such factors. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D154264	2023-07-07 21:43:51 +01:00
Florian Hahn	aee851fd0e	Revert "[LV] Skip VFs < iterations remaining for epilogue vectorization." This reverts commit 7cc0be01a0068946ea3613dc2cb45c81b0f45860. The title of the commit is incorrect, revert to fix the commit message.	2023-07-07 21:41:24 +01:00
Florian Hahn	7cc0be01a0	[LV] Skip VFs < iterations remaining for epilogue vectorization. If a candidate VF for epilogue vectorization is less than the number of remaining iterations, the epilogue loop would be dead. Skip such factors. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D154264	2023-07-07 20:33:42 +01:00
Florian Hahn	a0fcf84a8c	[LV] Consider if scalar epilogue is required in getMaximizedVFForTarget. When a scalar epilogue is required, at least one iteration of the scalar loop has to execute. Adjust ConstTripCount accordingly to avoid picking a max VF that results in a dead vector loop. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D154261	2023-07-06 13:35:35 +01:00
Florian Hahn	2265bb064b	[LV] Update generateInstruction to return produced value (NFC). Update generateInstruction to return the produced value instead of setting it for each opcode. This reduces the amount of duplicated code and is a preparation for D153696. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D154240	2023-07-05 19:53:59 +01:00
Florian Hahn	1746ac42ca	[LV] Forget SCEVs for exit phis after vectorization. After vectorization, the exit blocks of the original loop will have additional predecessors. Invalidate SCEVs for the exit phis in case SE looked through single-entry phis. Fixes https://github.com/llvm/llvm-project/issues/63368 Fixes https://github.com/llvm/llvm-project/issues/63669	2023-07-04 21:28:03 +01:00
David Green	12025cef3e	[CostModel] Use min/max intrinsics for vecreduce.min/max costs This changes the costmodelling of the vecreduce.min/max nodes to use the costs of the relevant min/max intrinsics instead of expanding them to compare and selects. The getMinMaxReductionCost have changed to take a Opcode for the relevant intrinsic, dropping the IsUnsigned and CondTy parameters as they are no longer needed. A follow up patch will add some basic fminimum/fmaximum costmodelling. Differential Revision: https://reviews.llvm.org/D153547	2023-07-04 15:02:30 +01:00
Florian Hahn	39385c521d	[LV] Move getBroadcastInstr to VPTransformState.::get (NFCI). getBroadcastInstrs is only used in VPTransformState::get. Move it closer to use to reduce unnecessary interaction with ILV object.	2023-07-04 11:24:11 +01:00
Evgeniy Brevnov	d7329653d0	[VPlan] Allow sinking of instructions with no defs We started seeing new failure after D142886. Looks like it enabled new cases and we hit an assert: assert(Current->getNumDefinedValues() == 1 && "only recipes with a single defined value expected"); When we do instruction sinking for the first order recurrence we hit an assert if instruction doesn't have single def. In case instruction doesn't produce any new def there is no new users and nothing to sink. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D151204	2023-07-04 16:53:06 +07:00
Florian Hahn	b4efc0f070	[LV] Break up condition in selectEpilogueVectorizationFactor loop (NFCI) Restructure the loop as suggested in D154264 to increase readability and make it easier to extend.	2023-07-03 22:39:40 +01:00
Florian Hahn	55e7f1f786	[LV] Pass bool to requiresScalarEpilogue (NFC). requiresScalarEpilogue only checks if the selected VF is vectorizing (and not scalar). Update it to just take a boolean, to make it clearer what information is used and to allow callers without a VF (used in a follow-up patch).	2023-06-30 22:08:27 +01:00
Valery N Dmitriev	03b118c7e4	[SLP] Fix crash on attempt to access on invalid iterator state. The patch fixes a corner case when none of the scalar instructions required scheduling for the vectorized node. Differential Revision: https://reviews.llvm.org/D154175	2023-06-30 11:40:25 -07:00
Nikita Popov	cc31d787c3	Revert "Reland [SLP] Provide an universal interface for FixedVectorType::get. NFC." This reverts commit 19b1d3bd7eeecbeb1e45045960faf325c7bc5c64. Both the commit and the review are missing a patch description.	2023-06-30 11:31:16 +02:00
Han-Kuan Chen	19b1d3bd7e	Reland [SLP] Provide an universal interface for FixedVectorType::get. NFC. Differential Revision: https://reviews.llvm.org/D154114	2023-06-29 23:15:52 -07:00
Arthur Eubanks	a374fb2b5e	Revert "[SLP] Provide an universal interface for FixedVectorType::get. NFC." This reverts commit fcd58ea50c218b61a58d6815b9d15bad7dbc75a3. Causes crashes, see comments on D154114.	2023-06-29 21:49:05 -07:00
Han-Kuan Chen	fcd58ea50c	[SLP] Provide an universal interface for FixedVectorType::get. NFC. Differential Revision: https://reviews.llvm.org/D154114	2023-06-29 17:06:08 -07:00
Igor Kirillov	17bde328d6	[LV] Add mask support for vectorizing interleaved groups This patch extends LoopVectorize to handle the vectorization of interleaved memory accesses with scalable vectors when mask is required or/and predicated tail folding is enabled. Differential Revision: https://reviews.llvm.org/D152258	2023-06-29 17:50:56 +00:00
Luke Lau	d0d864f6f4	[SLP] Explicitly pass AccessTy to getGEPCost Building on D149889, this patch updates SLP to pass the vector type as the AccessTy to getGEPCost. This should have the effect of GEPs being costed for more often instead of being treated as foldable into the address mode and thus free, as some architectures, notably RISC-V, do not have offset+reg addressing modes for vector memory accesses. Note that in SLP, GEPs are costed in two places: getPointersChainCost and GetGEPCostDiff. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D153570	2023-06-29 18:42:24 +01:00
Luke Lau	a68dcd09e8	[TTI] Use users of GEP to guess access type in getGEPCost Currently getGEPCost uses the target type of the GEP as a heuristic for the type that will be accessed, to pass onto isLegalAddressingMode. Targets use this to work out if a GEP can then be folded into the load/store instruction that uses the GEP. For example, on RISC-V loads and stores can have an offset added to a base register folded into a single instruction, so the following GEP is free: %p = getelementptr i32, ptr %base, i32 42 ; getInstructionCost = 0 %x = load i32, ptr %p ; getInstructionCost = 1 ------------------------------------------------------------------------ lw t0, a0(42) However vector loads and stores cannot have an offset folded into them, so the following GEP is costed: %p = getelementptr <2 x i32>, ptr %base, i32 42 ; getInstructionCost = 1 %x = load <2 x i32>, ptr %p ; getInstructionCost = 1 ------------------------------------------------------------------------ addi a0, 42 vle32 v8, (a0) The issue arises whenever there is a mismatch between the target type of the GEP and the type that is actually accessed: %p = getelementptr i32, ptr %base, i32 42 ; getInstructionCost = 0 %x = load <2 x i32>, ptr %p ; getInstructionCost = 1 ------------------------------------------------------------------------ addi a0, 42 vle32 v8, (a0) Even though this GEP will result in an add instruction, because TTI thinks it's loading an i32, it will think it can be folded and not charge for it. The target type can become mismatched with the memory access during transformations, noticeably during SLP where a scalar base pointer will be reused to perform a vector load or store. This patch adds an optional AccessType argument to getGEPCost which allows the type of memory accessed by users to be passed in as a hint, so that we can more accurately determine if the GEP can be folded into its users. If AccessType is not provided, getGEPCost falls back to the old behaviour of using the PointeeType to guess the memory access type. This can be revisited in a later patch. Also for now, only GEPs with exactly one user use the access type hint. Whilst we could look through all users and use all access types to determine if we can fold the GEP, this patch avoids doing so to prevent O(N) behaviour. Differential Revision: https://reviews.llvm.org/D149889	2023-06-29 13:44:37 +01:00
Alexey Bataev	5d2cc8e242	[SLP]Fix emission of buildvectors with full match. If the buildvector node is a full match of another node, need to correctly build the mask for the original vector value and build common mask for the emitted node.	2023-06-28 13:47:08 -07:00
David Green	9c7aab362a	[SLP] Use vector types for cmp alt instructions costs Similar to the other code that costs main/alt instructions, the cmp should be using the VecTy for the costs, not the ScalarTy. One of the tests looks like it gets worse just because it is not simplified to 0. Differential Revision: https://reviews.llvm.org/D153507	2023-06-28 21:02:29 +01:00
Youngsuk Kim	243f0566dc	[llvm] Replace uses of Type::getPointerTo (NFC) Partial progress towards removing in-tree uses of `Type::getPointerTo`, before we can deprecate the API. If the API is used solely to support an unnecessary bitcast, get rid of the bitcast as well. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153933	2023-06-28 09:21:34 -04:00
Alexey Bataev	a8f1a3e025	[SLP]Fix PR63141: compareCmp is not strict weak ordering. Added some extra checks for compareCmp function if IsCompatibility is false to make it meet the strict weak ordering requirements to be correctly used in sort functions.	2023-06-28 06:00:31 -07:00
Alexey Bataev	9d4fbcd5ff	Revert "[SLP]Fix PR63141: compareCmp is not strict weak ordering." This reverts commit f3ebd88064d7f1c36a8272b3e5f7d53501c3f53b to pacify windows-based buildbots.	2023-06-28 04:37:27 -07:00
Alexey Bataev	f3ebd88064	[SLP]Fix PR63141: compareCmp is not strict weak ordering. Added some extra checks for compareCmp function if IsCompatibility is false to make it meet the strict weak ordering requirements to be correctly used in sort functions.	2023-06-27 14:31:55 -07:00
Elliot Goodrich	b0abd4893f	[llvm] Add missing StringExtras.h includes In preparation for removing the `#include "llvm/ADT/StringExtras.h"` from the header to source file of `llvm/Support/Error.h`, first add in all the missing includes that were previously included transitively through this header.	2023-06-25 15:42:22 +01:00

3873 Commits