llvm-project

Author	SHA1	Message	Date
Alexey Bataev	8ab962e411	[SLP]Relax assertion to check if the input scalars were extended to match the size of base node (PR63668). Need to adjust the check for assert and take into account case where the original scalars are reused and were extended to match the vector factor of the reused SLP node.	2023-07-14 07:19:49 -07:00
Alexey Bataev	bc8abb42bb	Revert "[SLP]Relax assertion to check if the input scalars were extended to" This reverts commit 6fdfc81287ecdc2a7f409d08538ec6ce2bd698da to fix the check in the assert (need to use end, not begin function).	2023-07-14 07:04:06 -07:00
Alexey Bataev	6fdfc81287	[SLP]Relax assertion to check if the input scalars were extended to match the size of base node (PR63668). Need to adjust the check for assert and take into account case where the original scalars are reused and were extended to match the vector factor of the reused SLP node.	2023-07-14 06:48:25 -07:00
Alexey Bataev	ec6b40ab9b	[SLP]Add a test with the stores with long distances between them, NFC.	2023-07-13 15:14:09 -07:00
Anna Thomas	1159266734	[SLP] Add support for fmaximum/fminimum reduction This patch adds support for vectorized reduction of maximum/minimum intrinsics which are under the appropriate reduction kind. Differential Revision: https://reviews.llvm.org/D154463	2023-07-12 15:22:38 -04:00
Anna Thomas	a43aebcd91	[SLP] Test for minimum/maximum reduction minimum/maximum tests from D154463. This contains tests where we vectorize minimum/maximum as well as the tests where we currently do not identify reduction patterns. Differential Revision: https://reviews.llvm.org/D155096	2023-07-12 15:22:37 -04:00
Nikita Popov	edb2fc6dab	[llvm] Remove explicit -opaque-pointers flag from tests (NFC) Opaque pointers mode is enabled by default, no need to explicitly enable it.	2023-07-12 14:35:55 +02:00
Valery N Dmitriev	03b118c7e4	[SLP] Fix crash on attempt to access on invalid iterator state. The patch fixes corner case when none of the scalar instructions required scheduling for vectorized node. Differential Revision: https://reviews.llvm.org/D154175	2023-06-30 11:40:25 -07:00
Alexey Bataev	bb4e547a60	[SLP][NFC]Add a test for buildvector with reused scalars and extractelements.	2023-06-29 11:52:12 -07:00
Luke Lau	d0d864f6f4	[SLP] Explicitly pass AccessTy to getGEPCost Building on D149889, this patch updates SLP to pass the vector type as the AccessTy to getGEPCost. This should have the effect of GEPs being costed for more often instead of being treated as foldable into the address mode and thus free, as some architectures, notably RISC-V, do not have offset+reg addressing modes for vector memory accesses. Note that in SLP, GEPs are costed in two places: getPointersChainCost and GetGEPCostDiff. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D153570	2023-06-29 18:42:24 +01:00
Luke Lau	2b28f8f044	[RISCV][SLP] Add tests for unprofitable SLP vectorization due to GEP. NFC Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D149888	2023-06-29 18:42:22 +01:00
Luke Lau	b87a09301f	[RISCV] Add tests for cost modelling constants in phis Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D149168	2023-06-29 13:55:22 +01:00
David Green	46ef3337ea	[AArch64] Add and cmp cost model tests. NFC See D153611. Tests for the cost of icmp(and, 0) are added, in addition to expanding the extractelements-to-shuffle.ll test, which has always been a bit simple, to include a more complete example with both a vector and scalar version. The icmp(and, 0) costs are targeted at improving the second when the cost of vector inserts and extracts is lowered.	2023-06-29 13:29:34 +01:00
Alexey Bataev	5d2cc8e242	[SLP]Fix emission of buildvectors with full match. If the buildvector node is a full match of another node, need to correctly build the mask for the original vector value and build common mask for the emitted node.	2023-06-28 13:47:08 -07:00
David Green	9c7aab362a	[SLP] Use vector types for cmp alt instructions costs Similar to the other code that costs main/alt instructions, the cmp should be using the VecTy for the costs, not the ScalarTy. One of the tests looks like it gets worse just because it is not simplified to 0. Differential Revision: https://reviews.llvm.org/D153507	2023-06-28 21:02:29 +01:00
David Green	e165bc2631	[SLP][AArch64] Extend extracts-from-scalarizable-vector.ll test for cmp cost testing. NFC See D153507. The existing test is over-simplified, as written it should have been simplified prior to SLP vectorization. I have left it as-is to ensure the crash it was protecting against doesn't arise again. A new test with valid inputs is also added to show the incorrect costs of alt cmp vectorization.	2023-06-28 17:16:34 +01:00
Alexey Bataev	a8f1a3e025	[SLP]Fix PR63141: compareCmp is not strict weak ordering. Added some extra checks for compareCmp function if IsCompatibility is false to make it meet the strict weak ordering requirements to be correctly used in sort functions.	2023-06-28 06:00:31 -07:00
Alexey Bataev	9d4fbcd5ff	Revert "[SLP]Fix PR63141: compareCmp is not strict weak ordering." This reverts commit f3ebd88064d7f1c36a8272b3e5f7d53501c3f53b to pacify windows-based buildbots.	2023-06-28 04:37:27 -07:00
Fangrui Song	d39b4ce3ce	[test] Replace aarch64-*-eabi with aarch64 Using "eabi" for aarch64 targets is a common mistake and warned by Clang Driver. We want to avoid it elsewhere as well. Just use the common "aarch64" without other triple components.	2023-06-27 20:02:52 -07:00
Fangrui Song	ebbfdca586	[test] Replace aarch64-arm-none-eabi with aarch64 Similar to 02e9441d6ca73314afa1973a234dce1e390da1da, but for llvm/test and one lld/test/ELF test.	2023-06-27 19:36:27 -07:00
Alexey Bataev	f3ebd88064	[SLP]Fix PR63141: compareCmp is not strict weak ordering. Added some extra checks for compareCmp function if IsCompatibility is false to make it meet the strict weak ordering requirements to be correctly used in sort functions.	2023-06-27 14:31:55 -07:00
Alexey Bataev	1f3d23845f	[SLP][NFC]Add a test for vectorization of cmps with alternate predicates, NFC.	2023-06-27 13:57:51 -07:00
Philip Reames	7f26c27e03	[RISCV] Enable SLP by default (when vectors are available) I propose that we go ahead and enable SLP by default. Over the last few weeks, @luke and I have been working through codegen issues seen at small VLs from a couple of SPEC workloads. We still have a ways to go to get optimal codegen, but we're at the point where having a single configuration we're all tuning against is probably the right default. As a bit of history, I introduced this TTI hook in a310637132 back in August of last year to unblock enabling LoopVectorizer. At the time, we had a couple known issues: constant materialization, address generation, and a general lack of maturity of small fixed vector codegen. By now, each of these has had significant investment. I can't say any of them are completely fixed, but we're no longer seeing instances of them every place we look. What we're mostly seeing at this point is a long tail of code gen opportunities, many involving build vectors, shuffles, and extract patterns. I have a couple patches up to continue iterating on those issues, but I don't think they need to be blockers for enabling SLP. Differential Revision: https://reviews.llvm.org/D152750	2023-06-14 09:49:58 -07:00
Mikael Holmen	ac9b9e3aad	[SLPVectorizer] Don't include isAssumeLikeIntrinsics in ScheduleRegionSize We don't want the existence of debug instructions to affect codegen so we now ignore debug instructions and other "isAssumeLikeIntrinsics" in the "extend schedule region" search loop in BoUpSLP::BlockScheduling::extendSchedulingRegion. Differential Revision: https://reviews.llvm.org/D152441	2023-06-14 13:00:15 +02:00
Simon Pilgrim	595a74391d	[CostModel][X86] Tweak SSE2 v2i64 multiply costs based off D46276 script It looks like we were trying to account for SLM costs, which are actually handled separately Fixes #62969	2023-06-14 11:06:15 +01:00
Vasileios Porpodas	9d5466849a	[SLP][NFC] Precommit test that exposes a bug in ShuffleBuilder. ShuffleBuilder generates a zero mask here: `[[TMP6:%.*]] = shufflevector <2 x float> [[TMP3]], <2 x float> poison, <4 x i32> zeroinitializer` But the correct mask is `0,0,1,1`, or we should have reused `TMP4`. Differential Revision: https://reviews.llvm.org/D152868	2023-06-13 16:52:36 -07:00
Simon Pilgrim	c1f81ac2c1	[SLP][X86] Add test coverage for Issue #62969	2023-06-13 16:36:13 +01:00
Mikael Holmen	6afe9f3985	[test][SLPVectorizer] Precommit testcase showing debug info affects codegen Differential Revision: https://reviews.llvm.org/D152705	2023-06-13 10:46:32 +02:00
Bjorn Pettersson	8813c1087e	[SLP] Update X86/schedule_budget.ll test case to show budget impact The comments and the checks in test/Transforms/SLPVectorizer/X86/schedule_budget.ll did not match. After commit 352c46e70716061e99 the vectorization has happened also with the reduced budget. This patch is supposed to restore the original intention with the test case (the one described in the comments). We want to see that a restricted budget may reduce the amount of vectorization (i.e. verifying that the -slp-schedule-budget option makes a difference), while a higher budget still results in vectorization. Differential Revision: https://reviews.llvm.org/D152530	2023-06-09 15:33:35 +02:00
Alexey Bataev	95b631181a	[SLP]Fix getSpillCost functions. There are several issues in the current implementation. The instructions are not properly ordered, if they are placed in different basic blocks, need to reverse the order of blocks. Also, need to exclude non-vectorizable nodes and check for CallBase, not CallInst, otherwise invoke calls are not handled correctly.	2023-05-26 12:19:28 -07:00
Alexey Bataev	e892193cc8	[SLP][NFC]Add a test for spill cost, NFC.	2023-05-26 11:04:46 -07:00
Alexey Bataev	ae5ff3ca0c	[SLP]Fix PR62665: compiler crash when trying to access non-existing mask element. Need to check at first if the SubMask element is PoisonMaskElem to avoid compiler crash.	2023-05-22 13:43:25 -07:00
Luke Lau	c27a0b21c5	[SLP][RISCV] Account for offset folding in getPointersChainCost For a GEP in a pointer chain, if: 1) a pointer chain is unit-strided 2) the base pointer wasn't folded and is sitting in a register somewhere 3) the distance between the GEP and the base pointer is small enough and can be folded into the addressing mode of the using load/store Then we can exclude that GEP from the total cost of the pointer chain, as it will likely be folded away. In order to check if 3) holds, we need to know the type of memory access being made by the users of the pointer chain. For that, we need to pass along a new argument to getPointersChainCost. (Using the source pointer type of the GEP isn't accurate, see https://reviews.llvm.org/D149889 for more details). Also note that 2) is currently an assumption, and could be modelled more accurately. This prevents some unprofitable cases from being SLP vectorized on RISC-V by making the scalar costs cheaper and closer to the actual codegen. For now the getPointersChainCost hook is duplicated for RISC-V to prevent disturbing other targets, but could be merged back in and shared with other targets in a following patch. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D149654	2023-05-22 13:55:30 +01:00
Luke Lau	53afdb712d	[SLP][RISCV] Add test for folding offsets in GEP pointer chains	2023-05-22 10:11:02 +01:00
Luke Lau	8288d39b4c	[RISCV] Add test for unprofitable SLP vectorization Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D149653	2023-05-19 14:45:39 +01:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
Alexey Bataev	9a7248f561	[SLP]Fix crash for scalarized vectors. Need to remove insertion of the nodes to the InVector in case of scalarized vectors too to avoid compiler crashes.	2023-05-17 06:32:22 -07:00
ManuelJBrito	e335e8a432	[InstCombine] Update instcombine for vectorOps to use new shufflevector semantics This patch updates the transformations in InstCombineVectorOps to use the new shufflevector semantics that say that undefined values in the mask yield poison. To prevent miscompilations we have to match with m_Poison instead of m_Undef. Otherwise, we might introduce poison where there was previously undef. Differential Revision: https://reviews.llvm.org/D150039	2023-05-17 07:56:45 +01:00
Alexey Bataev	b33b000ac8	[SLP][NFC]Add remark output to the test with the perfect diamond match in vectorbuild nodes, NFC.	2023-05-05 08:19:54 -07:00
Alexey Bataev	c0e5e7db9a	[SLP]Fix a crash when trying to find the insert point for GEP nodes with non-gep insts. If the vectorizable GEP node is built, which should not be scheduled, and at least one node is a non-gep instruction, need to insert the vectorized instructions before the last instruction in the list, not before the first one, otherwise the instructions may be emitted in the wrong order.	2023-05-04 09:43:37 -07:00
Krzysztof Drewniak	f0415f2a45	Re-land "[AMDGPU] Define data layout entries for buffers" Re-land D145441 with data layout upgrade code fixed to not break OpenMP. This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27. Differential Revision: https://reviews.llvm.org/D149776	2023-05-03 19:43:56 +00:00
Krzysztof Drewniak	3f2fbe92d0	Revert "[AMDGPU] Define data layout entries for buffers" This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850. Differential Revision: https://reviews.llvm.org/D149758	2023-05-03 16:11:00 +00:00
Krzysztof Drewniak	f9c1ede254	[AMDGPU] Define data layout entries for buffers Per discussion at https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798, we define two new address spaces for AMDGCN targets. The first is address space 7, a non-integral address space (which was already in the data layout) that has 160-bit pointers (which are 256-bit aligned) and uses a 32-bit offset. These pointers combine a 128-bit buffer descriptor and a 32-bit offset, and will be usable with normal LLVM operations (load, store, GEP). However, they will be rewritten out of existence before code generation. The second of these is address space 8, the address space for "buffer resources". These will be used to represent the resource arguments to buffer instructions, and new buffer intrinsics will be defined that take them instead of <4 x i32> as resource arguments. ptr addrspace(8). These pointers are 128-bits long (with the same alignment). They must not be used as the arguments to getelementptr or otherwise used in address computations, since they can have arbitrarily complex inherent addressing semantics that can't be represented in LLVM. Even though, like their address space 7 cousins, these pointers have deterministic ptrtoint/inttoptr semantics, they are defined to be non-integral in order to prevent optimizations that rely on pointers being a [0, [addr_max]] value from applying to them. Future work includes: - Defining new buffer intrinsics that take ptr addrspace(8) resources. - A late rewrite to turn address space 7 operations into buffer intrinsics and offset computations. This commit also updates the "fallback address space" for buffer intrinsics to the buffer resource, and updates the alias analysis table. Depends on D143437 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D145441	2023-05-03 15:25:58 +00:00
Alexey Bataev	f305cafc58	[SLP][NFC]Add a test with the reshuffled nodes in buildvector nodes, NFC.	2023-05-02 13:51:45 -07:00
Alexey Bataev	cf792f664a	[SLP]Fix a crash for the replaced vectorized value. If two nodes share the same value, which is replaced in one of the nodes, need to automatically replace same value in all nodes. Better to use WeakTrackingVH for this to fix compiler crash.	2023-04-27 09:32:00 -07:00
ManuelJBrito	8b56da5e9f	[IR] Change shufflevector undef mask to poison With this patch an undefined mask in a shufflevector will be printed as poison. This change is done to support the new shufflevector semantics for undefined mask elements. Differential Revision: https://reviews.llvm.org/D149210	2023-04-27 14:41:10 +01:00
Alexey Bataev	b1abc2beaf	[SLP]Fix PR58616: assert for gep nodes with different basic blocks. Need to relax the assertion check in the FindFirstInst lambda for GEP nodes with non-GEP instruction to avoid compiler crash.	2023-04-24 07:41:06 -07:00
Jay Foad	593e25ffae	[Vectorize] Fix vectorization, scalarization and folding of llvm.is.fpclass llvm.is.fpclass is different from other vectorizable intrinsics in that it is overloaded on an argument type, not on the return type. Differential Revision: https://reviews.llvm.org/D148905	2023-04-24 13:42:08 +01:00
Jay Foad	3237497d01	[Vectorize] Pre-commit tests for D148905 Differential Revision: https://reviews.llvm.org/D149050	2023-04-24 13:42:08 +01:00
Simon Pilgrim	aca5f9aeea	[CostModel][X86] getMemoryOpCost - increase cost of sub-32-bit vector load/stores For 8-bit/16-bit vector loads/stores we scalarize and transfer to/from the vector unit, or use the (usually slow) PINSR/PEXTR instructions. Fixes #59867	2023-04-23 21:48:25 +01:00

1419 Commits