llvm-project

Author	SHA1	Message	Date
Alexey Bataev	9400270449	[SLP]Fix comparator for vector operands of extractelements in PHICompare The comparator must follow strict weak ordering to avoid compiler crashes. Fixes #138178	2025-05-01 14:28:20 -07:00
Alexander Richardson	ee13638362	[AMDGPU] Remove explicit datalayout from tests where not needed Since e39f6c1844fab59c638d8059a6cf139adb42279a opt will infer the correct datalayout when given a triple. Avoid explicitly specifying it in tests that depend on the AMDGPU target being present to avoid the string becoming out of sync with the TargetInfo value. Only tests with REQUIRES: amdgpu-registered-target or a local lit.cfg were updated to ensure that tests for non-target-specific passes that happen to use the AMDGPU layout still pass when building with a limited set of targets. Reviewed By: shiltian, arsenm Pull Request: https://github.com/llvm/llvm-project/pull/137921	2025-04-30 10:58:17 -07:00
Jonas Paulsson	f5c8c1eedb	[SLPVectorizer] Move X86 specific handling into X86TTIImpl. (#137830) `ad9909d "[SLP]Fix perfect diamond match with extractelements in scalars"` changed SLPVectorizer getScalarizationOverhead() to call TTI.getVectorInstrCost() instead of TTI.getScalarizationOverhead() in some cases. This was due to X86-specific handling in these (overridden) methods, and unfortunately the general preference for TTI.getScalarizationOverhead() was dropped. If VL is available, getScalarizationOverhead() should always be preferred, and this is indeed the case for SystemZ, which has a special insertion instruction that can insert two GPR64s. Then `33af951 "[SLP]Synchronize cost of gather/buildvector nodes with codegen"` reworked SLPVectorizer getGatherCost(), which together with ad9909d caused the SystemZ test vec-elt-insertion.ll to fail. This patch restores the SystemZ test and reverts the change in SLPVectorizer getScalarizationOverhead() so that TTI.getScalarizationOverhead() is always called again. The ForPoisonSrc argument is now passed on to the TTI method so that X86 can handle this as required. Fixes: #135346	2025-04-30 17:11:27 +02:00
Florian Hahn	ec1016f7ef	[IVDescriptors] Support reductions with minimumnum/maximumnum. (#137335) Add a new reduction recurrence kind for reductions with minimumnum/maximumnum. Such reductions can be vectorized without nsz/nnans, same as reductions with maximum/minimum intrinsics. Note that a new reduction kind is needed to make sure partial reductions are also combined with minimumnum/maximumnum. Note that the final reduction to a scalar value is performed with vector.reduce.fmin/fmax. This should be fine, as the partial reductions with maximumnum/minimumnum silence any sNaNs. In-loop reductions and reductions in SLP are not supported yet, as there's no reduction version of maximumnum/minimumnum yet and fmax may be incorrect. PR: https://github.com/llvm/llvm-project/pull/137335	2025-04-28 11:16:36 +01:00
YunQiang Su	e9a34e4236	[RISCV] Support vectorizing FMINIMUMNUM and FMAXIMUMNUM (#135727) The RISC-V V extension supports vfmax and vfmin, which follow IEEE 754-2019, so we can use them directly.	2025-04-27 19:10:02 +08:00
Alexey Bataev	a7a74b349d	[SLP]Improve reordering of the alternate nodes Better to preserve the original order of the alternate nodes to avoid inter-lane shuffling, select/insert subvector patterns provide better perf. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/136329	2025-04-24 14:33:10 -04:00
Alexey Bataev	f427890a1d	[SLP]Fix PHI comparator to make it follow weak strict ordering restriction Fixes #137164	2025-04-24 11:08:17 -07:00
Philip Reames	1c722fc8f5	[RISCV][TTI] Use processShuffleMask for shuffle legalization estimate (#136191) We had some code which tried to estimate legalization costs for illegally typed shuffles, but it only handled the case of a widening shuffle, and used a somewhat ad hoc heuristic. We can reuse the processShuffleMask utility (which we already use for individual vector register splitting when exact VLEN is known) to perform the same splitting given the legal vector type as the unit of split instead. This makes the costing both simpler and more robust. Note that this swings costs for illegal shuffles pretty wildly as we were previously sometimes hitting the ad hoc code, and sometimes falling through into generic scalarization costing. I don't know that any of the costs for the individual tests in tree are significant, but the test which triggered me finding this was reported to me by Alexey, reduced from something triggering a bad choice in SLP for x264. So this has the potential to be somewhat high impact.	2025-04-22 10:50:20 -07:00
Alexey Bataev	9c388f1f05	[SLP]Prefer segmented/deinterleaved loads to strided and fix codegen Need to estimate, which one is preferable, deinterleaved/segmented loads or strided. Segmented loads can be combined, improving the overall performance. Reviewers: RKSimon, hiraditya Reviewed By: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/135058	2025-04-22 12:11:01 -04:00
Alexey Bataev	0252d338fa	[SLP]Model single unique value insert + shuffle as splat + select, where profitable When we have the remaining unique scalar, that should be inserted into non-poison vector and into non-zero position: ``` %vec1 = insertelement %vec, %v, pos1 %res = shuffle %vec1, poison, <0, 1, 2,..., pos1, pos1 + 1, ..., pos1, ...> ``` better to estimate if it is profitable to model it as is or model it as: ``` %bv = insertelement poison, %v, 0 %splat = shuffle %bv, poison, <poison, ..., 0, ..., 0, ...> %res = shuffle %vec, %splat, <0, 1, 2,..., pos1 + VF, pos1 + 1, ...> ``` Reviewers: preames, hiraditya, RKSimon Reviewed By: preames Pull Request: https://github.com/llvm/llvm-project/pull/136590	2025-04-22 11:30:29 -04:00
Alexey Bataev	fdcee2dd36	[SLP]Reorder tree, if the reorder indices are non empty Need to consider the ordering for all nodes with the specified ordering, not only loads/store/extracts. Reviewers: hiraditya, RKSimon Reviewed By: hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/136185	2025-04-18 13:37:08 -04:00
Alexey Bataev	5fe91f1b59	[SLP]Check for catchswitch block before doing the analysis of the instructions Need to skip the analysis of the catchswitch blocks to avoid a compiler crash when trying to get the first instruction in the block.	2025-04-17 09:10:15 -07:00
Alexey Bataev	1fcf78d153	[SLP]Cache data for compressed loads before codegen Need to cache and use cached data for compressed loads before codegen to avoid side-effects, caused by the earlier vectorization, which may affect the analysis.	2025-04-17 08:43:44 -07:00
Alexey Bataev	4aca20c8b6	[SLP]Pre-cache the last instruction for all entries before vectorization Need to pre-cache last instruction to avoid unexpected changes in the last instruction detection during the vectorization, caused by adding the new vector instructions, which add new uses and may affect the analysis.	2025-04-16 11:44:55 -07:00
Alexey Bataev	913dcf1aa3	[SLP]Fix type promotion for smax reduction with unsigned reduced operands Need to add an extra bit for sign info for unsigned reduced values to generate correct code.	2025-04-16 10:14:29 -07:00
Alexey Bataev	51fa6cde7d	[SLP][NFC]Add a test with missing unsigned promotion for smax reduction, NFC	2025-04-16 09:55:34 -07:00
Alexey Bataev	af28c9c65a	[SLP]Do not reorder split node operand with reuses, if not possible Need to check if the operand node of the split vectorize node has reuses and check if it is possible to build the order for this node to reorder it correctly. Fixes #135912	2025-04-16 06:23:44 -07:00
Alexey Bataev	ddb1267430	[SLP]Insert vector instruction after landingpad If the node must be emitted in the landingpad block, need to insert the instructions after the landingpad instruction to avoid a crash. Fixes #135781	2025-04-15 13:57:53 -07:00
Alexey Bataev	85eb44e304	[SLP]Fix number of operands for the split node For the split node, the number of operands should be requested via the getNumOperands() function, even if the main op is a CallInst.	2025-04-15 13:33:36 -07:00
Alexey Bataev	2271f0bebd	[SLP]Check for perfect/shuffled match for the split node If the potential split node is a perfect/shuffled match of another split node, need to skip creation of another split node with the same scalars; it should be a buildvector. Fixes #135800	2025-04-15 13:17:46 -07:00
Han-Kuan Chen	d41e517748	[SLP] Make getSameOpcode support interchangeable instructions. (#135797) We use the term "interchangeable instructions" to refer to different operators that have the same meaning (e.g., `add x, 0` is equivalent to `mul x, 1`). Non-constant values are not supported, as they may incur high costs with little benefit. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>	2025-04-16 00:08:59 +08:00
Han-Kuan Chen	bcfc9f4529	[SLP][REVEC] VectorValuesAndScales should be supported by REVEC. (#135762) We should align REVEC with the SLP algorithm as closely as possible. For example, by applying REVEC-specific handling when calling IRBuilder's Create methods, performing cost analysis via TTI, and expanding shuffle masks using transformScalarShuffleIndicesToVector. reference commit: 3b18d47ecbaba4e519ebf0d1bc134a404a56a9da	2025-04-15 23:03:55 +08:00
Alexey Bataev	57025b42c4	[SLP]Mark smin reduction as signed compare Reduction signed min must be marked as signed compare, fixing the analysis for the cases, where the incoming arguments are unsigned. Fixes #133943	2025-04-15 07:24:17 -07:00
Alexey Bataev	7f2587a239	[SLP][NFC]Add a test with missing zext on signed minimum reduction, NFC	2025-04-15 07:14:36 -07:00
Han-Kuan Chen	e1382b3b45	Revert "[SLP] Make getSameOpcode support interchangeable instructions. (#133888)" This reverts commit 123993fd974629ca0a094918db4c21ad1c2624d0.	2025-04-15 06:02:42 -07:00
YunQiang Su	fe9e2090be	Vectorize: Support fminimumnum and fmaximumnum (#131781) Support auto-vectorization of fminimum_num and fmaximum_num. For ARM64 with SVE, scalable vectors are not supported yet. --------- Co-authored-by: Your Name <you@example.com>	2025-04-15 08:08:45 +08:00
Han-Kuan Chen	123993fd97	[SLP] Make getSameOpcode support interchangeable instructions. (#133888) We use the term "interchangeable instructions" to refer to different operators that have the same meaning (e.g., `add x, 0` is equivalent to `mul x, 1`). Non-constant values are not supported, as they may incur high costs with little benefit. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>	2025-04-14 19:23:18 +08:00
Alexey Bataev	38e64b1a84	[SLP]Fix minbiwidth analysis for gather nodes with SIToFP users If the buildvector node has cast to float user, it cannot be considered as safe for truncation, need to use the original bitwidth here. Fixes #135410	2025-04-11 11:40:41 -07:00
Alexey Bataev	c9ad5bed7f	[SLP][NFC]Add a test with the incorrect type promotion after bitwidth analysis, NFC	2025-04-11 11:10:01 -07:00
Alexey Bataev	a2d129b792	[SLP]Fix a crash when trying to reduce in revec after minbitwidth analysis Need to use the original scalar type, when building the reduction, and use the scalar type, when performing casting, to avoid compiler crash.	2025-04-11 10:58:39 -07:00
Alexey Bataev	33af951f3f	[SLP]Synchronize cost of gather/buildvector nodes with codegen If the buildvector node contains constants and non-constants, need to consider shuffling of the constant vec and insertion of unique elements into the vector. Also, if there is an input vector, need to consider the cost of shuffling source vector and constant vector and then insertion and shuffling of the non-constant elements. Reviewers: hiraditya, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/135245	2025-04-11 09:42:34 -04:00
Han-Kuan Chen	b99a2b6221	[SLP][REVEC] Update test. The test is affected by commit aaaa2a325bd1abb8c87e0171384fd2c42da5e38a.	2025-04-11 03:01:09 -07:00
Han-Kuan Chen	d77dc87511	[SLP][REVEC] Fix type comparison and mask transformation for REVEC. (#135310) When REVEC is enabled, ScalarTy may be a FixedVectorType. Compare its element type to decide if casting is needed. Also apply mask transformation accordingly.	2025-04-11 17:28:34 +08:00
Ulrich Weigand	80267f8148	Support z17 processor name and scheduler description (#135254) The recently announced IBM z17 processor implements the architecture already supported as "arch15" in LLVM. This patch adds support for "z17" as an alternate architecture name for arch15. This patch also adds the scheduler description for the z17 processor, provided by Jonas Paulsson.	2025-04-11 00:20:58 +02:00
Alexey Bataev	aaaa2a325b	[SLP]Support vectorization of previously vectorized scalars in split nodes Patch removes the restriction for the revectorization of the previously vectorized scalars in split nodes, and moves the cost profitability check to avoid regressions. Reviewers: hiraditya, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/134286	2025-04-10 12:06:38 -04:00
Alexey Bataev	4ea57b3481	[SLP]Fix detection of matching splat vector Need to check, that the mask of the potentially matching splat node is not less defined than the requested mask to avoid poison propagation and incorrect code. Fixes #135113	2025-04-10 08:30:43 -07:00
Alexey Bataev	396e2ef3b7	[SLP][NFC]Add a test with incorrect identity match for less-defined splat	2025-04-10 08:20:28 -07:00
Han-Kuan Chen	a693f23ef2	[SLP][REVEC] Fix CompressVectorize does not expand mask when REVEC is enabled. (#135174)	2025-04-10 23:07:45 +08:00
Han-Kuan Chen	d02a704ec9	[SLP][REVEC] Make getExtractWithExtendCost support FixedVectorType as Dst. (#134822)	2025-04-10 18:54:45 +08:00
Alexey Bataev	076318bd78	[SLP]Use proper order when calculating costs for geps/extracts to correctly identify profitability Need to reorder properly the scalars, when evaluating the costs for the external uses/geps to prevent differences in the calculating of the profitability costs, used to choose between gather/compressed loads. Fixes https://github.com/llvm/llvm-project/pull/132099#issuecomment-2789627454	2025-04-09 07:43:23 -07:00
Alexey Bataev	8b34986072	[SLP][NFC]Add a test with potential segmented loads, recognized as strided	2025-04-08 13:22:16 -07:00
Han-Kuan Chen	2347aa1fcc	[SLP][REVEC] Fix the mismatch between the result of getAltInstrMask and the VecTy argument of TargetTransformInfo::isLegalAltInstr. (#134795) We cannot determine ScalarTy from VL because some ScalarTy is determined from VL[0]->getType(), while others are determined from getValueType(VL[0]). Fix "Mask and VecTy are incompatible".	2025-04-08 22:29:11 +08:00
Han-Kuan Chen	97c4cb4d13	[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134763)	2025-04-08 22:29:03 +08:00
Han-Kuan Chen	d7354e337a	[SLP][REVEC] Fix ShuffleVector does not consider alternate instruction. (#134599)	2025-04-08 08:04:43 +08:00
Alexey Bataev	f413772b31	[SLP]Fix last instruction selection for vectorized last instruction in SplitVectorize nodes If the last instruction in the SplitVectorize node is vectorized and scheduled as part of some bundles, the SplitVectorize node might be placed in the wrong order, leading to a compiler crash. Need to check if the vectorized node has a vector value and place the SplitVectorize node after the vector instruction to prevent a compiler crash. Fixes issue reported in https://github.com/llvm/llvm-project/pull/133091#issuecomment-2782826805	2025-04-07 09:27:08 -07:00
Alexey Bataev	19aec00735	[SLP]Initial support for (masked)loads + compress and (masked)interleaved Added initial support for (masked)loads + compress and (masked)interleaved loads. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/132099	2025-04-04 11:14:49 -07:00
Alexey Bataev	90cf2e31ab	Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved" This reverts commit daab7d08078bb7cd37c66b78a56f4773e6b12fba to fix a crash reported in https://github.com/llvm/llvm-project/issues/134411.	2025-04-04 10:09:39 -07:00
Alexey Bataev	daab7d0807	[SLP]Initial support for (masked)loads + compress and (masked)interleaved Added initial support for (masked)loads + compress and (masked)interleaved loads. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/132099	2025-04-03 13:17:40 -07:00
Alexey Bataev	7c4013d591	Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved" This reverts commit 0bec0f5c059af5f920fe22ecda469b666b5971b0 to fix a crash reported in https://lab.llvm.org/buildbot/#/builders/143/builds/6668.	2025-04-03 12:58:49 -07:00
Alexey Bataev	0bec0f5c05	[SLP]Initial support for (masked)loads + compress and (masked)interleaved Added initial support for (masked)loads + compress and (masked)interleaved loads. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/132099	2025-04-03 13:21:22 -04:00
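
Two entries above (9400270449 and f427890a1d) fix PHI comparators that did not follow strict weak ordering. The log does not say which property the original comparators violated, so the following is only a minimal sketch of the general requirement and is not taken from the SLP vectorizer sources. `Key`, `badLess`, `goodLess`, and `isIrreflexiveAndAsymmetric` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the properties a comparator might inspect.
struct Key {
  int TypeID;
  int Opcode;
};

// Broken: `<=` makes badLess(A, A) return true, which violates
// irreflexivity. Passing such a comparator to std::sort or
// std::stable_sort is undefined behavior and can crash.
inline bool badLess(const Key &A, const Key &B) {
  return A.TypeID <= B.TypeID;
}

// Lexicographic comparison through std::tie yields a strict weak ordering
// as long as every compared field is itself ordered by `<`.
inline bool goodLess(const Key &A, const Key &B) {
  return std::tie(A.TypeID, A.Opcode) < std::tie(B.TypeID, B.Opcode);
}

// Spot-checks two of the strict-weak-ordering axioms on a sample set:
// irreflexivity (!Less(A, A)) and asymmetry (never both Less(A, B) and
// Less(B, A)). Transitivity is not checked here.
template <typename Cmp>
bool isIrreflexiveAndAsymmetric(const std::vector<Key> &Keys, Cmp Less) {
  for (const Key &A : Keys) {
    if (Less(A, A))
      return false;
    for (const Key &B : Keys)
      if (Less(A, B) && Less(B, A))
        return false;
  }
  return true;
}
```

A check like this can be run over a small sample in a unit test before a comparator is handed to a sorting routine. It only catches irreflexivity and asymmetry violations, so passing it does not prove the comparator is a valid strict weak ordering.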


2225 Commits