llvm-project

Author	SHA1	Message	Date
Alexey Bataev	8c41859a21	[SLP]Clear the operands deps of non-schedulable nodes, if previously all operands were copyable If all operands of the non-schedulable nodes were previously only copyables, need to clear the dependencies of the original schedule data for such copyable operands and recalculate them to correctly handle number of dependencies. Fixes #159406	2025-09-18 12:11:33 -07:00
Alexey Bataev	f2301be0e8	[SLP]Add a check if the user itself is commutable If the commutable instruction can be represented as a non-commutable vector instruction (like add 0, %v can be represented as a part of sub nodes with operation sub %v, 0), its operands might still be reordered and this should be accounted when checking for copyables in operands Fixes #158293	2025-09-15 12:50:03 -07:00
Mikhail Gudim	ee3a4f4c94	[SLPVectorizer] Test -1 stride loads. (#158358 ) Add a test to generate -1 stride load and flags to force this behaviour.	2025-09-14 15:29:28 -04:00
Antonio Frighetto	370607065d	[llvm] Regenerate test checks including TBAA semantics (NFC) Tests exercising TBAA metadata (both purposefully and not), and previously generated via UTC, have been regenerated and updated to version 6.	2025-09-12 20:01:17 +02:00
Alexey Bataev	0dddfab54c	[SLP]Recalculate deps if the original instruction scheduled after being copyable If the original instruction is going to be scheduled after same instruction being scheduled as copyable, need to recalculate dependencies. Otherwise, the dependencies may be calculated incorrectly.	2025-09-10 10:18:45 -07:00
Alexey Bataev	d0ea176cce	[SLP]Do not consider SExt/ZExt profitable for demotion, if the user is a bitcast to float If the user node of the SExt/ZExt node is a bitcast to a floating-point type, the node itself should not be considered legal to demote, since the casting is still required to match the size of the floating-point type. Fixes #157277	2025-09-08 07:59:01 -07:00
Alexey Bataev	fd93dc5ac5	[SLP]Correctly schedule standalone schedule data, which is part of tree entry If a standalone schedule data relates to a vectorized instruction, still need to schedule it as a part of pseudo-bundle to correctly handle dependencies between its child nodes.	2025-09-07 17:08:37 -07:00
Alexey Bataev	c4d927ce09	Revert "[SLP]Correctly schedule standalone schedule data, which is part of tree entry" This reverts commit 57cae2b6a275a8eb3bc8935973263ed84535fb81 to fix a buildbot https://lab.llvm.org/buildbot/#/builders/169/builds/14776	2025-09-07 13:27:12 -07:00
Alexey Bataev	57cae2b6a2	[SLP]Correctly schedule standalone schedule data, which is part of tree entry If a standalone schedule data relates to a vectorized instruction, still need to schedule it as a part of pseudo-bundle to correctly handle dependencies between its child nodes.	2025-09-07 10:54:40 -07:00
Phoebe Wang	94b164c218	[X86][AVX10] Remove EVEX512 and AVX10-256 implementations (#157034 ) The 256-bit maximum vector register size control was removed from AVX10 whitepaper, ref: https://cdrdv2.intel.com/v1/dl/getContent/784343 We have warned these options in LLVM21 through #132542. This patch removes underlying implementations in LLVM22.	2025-09-05 14:08:59 +00:00
Alexey Bataev	9a3aedb093	[SLP]Do not try to schedule bundle with non-schedulable parent with commutable instructions Commutable instructions can be reordered during tree building, and if the parent node is not scheduled, its ScheduleData elements are considered independent and the compiler does not look for reordered operands. Need to cancel scheduling of copyables in this case.	2025-09-04 12:57:14 -07:00
Alexey Bataev	005f0fa40e	[SLP]Improved/fixed FMAD support in reductions In the initial patch for FMAD, potential FMAD nodes were completely excluded from the reduction analysis for the smaller patch. But it may cause regressions. This patch adds better detection of scalar FMAD reduction operations and tries to correctly calculate the costs of the FMAD reduction operations (also, excluding the costs of the scalar fmuls) and split reduction operations, combined with regular FMADs. Fixed the handling for reduced values with many uses. Reviewers: RKSimon, gregbedwell, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/152787	2025-09-02 13:09:57 -07:00
Alexey Bataev	6d902b67cd	Revert "[SLP]Improved/fixed FMAD support in reductions" This reverts commit 74230ff2791384fb3285c9e9ab202056959aa095 to fix the bugs found during local testing.	2025-09-02 07:58:29 -07:00
Alexey Bataev	74230ff279	[SLP]Improved/fixed FMAD support in reductions In the initial patch for FMAD, potential FMAD nodes were completely excluded from the reduction analysis for the smaller patch. But it may cause regressions. This patch adds better detection of scalar FMAD reduction operations and tries to correctly calculate the costs of the FMAD reduction operations (also, excluding the costs of the scalar fmuls) and split reduction operations, combined with regular FMADs. Reviewers: RKSimon, gregbedwell, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/152787	2025-09-01 17:01:36 -04:00
Alexey Bataev	a80a1988f7	[SLP]Better support for copyable values in stores Currently stores are sorted by the stored values instruction types, which do not include analysis for copyables. The compiler may miss some potential vectorization opportunities because of that. Patch adds detection of the copyables in stored values. Reviewers: hiraditya, HanKuanChen, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/153213	2025-09-01 16:09:52 -04:00
Sam Tebbs	37127f74f4	[LV] Bundle sub reductions into VPExpressionRecipe (#147255 ) This PR bundles sub reductions into the VPExpressionRecipe class and adjusts the cost functions to take the negation into account. Stacked PRs: 1. https://github.com/llvm/llvm-project/pull/147026 2. -> https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/147302 4. https://github.com/llvm/llvm-project/pull/147513	2025-09-01 17:25:01 +01:00
Alexey Bataev	7730ebce8e	[SLP]Do not to try to revectorize previously vectorized phis in loops No need to try to revectorize previously vectorized phis in loops, it leads to a compile time blow-up. Fixes #155998	2025-08-31 10:54:20 -07:00
Alexey Bataev	e5a4ea20c5	[SLP]Do not remove reduced value, if it is a copyable If the value is checked for the reduction and it is a copyable element in a root node, it should not be deleted, since it may still be used after vectorization. Fixes #155512	2025-08-31 09:09:39 -07:00
Alexey Bataev	eb39605192	[SLP]Do not schedule terminate copyable from main op basic block If the copyable instruction is a terminate instruction from the same block, as the potential main instruction, such instruction cannot be copyable and the value list cannot be modeled as instructions with same (and copyables) opcodes. Fixes #155183	2025-08-30 18:05:08 -07:00
Alexey Bataev	2824b3c00e	[SLP] Try to recalculate deps only for nodes with previously valid deps Need to recalculate the dependencies only for nodes, which have valid deps before they get cleared because of the copyable nodes. Otherwise, no need to recalculate the dependencies to prevent a crash.	2025-08-30 14:20:50 -07:00
Mikhail Gudim	fe6b611d58	[RISCV] Unaligned vec mem => prefer alt opc vec Return `true` in `RISCVTTIImpl::preferAlternateOpcodeVectorization` if subtarget supports unaligned memory accesses.	2025-08-30 04:56:01 -04:00
Mikhail Gudim	fda67dc5b7	[RISCV][NFC] Precommit a test for SLP behavior... when the subtarget has unaligned-vector-mem feature.	2025-08-29 22:09:46 +00:00
Alexey Bataev	b157599156	[SLP]Do not include copyable data to the same user twice If the copyable schedule data is created and the user is used several times in the user node, no need to count same data for the same user several times, need to include it only once. Fixes #153754	2025-08-15 12:36:45 -07:00
Alexey Bataev	09f5b9ab0a	Revert "[SLP]Do not include copyable data to the same user twice" This reverts commit 758c6852c3ffe6b5e259cafadd811e60d8c276fb to fix buildbot https://lab.llvm.org/buildbot/#/builders/195/builds/13298	2025-08-15 12:08:31 -07:00
Alexey Bataev	758c6852c3	[SLP]Do not include copyable data to the same user twice If the copyable schedule data is created and the user is used several times in the user node, no need to count same data for the same user several times, need to include it only once. Fixes #153754	2025-08-15 11:47:35 -07:00
Alexey Bataev	13b54f7dc1	[SLP] Recalculate dependencies for potential control dependencies if cleared If the control dependencies are cleared after cancellation of the copyables, need to recalculate them unconditionally. Fixes #153754 #153676	2025-08-15 07:52:10 -07:00
Alexey Bataev	bf2f241458	[SLP]Support LShr as base for copyable elements Added support for LShr instructions as base for copyable elements. Also, added simple analysis for best base instruction selection, if multiple candidates are available. Fixed scheduling after cancellation Reviewers: hiraditya, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/153393	2025-08-14 19:12:27 -07:00
Alex Bradbury	db5f7dc374	Revert "[SLP]Support LShr as base for copyable elements" This reverts commit ca4ebf95172d24f8c47655709b2c9eb85bda5cb2. Causes compile-time crashes for some inputs with RVV zvl512b/zvl1024b configurations. See here for a minimal reproducer: https://github.com/llvm/llvm-project/pull/153393#issuecomment-3189898813	2025-08-14 22:18:24 +01:00
David Green	5836bae463	[AArch64] Change the cost of fma and fmuladd to match fmul. (#152963 ) As fmul and fmadd are so similar, their performance characteristics tend to be the same on most platforms, at least in terms of reciprocal throughputs. Processors capable of performing a given number of fmul per cycle can usually perform the same number of fma, with the extra add being relatively simple on top. This patch makes the scores of the two operations the same, which brings the throughput cost of a fma/fmuladd to 2, and the latency to 3, which are the defaults for fmul. Note that we might also want to change the throughput cost of a fmul to 1, as most processors have ample bandwidth for them, but they should still stay in-line with one another.	2025-08-14 21:53:45 +01:00
Alexey Bataev	ca4ebf9517	[SLP]Support LShr as base for copyable elements Added support for LShr instructions as base for copyable elements. Also, added simple analysis for best base instruction selection, if multiple candidates are available. Reviewers: hiraditya, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/153393	2025-08-14 12:35:28 -04:00
Alexey Bataev	d57ab276b6	[SLP] Recalculate cleared deps for potential control schedule data nodes Need to recalculate the dependencies for all potential control data schedule nodes to prevent compiler crash. Fixes #153571	2025-08-14 09:00:42 -07:00
Alexey Bataev	dd5ba694bd	[SLP]Recalculate deps for potential control-dependent schedule data After clearing the dependencies in copyable data, need to recalculate dependencies for the original ScheduleData, if it can be marked as control dependent. Fixes #153289	2025-08-13 08:18:26 -07:00
Sam Tebbs	0bfa1718af	[LV] Create in-loop sub reductions (#147026 ) This PR allows the loop vectorizer to handle in-loop sub reductions by forming a normal in-loop add reduction with a negated input. Stacked PRs: 1. -> https://github.com/llvm/llvm-project/pull/147026 2. https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/147302 4. https://github.com/llvm/llvm-project/pull/147513	2025-08-12 10:22:41 +01:00
Alexey Bataev	2d7b55a028	[SLP]Initial support for copyable elements Adds initial support for copyable elements, both schedulable and non-schedulable. Adds support only for add for now, other opcodes will be added in the future. Still some cases are not handled, e.g. stores do not include this, because currently do not check for copyable elements. Reviewers: hiraditya, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/147366	2025-08-11 09:41:19 -04:00
Alexey Bataev	67af2f6c5c	[SLP]Initial FMAD support (#149102 ) Added initial check for potential fmad conversion in reductions and operands vectorization. Added the check for instruction to fix #152683 Skipped the code for reduction to avoid regressions.	2025-08-11 05:53:55 -07:00
David Green	cfe190979e	Revert "[SLP]Initial FMAD support (#149102 )" This reverts commit 0fffb9f9ed81f4c2084b8fe040c88b60bb6c372a due to major performance regressions.	2025-08-10 15:16:01 +01:00
Alexey Bataev	0fffb9f9ed	[SLP]Initial FMAD support (#149102 ) Added initial check for potential fmad conversion in reductions and operands vectorization. Added the check for instruction to fix #152683	2025-08-08 10:30:23 -07:00
Alexey Bataev	0419b459be	Revert "[SLP]Initial FMAD support (#149102 )" This reverts commit 0bcf45ea3458ba79eb4257afcfd6af954292c9ce to fix the regressions, reported in https://github.com/llvm/llvm-project/issues/152683	2025-08-08 09:17:59 -07:00
Alexey Bataev	adae370805	[SLP][NFC]Cleanup undefs and the whole test, NFC	2025-08-07 13:41:22 -07:00
Alexey Bataev	0bcf45ea34	[SLP]Initial FMAD support (#149102 ) Added initial check for potential fmad conversion in reductions and operands vectorization.	2025-08-07 09:51:43 -04:00
Ramkumar Ramachandra	edeee824f0	Reland [VectorUtils] Trivially vectorize ldexp, [l]lround (#152476 ) Changes: The original patch, landed as 1336675, was reverted due to a bug in LoopVectorize resulting in a crash. The bug has now been fixed by 95c32bf ([VPlan] Return invalid cost if any skeleton block has invalid costs), and this reland is identical to the original patch.	2025-08-07 12:07:29 +01:00
Mikhail Gudim	3404c0b013	Slp basic test (#152355 ) Add a basic test for SLPVectorizer to make sure that upcoming refactoring patches don't break anything. Also, record a test for a missed opportunity.	2025-08-06 14:54:50 -04:00
Alexey Bataev	e27831ff9b	[SLP] Fix a check for main/alternate interchanged instruction If the instruction is checked for matching the main instruction, need to check if the opcode of the main instruction is compatible with the operands of the instruction. If they are not, need to check the alternate instruction and its operands for compatibility and return alternate instruction as a match. Fixes #151699 Fixed check for non-supported binary operations.	2025-08-04 11:20:54 -07:00
Michael Halkenhäuser	70af09e3a1	Revert "[SLP] Fix a check for main/alternate interchanged instruction" (#151997 ) This reverts commit 3ee8d047109ea4bb479095f4b153c2120a8d726c. Revert reason: FAILED build for openmp-offload-amdgpu-runtime-2 https://lab.llvm.org/buildbot/#/builders/10/builds/10827	2025-08-04 12:57:20 -04:00
Alexey Bataev	3ee8d04710	[SLP] Fix a check for main/alternate interchanged instruction If the instruction is checked for matching the main instruction, need to check if the opcode of the main instruction is compatible with the operands of the instruction. If they are not, need to check the alternate instruction and its operands for compatibility and return alternate instruction as a match. Fixes #151699	2025-08-04 08:31:35 -07:00
Alexey Bataev	7cd1ce3aa0	[SLP]Check vector-like instruction for dominance in copyables Need to check if the vector-like instruction is dominated by main operation in the copyables to prevent broken def-use chain Fixes #151456	2025-08-04 06:14:19 -07:00
David Green	b30d5315b7	[AArch64] Add better fcmp costs for expanded predicates (#147940 ) Certain fcmp predicates need to be expanded into multiple operations and or'd together. This adds some more accurate cost modelling for them based on the predicate. Unsupported operations are given the cost of a libcall and the latency is set to 2 as that seemed to be fairly common between different CPUs.	2025-08-04 13:42:57 +01:00
Muhammad Omair Javaid	176d54aa33	Revert "[VectorUtils] Trivially vectorize ldexp, [l]lround (#145545 )" This reverts commit 13366759c3b9db9366659d870cc73c938422b020. This broke various LLVM testsuite buildbots for AArch64 SVE, but the problem got masked because relevant buildbots were already failing due to other breakage. It has broken llvm-test-suite test: gfortran-regression-compile-regression__vect__pr106253_f.test https://lab.llvm.org/buildbot/#/builders/4/builds/8164 https://lab.llvm.org/buildbot/#/builders/17/builds/9858 https://lab.llvm.org/buildbot/#/builders/41/builds/8067 https://lab.llvm.org/buildbot/#/builders/143/builds/9607	2025-08-01 01:24:52 +05:00
Ramkumar Ramachandra	13366759c3	[VectorUtils] Trivially vectorize ldexp, [l]lround (#145545 )	2025-07-29 19:23:09 +01:00
Simon Pilgrim	0fa0ce1f3a	[CostModel][X86] Update SK_Broadcast based on cost kinds (#150620 ) When these were converted to CostKindTblEntry the throughput was mainly copied to all cost kinds Regenerated with my check_cost_tables.py helper script	2025-07-26 13:52:47 +01:00

2326 Commits