2374 Commits

Author SHA1 Message Date
Michael Bedy
a61889580e
[SLP] Invariant loads cannot have a memory dependency on stores. (#167929) 2025-11-18 09:35:29 +01:00
Alexey Bataev
306b5a3d64 [SLP]Do not consider split nodes, when checking parent PHI-based nodes
The compiler should not consider split vectorize nodes when checking
for non-schedulable PHI-based parent nodes. Only pure PHI nodes must be
considered: only they can be treated as explicit users; split nodes
cannot.

Fixes #168268
2025-11-16 12:39:58 -08:00
Alexey Bataev
326d4e9033 [SLP]Check if the copyable element is a sub instruction with abs in isCommutable
Need to check if the non-copyable element is an instruction before actually
trying to check its NSW attribute.
2025-11-14 16:09:50 -08:00
Alexey Bataev
e8cc0d2207 Revert "[SLP]Check if the copyable element is a sub instruction with abs in isCommutable"
This reverts commit ddf5bb0a2e2d2dd77bce66173387d62ab7174d9f to fix
buildbots  https://lab.llvm.org/buildbot/#/builders/11/builds/28083.
2025-11-14 15:22:55 -08:00
Alexey Bataev
ddf5bb0a2e [SLP]Check if the copyable element is a sub instruction with abs in isCommutable
Need to check if the non-copyable element is an instruction before actually
trying to check its NSW attribute.
2025-11-14 14:53:42 -08:00
Alexey Bataev
0a5be0f997
[SLP]Enable Sub as a base instruction in copyables
This patch adds support for sub instructions as the main instruction in
copyable elements. It also adds a check for whether the base instruction
is unprofitable to select, namely when at least one instruction with the
main opcode is used as an immediate operand.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/163231
2025-11-14 12:30:38 -05:00
Alexey Bataev
75ef0be0c3 [SLP]Be careful when trying match/vectorize copyable nodes with external uses only
Need to be careful when trying to match and/or build a copyable node from
instructions that are used only outside the block and whose operands
immediately precede them. In this case the insertion point might be the
same, which may break the def-use chain.

Fixes #167366
2025-11-11 12:05:56 -08:00
Kazu Hirata
1b3eaacb9d
[llvm] Remove unused local variables (NFC) (#167185)
Identified with bugprone-unused-local-non-trivial-variable.
2025-11-08 22:56:03 -08:00
Kazu Hirata
0028ef667a
[llvm] Remove unused local variables (NFC) (#167106)
Identified with bugprone-unused-local-non-trivial-variable.
2025-11-08 07:41:07 -08:00
Alexey Bataev
96806a7ec3 [SLP]Gather copyable node, if its parent is copyable, but this node is still used outside of the block only
If the current node is a copyable node, its parent is copyable too, and
the current node is still used only outside the block, it is better to
cancel scheduling for such a node, because otherwise a wrong def-use
chain might be built during vectorization.

Fixes #166775
2025-11-06 11:16:55 -08:00
Alexey Bataev
7d5659083c [SLP]Do not create copyable node, if parent node is non-schedulable and has a use in binop.
If the parent node is non-schedulable (only externally used instructions), and at least one instruction has multiple uses and is used in the binop, such a copyable node should not be created. Otherwise, it may contain a wrong def-use chain model, which cannot be effectively detected.

Fixes #166035
2025-11-03 08:00:22 -08:00
Alexey Bataev
964c7711f4 [SLP]Fix the minbitwidth analysis for alternate opcodes
If the alternate operation is stricter than the main operation, we
cannot rely on the analysis of the main operation. In such a case, it is
better to avoid doing the analysis at all, since it may affect the overall
result and lead to incorrect optimization.

Fixes #165878
2025-10-31 15:25:13 -07:00
Alexey Bataev
db6ba82acc [SLP] Do not match the gather node with copyable parent, containing insert instruction
If the gather/buildvector node has the match and this matching node has
a scheduled copyable parent, and the parent node of the original node
has a last instruction, which is non-schedulable and is part of the
schedule copyable parent, such matching node should be excluded as
non-matching, since it produces wrong def-use chain.

Fixes #165435
2025-10-29 11:50:47 -07:00
Alexey Bataev
cf1f4896a7 [SLP]Check only instructions with unique parent instruction user
Need to re-check the instruction with the non-schedulable parent only
if this parent has a user phi node (i.e. it is used only outside the
block) and the user instruction has a unique parent instruction.

Fixes issue reported in 20675ee67d (commitcomment-168863594)
2025-10-28 11:14:18 -07:00
Alexey Bataev
a7b188983f [SLP]Consider non-inst operands, when checking insts, used outside only
If the instructions in the node do not require scheduling and are used
outside the basic block only, still need to check whether their operands
are non-instructions too. Such nodes should be emitted at the beginning
of the block.

Fixes #165151
2025-10-26 12:53:48 -07:00
Alexey Bataev
20675ee67d [SLP] Check all copyable children for non-schedulable parent nodes
If the parent node is non-schedulable and it includes several copies of
the same instruction, its operand might be replaced by the copyable
nodes in multiple children nodes, and if the instruction is commutative,
they can be used in different operands. The compiler shall consider this
opportunity, taking into account that non-copyable children are
scheduled only once for the same parent instruction.

Fixes #164242
2025-10-21 06:39:49 -07:00
Alexey Bataev
8521ffdfaa Revert "[SLP] Check all copyable children for non-schedulable parent nodes"
This reverts commit e7f370f910701b6c67d41dab80e645227692c58b to fix
buildbots  https://lab.llvm.org/buildbot/#/builders/213/builds/1056.
2025-10-20 17:37:32 -07:00
Alexey Bataev
e7f370f910 [SLP] Check all copyable children for non-schedulable parent nodes
If the parent node is non-schedulable and it includes several copies of
the same instruction, its operand might be replaced by the copyable
nodes in multiple children nodes, and if the instruction is commutative,
they can be used in different operands. The compiler shall consider this
opportunity, taking into account that non-copyable children are
scheduled only once for the same parent instruction.

Fixes #164242
2025-10-20 15:52:28 -07:00
Alexey Bataev
154138c25f [SLP]Do not pack div-like copyable values
If the main instruction in the copyables is a div-like instruction, the
compiler cannot pack duplicates and extend them with poisons: these
instructions, once vectorized, would result in undefined behavior.

Fixes #164185
2025-10-20 05:19:42 -07:00
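The hazard described above has a direct scalar analogue, sketched below under stated assumptions. The helper names `isSafeSDiv` and `allLanesSafe` are hypothetical and are not part of the SLP vectorizer; a zero divisor in a padding lane merely stands in for a poison value. The point is only that a lane-wise division is well defined when every lane is, including lanes added as padding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if Num / Den is well defined for 32-bit signed
// integers (no division by zero, no INT32_MIN / -1 overflow).
bool isSafeSDiv(std::int32_t Num, std::int32_t Den) {
  if (Den == 0)
    return false;
  if (Num == std::numeric_limits<std::int32_t>::min() && Den == -1)
    return false;
  return true;
}

// A lane-wise division is only well defined if every single lane is,
// including lanes that exist purely as padding.
template <std::size_t Lanes>
bool allLanesSafe(const std::array<std::int32_t, Lanes> &Num,
                  const std::array<std::int32_t, Lanes> &Den) {
  for (std::size_t I = 0; I < Lanes; ++I)
    if (!isSafeSDiv(Num[I], Den[I]))
      return false;
  return true;
}
```

With two real lanes and two padding lanes, a padding divisor of 0 makes the whole operation unsafe, while a padding divisor of 1 keeps it well defined.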
Alexey Bataev
e6b0be3764 [SLP]Correctly calculate number of copyable operands
The compiler shall not check for overflow of the number of copyable
operands counter, otherwise non-copyable operand can be counted as
copyable and lead to a compiler crash.

Fixes #164164
2025-10-19 12:14:39 -07:00
Mikhail Gudim
eb5de5c60c
[SLPVectorizer] Refactor isStridedLoad, NFC. (#163844)
Move the checks that all strides are the same from `isStridedLoad` to a
new function `analyzeConstantStrideCandidate`. This is to reduce the
diff for the following MRs which will modify the logic in
`analyzeConstantStrideCandidate` to cover the case of widening of the
strided load. All the checks that are left in `isStridedLoad` will be
reused.
2025-10-18 05:01:50 -04:00
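The "all strides are the same" check mentioned above can be illustrated with a small standalone sketch. This is not the LLVM implementation: `findConstantStride` is a hypothetical helper that works on plain byte offsets rather than on pointers or SCEV expressions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: returns the common distance between consecutive
// offsets, or std::nullopt if the distances differ or if there are fewer
// than two offsets to compare.
std::optional<std::int64_t>
findConstantStride(const std::vector<std::int64_t> &Offsets) {
  if (Offsets.size() < 2)
    return std::nullopt;
  const std::int64_t Stride = Offsets[1] - Offsets[0];
  for (std::size_t I = 2; I < Offsets.size(); ++I)
    if (Offsets[I] - Offsets[I - 1] != Stride)
      return std::nullopt;
  return Stride;
}
```

Offsets that decrease by a fixed amount yield a negative stride, so a reversed access pattern is reported the same way as a forward one.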
Alexey Bataev
0fdfad37d8 [SLP]Fix insert point for copyable node with the last inst, used only outside the block
If the copyable entry has the last instruction, used only outside the
block, the insertion point for the vector code should be the last
instruction itself, not the following one. It prevents wrong def-use
sequences, which might be generated for the buildvector nodes.

Fixes #163404
2025-10-17 05:59:48 -07:00
Mircea Trofin
7699762612
[slp][profcheck] Mark selects as having unknown profile (#162960)
There are 2 cases:

- either the `select` condition is a vector of bools, in which case we don't currently have a way to represent the per-element branch probabilities anyway;
- or the select condition is a scalar, for example from a `llvm.vector.reduce`. We could potentially try and do more here - if the reduced vector contained conditions from other selects, for instance

In either case, IIUC, chances are the `select` doesn't get lowered to a branch, at least I'm not seeing any evidence of that in an internal complex application (CSFDO + ThinLTO). Seems sufficient to mark the selects as unknown (for profiled functions); since that metadata carries with it the pass name (`DEBUG_TYPE`) that marked it as such, we can revisit this if we detect later lowerings of these selects that would have required an actual profile.

Issue #147390
2025-10-13 09:06:16 -07:00
Alexey Bataev
739bfdeb91
[SLP]Enable support for logical ops in copyables (#162945)
Allows And, Or and Xor instructions to be used as the base for copyables.
2025-10-13 08:01:32 -04:00
Alexey Bataev
d81ffd4ebb [SLP]Insert postponed vector value after all uses, if the parent node is PHI
Need to insert the vector value for the postponed gather/buildvector
node after all uses not only if the vector value of the user node is
phi, but also if the user node itself is a PHI node, which may produce
vector phi + shuffle.

Fixes #162799
2025-10-12 13:41:08 -07:00
Alexey Bataev
8f168376c1 [SLP]Support non-ordered copyable argument in non-commutative instructions
If the non-commutative user has several same operands and at least one
of them (but not the first) is copyable, need to consider this
opportunity when calculating the number of dependencies. Otherwise, the
schedule bundle might be not scheduled correctly and cause a compiler
crash

Fixes #162925
2025-10-12 10:28:19 -07:00
Alexey Bataev
d3233e806e [SLP]Do not allow undefs being combined with divs
Undefs/poisons with divs in vector operations lead to undefined
behavior, so this combination is disabled.

Fixes #162663
2025-10-10 16:59:05 -07:00
Mikhail Gudim
d78c93077b
[SLPVectorizer] Move size checks (NFC). (#161867)
Add the `analyzeRtStrideCandidate` function. In the future commits we're
going to add the capability to widen strided loads to it. So, in this
commit, we move the size / type checks into it, since it can possibly
change size / type of load.
2025-10-10 20:52:17 +00:00
Alexey Bataev
7f03b22dce
[SLP]Enable SDiv/UDiv support as main op in copyables (#161892)
Allow SDiv/UDiv as a main operation in copyables support
2025-10-08 07:28:06 -04:00
Alexey Bataev
5d7f324614
[SLP]Enable Shl as a base opcode in copyables (#156766)
Enables Shl matching for the nodes, where copyable can be modelled as
shl %v, 0
2025-10-06 07:07:37 -04:00
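As a rough illustration of modelling a plain value as `shl %v, 0`, the sketch below treats the scalar group {a << 1, b, c << 3, d} as one lane-wise shift whose amounts are {1, 0, 3, 0}. The names `Lanes4`, `shlLanes` and `mixedGroup` are hypothetical and exist only for this example; the vectorizer itself works on LLVM IR, not on C++ arrays.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes4 = std::array<std::uint32_t, 4>;

// Lane-wise left shift; every shift amount is assumed to be below 32.
Lanes4 shlLanes(const Lanes4 &V, const Lanes4 &Amt) {
  Lanes4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = V[I] << Amt[I];
  return R;
}

// Scalar group {A << 1, B, C << 3, D}: lanes 1 and 3 are plain values, so
// they are expressed with a shift amount of 0, which leaves them unchanged.
Lanes4 mixedGroup(std::uint32_t A, std::uint32_t B, std::uint32_t C,
                  std::uint32_t D) {
  return shlLanes(Lanes4{A, B, C, D}, Lanes4{1, 0, 3, 0});
}
```

Because shifting by zero is an identity, the unshifted lanes pass through untouched and all four lanes share a single opcode.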
Mikhail Gudim
ec982fac1d
[SLPVectorizer] Change arguments of 'isStridedLoad' (NFC) (#160401)
This is needed to reduce the diff for the future work on widening
strided loads. Also, with this change we'll be able to re-use this for
the case when each pointer represents a start of a group of contiguous
loads.
2025-10-01 16:07:19 -04:00
Mikhail Gudim
e485d5e77a
[SLPVectorizer] Clear TreeEntryToStridedPtrInfoMap. (#160544)
We need to clear `TreeEntryToStridedPtrInfoMap` in `deleteTree`.
2025-09-30 09:25:32 -04:00
Alexey Bataev
1f82553e38 [SLP]Fix mixing xor instructions in the same opcode analysis
Xor with 0 operand should not be compatible with multiplication-based
instructions, only with or/xor/add/sub.

Fixes #161140
2025-09-29 11:14:06 -07:00
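The reason `xor %v, 0` can stand in for or/xor/add/sub but not for a multiplication is that 0 is a right identity for the former and not for the latter. The sketch below spot-checks this on a few sample values; the `Opcode` enum and both functions are hypothetical names for this illustration, and the check is a sanity test rather than a proof.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

enum class Opcode { Add, Sub, Or, Xor, Mul };

// Applies a binary opcode to two unsigned 32-bit values (wrapping).
std::uint32_t apply(Opcode Op, std::uint32_t V, std::uint32_t C) {
  switch (Op) {
  case Opcode::Add:
    return V + C;
  case Opcode::Sub:
    return V - C;
  case Opcode::Or:
    return V | C;
  case Opcode::Xor:
    return V ^ C;
  case Opcode::Mul:
    return V * C;
  }
  return V; // Not reached for valid opcodes.
}

// True if `V op 0 == V` holds for every sampled value.
bool zeroIsRightIdentity(Opcode Op) {
  for (std::uint32_t V : {0u, 1u, 7u, 0xFFFFFFFFu})
    if (apply(Op, V, 0) != V)
      return false;
  return true;
}
```

For `Opcode::Mul` the check fails on the first non-zero sample, since `V * 0` is 0 rather than `V`.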
Alexey Bataev
57947ace14 [SLP]Correctly set the insert point for insertelements with copyable arguments
Need to find the last insertelement instruction in the list for the
copyable arguments, otherwise wrong def-use chain may be built

Fixes #160671
2025-09-25 15:09:23 -07:00
Mikhail Gudim
8bbd95a188
[SLPVectorizer] Move size checks (NFC) (#159361)
Move size checks inside `isStridedLoad`. In the future we plan to
possibly change the size and type of strided load there.
2025-09-23 13:15:05 -04:00
Alexey Bataev
8c41859a21 [SLP]Clear the operands deps of non-schedulable nodes, if previously all operands were copyable
If all operands of the non-schedulable nodes were previously only
copyables, need to clear the dependencies of the original schedule data
for such copyable operands and recalculate them to correctly handle
the number of dependencies.

Fixes #159406
2025-09-18 12:11:33 -07:00
Ramkumar Ramachandra
7fb3a91418
[PatternMatch] Introduce match functor (NFC) (#159386)
A common idiom is the usage of the PatternMatch match function within a
functional algorithm like all_of. Introduce a match functor to shorten
this idiom.

Co-authored-by: Luke Lau <luke@igalia.com>
2025-09-17 21:04:33 +01:00
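The idiom being shortened can be shown without any LLVM headers. In the sketch below, `IsZero`, `match`, `MatchFunctor` and `makeMatcher` are all hypothetical stand-ins written for this example; they mimic the shape of the idea (a pattern object wrapped into a callable) and are not the PatternMatch API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy pattern: matches the integer zero.
struct IsZero {
  bool match(int V) const { return V == 0; }
};

// Free function in the style of a pattern-matching entry point.
template <typename Pattern> bool match(int V, const Pattern &P) {
  return P.match(V);
}

// Functor that binds a pattern once, so it can be handed straight to an
// algorithm instead of writing a lambda around match() each time.
template <typename Pattern> struct MatchFunctor {
  Pattern P;
  bool operator()(int V) const { return P.match(V); }
};

template <typename Pattern> MatchFunctor<Pattern> makeMatcher(const Pattern &P) {
  return MatchFunctor<Pattern>{P};
}

// Long form: a lambda wraps every call to match().
bool allZeroWithLambda(const std::vector<int> &Vals) {
  return std::all_of(Vals.begin(), Vals.end(),
                     [](int V) { return match(V, IsZero{}); });
}

// Short form: the functor is passed to the algorithm directly.
bool allZeroWithFunctor(const std::vector<int> &Vals) {
  return std::all_of(Vals.begin(), Vals.end(), makeMatcher(IsZero{}));
}
```

Both functions compute the same result; the second simply removes the lambda boilerplate.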
Piotr Fusik
2ce04d0a41
[SLP][NFC] Refactor a long if into an early return (#156410) 2025-09-17 18:31:46 +02:00
Mikhail Gudim
66a8f47066
[SLPVectorizer][NFC] Save stride in a map. (#157706)
To avoid calculating the stride of a strided load twice, save it in
a map.
2025-09-16 09:02:09 -04:00
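Caching a computed value in a map is a standard memoization pattern, sketched here in isolation. `StrideCache`, its integer keys and its `clear` method are hypothetical; in the real code the map is keyed by tree entries, and an earlier entry in this log notes that it has to be cleared when the tree is deleted.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>

// Hypothetical cache: computes a stride at most once per entry id.
class StrideCache {
  std::map<int, std::int64_t> Cache;
  int Computations = 0;

public:
  // Returns the cached stride for EntryId, invoking Compute only on a miss.
  std::int64_t getOrCompute(int EntryId,
                            const std::function<std::int64_t()> &Compute) {
    auto It = Cache.find(EntryId);
    if (It != Cache.end())
      return It->second;
    ++Computations;
    return Cache.emplace(EntryId, Compute()).first->second;
  }

  // Stale entries must be dropped whenever the owning structure is rebuilt.
  void clear() { Cache.clear(); }

  // Number of times Compute was actually invoked.
  int computations() const { return Computations; }
};
```

A second lookup for the same id returns the stored value without invoking the callback again; after `clear`, the next lookup recomputes it.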
Alexey Bataev
f2301be0e8 [SLP]Add a check if the user itself is commutable
If the commutable instruction can be represented as a non-commutable
vector instruction (like add 0, %v can be represented as a part of sub
nodes with operation sub %v, 0), its operands might still be reordered
and this should be accounted when checking for copyables in operands

Fixes #158293
2025-09-15 12:50:03 -07:00
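The operand-order issue can be seen in scalar form: `0 + v` and `v + 0` are both valid adds, but only `v - 0` (not `0 - v`) preserves the value when the operation is re-expressed as a sub. The sketch below makes that swap explicit; `SubOperands` and `addWithZeroAsSub` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

struct SubOperands {
  std::int32_t Lhs;
  std::int32_t Rhs;
};

// If one operand of `A + B` is zero, returns operands (V, 0) such that
// `V - 0` equals `A + B`. The zero is always placed on the right, which
// means the operands are swapped when the zero was on the left.
std::optional<SubOperands> addWithZeroAsSub(std::int32_t A, std::int32_t B) {
  if (B == 0)
    return SubOperands{A, 0};
  if (A == 0)
    return SubOperands{B, 0}; // Operands reordered: 0 + B becomes B - 0.
  return std::nullopt;        // Neither operand is zero: no such rewrite.
}
```

Any analysis that looks at "operand 0" or "operand 1" of the original add has to account for this possible reordering.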
Mikhail Gudim
ee3a4f4c94
[SLPVectorizer] Test -1 stride loads. (#158358)
Add a test to generate -1 stride load and flags to force this behaviour.
2025-09-14 15:29:28 -04:00
Garth Lei
8a8a810506
[SLP][NFC] Remove unused local variable in lambda (#156835) 2025-09-11 02:05:55 +00:00
Alexey Bataev
0dddfab54c [SLP]Recalculate deps if the original instruction scheduled after being copyable
If the original instruction is going to be scheduled after the same
instruction has been scheduled as copyable, need to recalculate
dependencies. Otherwise, the dependencies may be calculated incorrectly.
2025-09-10 10:18:45 -07:00
Alexey Bataev
d0ea176cce [SLP]Do not consider SExt/ZExt profitable for demotion, if the user is a bitcast to float
If the user node of the SExt/ZExt node is a bitcast to a floating-point
type, the node itself should not be considered legal to demote, since
the cast is still required to match the size of the floating-point type.

Fixes #157277
2025-09-08 07:59:01 -07:00
Alexey Bataev
fd93dc5ac5 [SLP]Correctly schedule standalone schedule data, which is part of tree entry
If a standalone schedule data relates to a vectorized instruction, still
need to schedule it as a part of pseudo-bundle to correctly handle
dependencies between its child nodes.
2025-09-07 17:08:37 -07:00
Alexey Bataev
c4d927ce09 Revert "[SLP]Correctly schedule standalone schedule data, which is part of tree entry"
This reverts commit 57cae2b6a275a8eb3bc8935973263ed84535fb81 to fix
a buildbot https://lab.llvm.org/buildbot/#/builders/169/builds/14776
2025-09-07 13:27:12 -07:00
Alexey Bataev
57cae2b6a2 [SLP]Correctly schedule standalone schedule data, which is part of tree entry
If a standalone schedule data relates to a vectorized instruction, still
need to schedule it as a part of pseudo-bundle to correctly handle
dependencies between its child nodes.
2025-09-07 10:54:40 -07:00
Alexey Bataev
9a3aedb093 [SLP]Do not try to schedule bundle with non-schedulable parent with commutable instructions
Commutable instructions can be reordered during tree building, and if
the parent node is not scheduled, its ScheduleData elements are
considered independent and the compiler does not look for reordered
operands. Need to cancel scheduling of copyables in this case.
2025-09-04 12:57:14 -07:00
Mikhail Gudim
fdace1ca45
[SLP][NFC]Extract SCEVExpander from calculateRtStride, NFC
Make `calculateRtStride` return the SCEV of rt stride value and let the
caller expand it where needed.
2025-09-03 09:27:16 -04:00
Alexey Bataev
005f0fa40e [SLP]Improved/fixed FMAD support in reductions
In the initial patch for FMAD, potential FMAD nodes were completely
excluded from the reduction analysis for the smaller patch. But it may
cause regressions.

This patch adds better detection of scalar FMAD reduction operations and
tries to correctly calculate the costs of the FMAD reduction operations
(also, excluding the costs of the scalar fmuls) and split reduction
operations, combined with regular FMADs.

Fixed the handling for reduced values with many uses.

Reviewers: RKSimon, gregbedwell, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/152787
2025-09-02 13:09:57 -07:00
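For orientation, a scalar reduction built from multiply-add steps looks like the sketch below. It assumes that "FMAD" here refers to a multiply-add of the form `a * b + acc`; `fmaReduce` is a hypothetical function and says nothing about how the vectorizer models costs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reduces A[i] * B[i] into a single accumulator, one fused multiply-add
// per element. Each step replaces a separate multiply and add.
double fmaReduce(const std::vector<double> &A, const std::vector<double> &B) {
  assert(A.size() == B.size() && "inputs must have equal length");
  double Acc = 0.0;
  for (std::size_t I = 0; I < A.size(); ++I)
    Acc = std::fma(A[I], B[I], Acc);
  return Acc;
}
```

In this form no standalone multiply remains in the loop, which is consistent with the message's point about excluding the cost of the scalar fmuls from the reduction cost.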