2427 Commits

Author SHA1 Message Date
Alexey Bataev
3dc5259bc8 [SLP]Do not build bundle for copyables, with parents used in PHI node
If the copyables have parents that are used in PHI nodes, this causes
complex schedulable/non-schedulable dependencies, which require complex
processing for little profit. Cut such cases early for now to
prevent compiler crashes and compile-time blowup.

Fixes #176658
2026-01-18 13:37:51 -08:00
David Stone
74379c2d44
[llvm][clang] Remove llvm::OwningArrayRef (#169126)
`OwningArrayRef` has several problems.

The naming is strange: `ArrayRef` is specifically a non-owning view, so
the name means "owning non-owning view".

It has a const-correctness bug that is inherent to the interface.
`OwningArrayRef<T>` publicly derives from `MutableArrayRef<T>`. This
means that the following code compiles:

```c++
void const_incorrect(llvm::OwningArrayRef<int> const a) {
	a[0] = 5;
}
```

It's surprising for a non-reference type to allow modification of its
elements even when it's declared `const`. However, the problems from
this inheritance (which ultimately stem from the same issue as the weird
name) are even worse. The following function compiles without warning
but corrupts memory when called:

```c++
void memory_corruption(llvm::OwningArrayRef<int> a) {
	a.consume_front();
}
```

This happens because `MutableArrayRef::consume_front` modifies the
internal data pointer to advance the referenced array forward. That's
not an issue for `MutableArrayRef` because it's just a view. It is an
issue for `OwningArrayRef` because that pointer is passed as the
argument to `delete[]`, so when it's modified by advancing it forward it
ceases to be valid to `delete[]`. From there, undefined behavior occurs.
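
The hazard can be modeled without LLVM headers. The sketch below uses hypothetical `View` and `OwningView` types (they are not the LLVM classes) and, unlike the type described above, records the allocation pointer in a separate member so that the demonstration itself stays well defined. It only shows that the inherited `consume_front` makes the view pointer diverge from the pointer that `new[]` returned.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a mutable, non-owning view.
struct View {
  int *Data = nullptr;
  std::size_t Size = 0;
  // Advances the view by one element; harmless for a pure view.
  // Precondition: Size > 0.
  void consume_front() {
    ++Data;
    --Size;
  }
};

// Hypothetical owning type that inherits the view interface.
// Assumption for this sketch: the allocation pointer is stored separately,
// so the destructor always frees the right address.
struct OwningView : View {
  int *Allocation;
  explicit OwningView(std::size_t N) : Allocation(new int[N]()) {
    Data = Allocation;
    Size = N;
  }
  OwningView(const OwningView &) = delete;
  OwningView &operator=(const OwningView &) = delete;
  ~OwningView() { delete[] Allocation; }
  // True while the view pointer is still the pointer returned by new[].
  bool viewPointerIsDeletable() const { return Data == Allocation; }
};

// Reports whether the view pointer could still be handed to delete[]
// after a caller uses the inherited consume_front.
inline bool deletableAfterConsumeFront() {
  OwningView A(4);
  A.consume_front();
  return A.viewPointerIsDeletable();
}
```

A type that frees `Data` directly, as described above, would pass the advanced pointer to `delete[]` at this point.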

It is less convenient than `llvm::SmallVector` for construction. By
combining the `size` and the `capacity` together without going through
`std::allocator` to get memory, it's not possible to fill in data with
the correct value to begin with. Instead, the user must construct an
`OwningArrayRef` of the appropriate size, then fill in the data. This
has one of two consequences:

1. If `T` is a class type, we have to first default construct all of the
elements when we construct `OwningArrayRef` and then in a second pass we
can assign to those elements to give what we want. This wastes time and
for some classes is not possible.
2. If `T` is a built-in type, the data starts out uninitialized. If the
easily forgotten fill step is skipped, we access uninitialized memory.

Using `llvm::SmallVector`, by contrast, has well-known constructors
that can fill in the data that we actually want on construction.
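
A minimal sketch of the two construction patterns follows. It uses `std::vector` as a stand-in for `llvm::SmallVector` so that it compiles without LLVM headers; the function names and the `NoDefault` type are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// A type without a default constructor: `new NoDefault[N]` does not compile.
struct NoDefault {
  explicit NoDefault(int V) : Value(V) {}
  int Value;
};

// Two-pass pattern: allocate first, then remember to fill every element.
inline std::unique_ptr<int[]> makeTwoPass(std::size_t N, int Fill) {
  std::unique_ptr<int[]> Buf(new int[N]); // elements are uninitialized here
  for (std::size_t I = 0; I != N; ++I)
    Buf[I] = Fill; // skipping this loop would leave the data uninitialized
  return Buf;
}

// One-step pattern: the container constructor produces the final values.
inline std::vector<int> makeFilled(std::size_t N, int Fill) {
  return std::vector<int>(N, Fill);
}

// The one-step pattern also works for types that cannot be
// default constructed.
inline std::vector<NoDefault> makeNoDefault(std::size_t N, int Fill) {
  return std::vector<NoDefault>(N, NoDefault(Fill));
}
```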

`OwningArrayRef` has slightly different performance characteristics than
`llvm::SmallVector`, but the difference is minimal.

The first difference is a theoretical negative for `OwningArrayRef`: by
implementing in terms of `new[]` and `delete[]`, the implementation has
less room to optimize these calls. However, I say this is theoretical
because for clang, at least, the extra freedom of optimization given to
`std::allocator` is not yet taken advantage of (see
https://github.com/llvm/llvm-project/issues/68365)

The second difference is slightly in favor of `OwningArrayRef`:
`sizeof(llvm::SmallVector<T>) == sizeof(void *) * 3` on pretty much any
implementation, whereas `sizeof(OwningArrayRef) == sizeof(void *) * 2`
which seems like a win. However, this is just a misdirection of the
accounting costs: array-new sticks bookkeeping information in the
allocated storage. There are some cases where this is beneficial to
reduce stack usage, but that minor benefit doesn't seem worth the costs.
If we actually need that optimization, we'd be better served by writing
a `DynamicArray` type that implements a full vector-like feature set
(except for operations that change the size of the container) while
allocating through `std::allocator` to avoid the pitfalls outlined
earlier.
2026-01-17 21:06:25 -07:00
Gabriel Baraldi
72a20b8e29
[SLPVectorizer] Check std::optional coming out of getPointersDiff (#175784)
Fixes https://github.com/llvm/llvm-project/issues/175768 
There are other unchecked uses of std::optional in this pass, but I couldn't
figure out a test that triggers them.
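
The pattern of checking an optional result before use can be sketched as below. `elementDiff` and `isConsecutive` are hypothetical stand-ins written for this illustration; they are not the pass's code and do not reproduce the signature of `getPointersDiff`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical query that may fail: returns the distance in elements
// between two byte offsets, or std::nullopt when the distance is not a
// whole number of elements.
inline std::optional<std::int64_t>
elementDiff(std::int64_t OffA, std::int64_t OffB, std::int64_t ElemSize) {
  if (ElemSize <= 0 || (OffB - OffA) % ElemSize != 0)
    return std::nullopt;
  return (OffB - OffA) / ElemSize;
}

// Tests the optional before dereferencing it; an empty result simply
// means "not consecutive" instead of undefined behavior.
inline bool isConsecutive(std::int64_t OffA, std::int64_t OffB,
                          std::int64_t ElemSize) {
  std::optional<std::int64_t> Diff = elementDiff(OffA, OffB, ElemSize);
  return Diff && *Diff == 1;
}
```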
2026-01-15 09:07:13 -06:00
Alexey Bataev
c322a0c462 [SLP]Do not throttle nodes with split parents, if any of scalars is used in more than one split nodes
If the node to throttle is a vector node that is used in a split
node, and at least one scalar of such a node is used in many split
nodes, such vector node should be throttled. Otherwise there might be
a wrong def-use chain, which crashes the compiler.

Fixes #175967
2026-01-15 03:50:45 -08:00
Graham Hunter
2abd6d6d7a
[LV] Vectorize conditional scalar assignments (#158088)
Based on Michael Maitland's previous work:
https://github.com/llvm/llvm-project/pull/121222

This PR uses the existing recurrences code instead of introducing a
new pass just for CSA autovec. I've also made recipes that are more
generic.
2026-01-14 14:59:18 +00:00
Ramkumar Ramachandra
d69335bac9
[LLVM] Clean up code using [not_]equal_to (NFC) (#175824)
Use llvm::[not_]equal_to, which landed in d2a521750 ([ADT] Introduce
bind_{front,back}, [not_]equal_to, #175056), across LLVM for cleaner
code.
2026-01-13 21:19:39 +00:00
Alexey Bataev
a96cda0e33 [SLP]Update deps for copyables operands, if the user is used several times in node
If the user instruction is used several times in the node, and in one
case its operand is copyable but in another it is not, we need to check
all operands to be sure we do not miss scheduling.
2026-01-09 15:18:31 -08:00
Alexey Bataev
125a53ce59 Revert "[SLP]Update deps for copyables operands, if the user is used several times in node"
This reverts commit 6e1acd061e74f44df6d53d54c78d1e50790456a8 to fix
crashes detected in https://lab.llvm.org/buildbot/#/builders/25/builds/14678.
2026-01-08 14:15:25 -08:00
Alexey Bataev
6e1acd061e [SLP]Update deps for copyables operands, if the user is used several times in node
If the user instruction is used several times in the node, and in one
case its operand is copyable but in another it is not, we need to check
all operands to be sure we do not miss scheduling.
2026-01-08 12:50:32 -08:00
Alex Bradbury
3ae71d30be
[SLP] Use ConstantInt::getSigned for stride argument to strided load/store intrinsics (#175007)
strided-stores-vectorized.ll crashes for RV32 without fixing the
relevant logic in vectorizeTree, because the argument can't be
represented as a 32-bit unsigned value:
```
llvm::APInt::APInt(unsigned int, uint64_t, bool, bool): Assertion `llvm::isUIntN(BitWidth, val) && "Value is not an N-bit unsigned value"' failed.
```

It is intended to be signed, so we simply use ConstantInt::getSigned
instead. This fixes other stride-related instances in the file as well.
For further context, this change is part of unblocking rv32gcv
llvm-test-suite in CI.
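
The failure mode can be reproduced with plain integers. The helpers below are local re-implementations written for this sketch (valid for widths 1 to 63); they mirror the meaning of the `isUIntN` check quoted in the assertion but are not the LLVM functions, and the stride value is an assumed example.

```cpp
#include <cassert>
#include <cstdint>

// True if V is representable as a Bits-wide unsigned value.
constexpr bool fitsUnsigned(unsigned Bits, std::uint64_t V) {
  return V < (std::uint64_t{1} << Bits);
}

// True if V is representable as a Bits-wide two's complement value.
constexpr bool fitsSigned(unsigned Bits, std::int64_t V) {
  return V >= -(std::int64_t{1} << (Bits - 1)) &&
         V < (std::int64_t{1} << (Bits - 1));
}

// Assumed example: a negative byte stride on a target with a 32-bit
// index type.
constexpr std::int64_t Stride = -4;

// Reinterpreted as unsigned, -4 becomes 0xFFFFFFFFFFFFFFFC, which does
// not fit in 32 bits; treated as signed, it fits.
constexpr bool UnsignedOk =
    fitsUnsigned(32, static_cast<std::uint64_t>(Stride));
constexpr bool SignedOk = fitsSigned(32, Stride);
```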
2026-01-08 16:45:02 +00:00
Alexey Bataev
9fb45c5959 [SLP]Do not generate extractelement subnodes with the same indices
The compiler should not generate subvectors with the same extractelement
instructions, since this may cause a crash and lead to inefficient
vectorization.

Fixes #174773
2026-01-08 07:23:06 -08:00
Alexey Bataev
39456e4226 [SLP]Do not increment dep count for non-schedulable nodes with non-schedulable parents
If the node is non-schedulable, all instructions are used outside only
and parent is non-schedulable non-phi node, the dependency count should be
increased for such nodes

Fixes #174599
2026-01-07 10:26:19 -08:00
Ryan Buchner
f180d4bb46
[SLP] Report the correct operand to getArithmeticInstrCost() when duplicated scalars (#174442)
Before, we were selecting the wrong operand in cases when Scalars
contained duplicate values. Stems from #135797.

Using:
`opt -passes=slp-vectorizer -mtriple=riscv64 -mattr=+v t.ll`
```
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64"

define void @foo(ptr noalias %A, ptr noalias %B) {
entry:
  %0 = load i32, ptr %B
  %add = add nsw i32 %0, 1
  store i32 %add, ptr %A
  %arrayidx.1 = getelementptr inbounds nuw i8, ptr %B, i64 4
  %1 = load i32, ptr %arrayidx.1
  %add.1 = add nsw i32 %1, 1
  %arrayidx2.1 = getelementptr inbounds nuw i8, ptr %A, i64 4
  store i32 %add.1, ptr %arrayidx2.1
  %arrayidx.2 = getelementptr inbounds nuw i8, ptr %B, i64 8
  %2 = load i32, ptr %arrayidx.2
  %add.2 = add nsw i32 %2, 1
  %arrayidx2.2 = getelementptr inbounds nuw i8, ptr %A, i64 8
  store i32 %add.2, ptr %arrayidx2.2
  %arrayidx.3 = getelementptr inbounds nuw i8, ptr %B, i64 12

  %arrayidx2.3 = getelementptr inbounds nuw i8, ptr %A, i64 12

  store i32 %add, ptr %arrayidx2.3
  %arrayidx.4 = getelementptr inbounds nuw i8, ptr %B, i64 16
  %4 = load i32, ptr %arrayidx.4
  %add.4 = add nsw i32 %4, 1
  %arrayidx2.4 = getelementptr inbounds nuw i8, ptr %A, i64 16
  store i32 %add.4, ptr %arrayidx2.4
  %arrayidx.5 = getelementptr inbounds nuw i8, ptr %B, i64 20
  %5 = load i32, ptr %arrayidx.5
  %add.5 = add nsw i32 %5, 1
  %arrayidx2.5 = getelementptr inbounds nuw i8, ptr %A, i64 20
  store i32 %add.5, ptr %arrayidx2.5
  %arrayidx.6 = getelementptr inbounds nuw i8, ptr %B, i64 24
  %6 = load i32, ptr %arrayidx.6
  %add.6 = add nsw i32 %6, 1
  %arrayidx2.6 = getelementptr inbounds nuw i8, ptr %A, i64 24
  store i32 %add.6, ptr %arrayidx2.6
  %arrayidx.7 = getelementptr inbounds nuw i8, ptr %B, i64 28
  %7 = load i32, ptr %arrayidx.7
  %add.7 = add nsw i32 %7, 1
  %arrayidx2.7 = getelementptr inbounds nuw i8, ptr %A, i64 28
  store i32 %add.7, ptr %arrayidx2.7
  ret void
}
```

The following trace is produced; note that the wrong operand is used for
`Idx > 2`.

Before:
```
GetScalarCost(), Idx=0
UniqueValues[Idx]:   %add = add nsw i32 %0, 1
Op1:   %0 = load i32, ptr %B, align 4
GetScalarCost(), Idx=1
UniqueValues[Idx]:   %add.1 = add nsw i32 %1, 1
Op1:   %1 = load i32, ptr %arrayidx.1, align 4
GetScalarCost(), Idx=2
UniqueValues[Idx]:   %add.2 = add nsw i32 %2, 1
Op1:   %2 = load i32, ptr %arrayidx.2, align 4
GetScalarCost(), Idx=3
UniqueValues[Idx]:   %add.4 = add nsw i32 %3, 1
Op1:   %0 = load i32, ptr %B, align 4
GetScalarCost(), Idx=4
UniqueValues[Idx]:   %add.5 = add nsw i32 %4, 1
Op1:   %3 = load i32, ptr %arrayidx.4, align 4
GetScalarCost(), Idx=5
UniqueValues[Idx]:   %add.6 = add nsw i32 %5, 1
Op1:   %4 = load i32, ptr %arrayidx.5, align 4
GetScalarCost(), Idx=6
UniqueValues[Idx]:   %add.7 = add nsw i32 %6, 1
Op1:   %5 = load i32, ptr %arrayidx.6, align 4
```

After:
```
GetScalarCost(), Idx=0
UniqueValues[Idx]:   %add = add nsw i32 %0, 1
Op1:   %0 = load i32, ptr %B, align 4
GetScalarCost(), Idx=1
UniqueValues[Idx]:   %add.1 = add nsw i32 %1, 1
Op1:   %1 = load i32, ptr %arrayidx.1, align 4
GetScalarCost(), Idx=2
UniqueValues[Idx]:   %add.2 = add nsw i32 %2, 1
Op1:   %2 = load i32, ptr %arrayidx.2, align 4
GetScalarCost(), Idx=3
UniqueValues[Idx]:   %add.4 = add nsw i32 %3, 1
Op1:   %3 = load i32, ptr %arrayidx.4, align 4
GetScalarCost(), Idx=4
UniqueValues[Idx]:   %add.5 = add nsw i32 %4, 1
Op1:   %4 = load i32, ptr %arrayidx.5, align 4
GetScalarCost(), Idx=5
UniqueValues[Idx]:   %add.6 = add nsw i32 %5, 1
Op1:   %5 = load i32, ptr %arrayidx.6, align 4
GetScalarCost(), Idx=6
UniqueValues[Idx]:   %add.7 = add nsw i32 %6, 1
Op1:   %6 = load i32, ptr %arrayidx.7, align 4
```
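
The index mismatch in the traces can be modeled with plain containers. This is a sketch under assumptions, not the patch itself: values are represented as strings, and `operandByUniqueIdx` and `operandByLane` are hypothetical helpers. It assumes the operand list has one entry per lane of `Scalars`, while `UniqueValues` has duplicates removed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Indexes the per-lane operand list with an index into the de-duplicated
// list. This goes wrong for every entry after a skipped duplicate.
inline std::string
operandByUniqueIdx(const std::vector<std::string> &LaneOperands,
                   std::size_t UniqueIdx) {
  return LaneOperands[UniqueIdx];
}

// Finds the lane of the unique value in the original scalar list first,
// then indexes the operand list with that lane.
inline std::string operandByLane(const std::vector<std::string> &Scalars,
                                 const std::vector<std::string> &LaneOperands,
                                 const std::vector<std::string> &UniqueValues,
                                 std::size_t UniqueIdx) {
  auto It = std::find(Scalars.begin(), Scalars.end(), UniqueValues[UniqueIdx]);
  assert(It != Scalars.end() && "unique value must come from Scalars");
  return LaneOperands[static_cast<std::size_t>(
      std::distance(Scalars.begin(), It))];
}
```

With `Scalars = {add, add.1, add.2, add, add.4}` and lane operands `{%0, %1, %2, %0, %3}`, the first helper returns `%0` for unique index 3 and the second returns `%3`, which matches the "Before" and "After" traces for `Idx=3`.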
2026-01-05 22:25:25 +00:00
Alexey Bataev
f985e1a113
[SLP]Better copyable vectorization for stores with non-instructions (#174249)
2026-01-03 17:05:55 -05:00
Victor Chernyakin
c438773432
[LLVM][ADT] Migrate users of make_scope_exit to CTAD (#174030)
This is a followup to #173131, which introduced the CTAD functionality.
2026-01-02 20:42:56 -08:00
Mikhail Gudim
3572e62991
[SLPVectorizer] Widen rt stride loads (#162336)
Suppose we are given pointers of the form: `%b + x * %s + y * %c_i`
where `%c_i`s are constants and %s is a run-time fixed value.
If the pointers can be rearranged as follows:

```
 %b + 0 * %s + 0
 %b + 0 * %s + 1
 %b + 0 * %s + 2
 ...
 %b + 0 * %s + w

 %b + 1 * %s + 0
 %b + 1 * %s + 1
 %b + 1 * %s + 2
 ...
 %b + 1 * %s + w
 ...
```

It means that the memory can be accessed with strided loads of width `w`
and stride `%s`.
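
The rearrangement check can be sketched on concrete numbers. The function below is a hypothetical illustration, not the pass's analysis: offsets are element offsets relative to `%b`, the stride is a plain integer rather than a run-time value, each row holds `Width` consecutive elements, and rows are assumed not to overlap (`Stride >= Width`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if Offsets can be rearranged into rows of Width consecutive
// elements, where row x starts at x * Stride.
inline bool isWidenedStridedAccess(std::vector<std::int64_t> Offsets,
                                   std::int64_t Stride, std::int64_t Width) {
  if (Width <= 0 || Stride < Width || Offsets.empty())
    return false;
  const std::size_t W = static_cast<std::size_t>(Width);
  if (Offsets.size() % W != 0)
    return false;
  // With non-overlapping rows, sorted order is row-major order.
  std::sort(Offsets.begin(), Offsets.end());
  for (std::size_t I = 0; I != Offsets.size(); ++I) {
    const std::int64_t Row = static_cast<std::int64_t>(I / W);
    const std::int64_t Col = static_cast<std::int64_t>(I % W);
    if (Offsets[I] != Row * Stride + Col)
      return false;
  }
  return true;
}
```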

This is motivated by the x264 benchmark.
2026-01-02 17:06:11 -05:00
Alexey Bataev
8d75f97662 [SLP]Consider split node as potential reduction root
Need to check the first split node as a potential reduction root to
prevent compiler crash
2026-01-02 06:42:44 -08:00
Alexey Bataev
a0be4724a9 [SLP] Support for copyables in the reduced values (#153589)
Currently reductions can handle only same/alternate instructions,
skipping potential support for copyables. Patch adds support for
copyables in the reduced values.

Recommit after revert in 1febc3f088ef444af378c0a90aaba2195c30472b
2026-01-01 13:31:13 -08:00
Alexey Bataev
1febc3f088 Revert "[SLP] Support for copyables in the reduced values (#153589)"
This reverts commit 831bb12a30dbbbf69930c11846a7b62b33e0f0db to fix
buildbot https://lab.llvm.org/buildbot/#/builders/224/builds/1205
2026-01-01 08:48:40 -08:00
Alexey Bataev
831bb12a30
[SLP] Support for copyables in the reduced values (#153589)
Currently reductions can handle only same/alternate instructions,
skipping potential support for copyables. Patch adds support for
copyables in the reduced values.
2026-01-01 11:31:28 -05:00
Alexey Bataev
27cf32dafd [SLP]Fix def-after-use crash for gathered split nodes
If the split node is marked as a gather node after non-profitable
analysis, need to exclude it from the list of split nodes and include
it in the list of gather/buildvector nodes

Fixes report from https://github.com/llvm/llvm-project/pull/162018#issuecomment-3701928745
2025-12-31 14:12:09 -08:00
Alexey Bataev
55e0b928b5 [SLP]Consider deleted/gathered nodes, when deciding to erase extractelement
If any user of the extractelement instruction is part of the node to be
deleted/gathered, such extractelement instructions should not be
considered for deletion.

Fixes #174020
2025-12-31 12:58:42 -08:00
Alexey Bataev
2541b1870e [SLP]Mark and incompatible for 'xor %a, 0' operations
Xor with 0 is incompatible with and, which results in all zero instead
of %a
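
The arithmetic behind this can be checked directly: `xor %a, 0` yields `%a`, while `and %a, 0` yields 0, and the operand that leaves `%a` unchanged under `and` is all-ones. The helper names below are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// xor with 0 leaves the value unchanged.
constexpr std::uint32_t viaXorZero(std::uint32_t A) { return A ^ 0u; }

// and with 0 always produces 0, so it cannot stand in for `xor A, 0`.
constexpr std::uint32_t viaAndZero(std::uint32_t A) { return A & 0u; }

// and with all-ones leaves the value unchanged.
constexpr std::uint32_t viaAndAllOnes(std::uint32_t A) { return A & ~0u; }
```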

https://alive2.llvm.org/ce/z/oEVETS

Fixes #174041
2025-12-31 08:30:50 -08:00
Alexey Bataev
1a8f5fa823 [SLP]Exclude non-profitable subtrees.
Initial support for SLP tree throttling. Trims non-profitable subtrees,
trying to maximize perf gains.

Does not support trees with gathered loads yet, since they are not quite
trees, but graphs. Analysis should be added later.

Reviewers: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/162018

Recommit after revert in 6ec2ec4826b51d7d809fe08b36883a78d7dc0b98 with
a fix
2025-12-30 09:32:05 -08:00
Alexey Bataev
6ec2ec4826 Revert "[SLP]Exclude non-profitable subtrees."
This reverts commit 79472d366591a39a453c186cf031dda874ddf728 to fix
a bug reported in https://github.com/llvm/llvm-project/pull/162018#pullrequestreview-3617073149
2025-12-30 05:59:07 -08:00
Alexey Bataev
79472d3665
[SLP]Exclude non-profitable subtrees.
Initial support for SLP tree throttling. Trims non-profitable subtrees,
trying to maximize perf gains.

Does not support trees with gathered loads yet, since they are not quite
trees, but graphs. Analysis should be added later.

Reviewers: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/162018
2025-12-29 13:16:37 -05:00
Alexey Bataev
ab450597da [SLP]Do not swap RHS, if it is used in bool op, used as a second operand in a reduction
If the RHS operand is used as a first operand in the bool reduction op,
used as a second operand in the reduction ops, still need to use this
RHS as RHS, not as LHS

https://alive2.llvm.org/ce/z/pmc2YJ

Fixes #173796
2025-12-28 13:33:17 -08:00
Alexey Bataev
d9ce80db7a [SLP]Fix order of bool logical ops, if the right op is used in the first reduction operation
If the LHS of the first reduction op is not a first operand, but RHS is,
and RHS is the second operand of the first reduction op, still need to
emit RHS as a second reduction operand, though without freeze of the
LHS operand

https://alive2.llvm.org/ce/z/2_JLBu

Fixes #173784
2025-12-28 11:52:44 -08:00
Alexey Bataev
42ea774aa6 [SLP]Enable float point math ops as copyables elements.
Patch enables support for float point math operations as base
instructions for copyable elements. It also fixes some scheduling
issues, found during testing

Reviewers: hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/169857

Recommit after reverts in 9008922707915a6632fb74ed301bce11d8775e2a and
c2441689830fcb2588673dedba98da1219a2fb9e.
c2441689830fcb2588673dedba98da1219a2fb9e was caused by other issues, not
related to this patch directly
2025-12-26 11:55:58 -08:00
Alexey Bataev
571819cb79 [SLP]Recalculate dependencies for all cleared entries
Need to recalculate the dependencies for all cleared items to avoid
a crash, if the entry is used in other vector nodes

Fixes #173469
2025-12-26 11:17:14 -08:00
Alexey Bataev
a08cc6e0d5 Revert "[SLP]Recalculate dependencies for all cleared entries"
This reverts commit 2568ec6cb29da3db5bd7c848ec53a673c1431aea to
investigate crashes reported in 2568ec6cb2 (commitcomment-173523022).
2025-12-26 06:55:33 -08:00
Alexey Bataev
c244168983 Revert "[SLP]Enable float point math ops as copyables elements."
This reverts commit 48be4d07c3ca045fe831cbdf216631202c55cd62
to investigate crashes reported in 2568ec6cb2 (commitcomment-173523022).
2025-12-26 06:55:32 -08:00
Alexey Bataev
48be4d07c3 [SLP]Enable float point math ops as copyables elements.
Patch enables support for float point math operations as base
instructions for copyable elements. It also fixes some scheduling
issues, found during testing

Reviewers: hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/169857

Recommit after revert in 9008922707915a6632fb74ed301bce11d8775e2a
2025-12-25 12:37:01 -08:00
Alexey Bataev
30c6bbe8d3 [SLP]Check if the value has uselist before asking for uses
Need to check if the value has uselist before asking for uses to fix
a compiler crash

Fixes #173569
2025-12-25 10:10:58 -08:00
Alexey Bataev
2568ec6cb2 [SLP]Recalculate dependencies for all cleared entries
Need to recalculate the dependencies for all cleared items to avoid
a crash, if the entry is used in other vector nodes

Fixes #173469
2025-12-24 14:15:11 -08:00
Alexey Bataev
df87e19d3a [SLP]Do not vectorize buildvector tree with scalars in first node, which should remain scalars
Such trees will be revectorized again, causing a compiler hang.

Fixes #172609
2025-12-24 06:41:32 -08:00
Alexey Bataev
9008922707 Revert "[SLP]Enable float point math ops as copyables elements."
This reverts commit e644f06c2ffc23b3415f3478b05c627303aef614 to fix
crashes found during internal testing
2025-12-22 06:48:26 -08:00
Alexey Bataev
a281656b22 Revert "[SLP][NFC]Add parens to silence a warning message, NFC"
This reverts commit 366f6eb607dab74b7be28d3bd72736273329d647.
2025-12-22 06:48:25 -08:00
Alexey Bataev
366f6eb607 [SLP][NFC]Add parens to silence a warning message, NFC
2025-12-21 12:29:38 -08:00
Alexey Bataev
e644f06c2f
[SLP]Enable float point math ops as copyables elements.
Patch enables support for float point math operations as base
instructions for copyable elements. It also fixes some scheduling
issues, found during testing

Reviewers: hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/169857
2025-12-21 14:45:48 -05:00
Alexey Bataev
a88498f02c [SLP]Skip buildvector tree, if all scalars are used externally and remain scalar
If the buildvector is going to be vectorized with threshold cost < 0 and all
buildvector scalars are externally used and remain scalar, such a tree
should not be vectorized. It may lead to a compiler hang because the same
scalars remain in the function and will be vectorized once again.

Fixes #172609
2025-12-21 10:20:48 -08:00
Alexey Bataev
b988555812 [SLP]Check if the extractelement is part of other buildvector node before marking for erasing
Need to check if the extractelement instruction is part of other
buildvector node, before trying to mark it for the deletion, otherwise
the compiler may reuse the deleted instruction.

Fixes #172221
2025-12-15 09:54:05 -08:00
Ramkumar Ramachandra
85fafd5db0
[SCEVExp] Get DL from SE, strip constructor arg (NFC) (#171823)
2025-12-11 14:26:47 +00:00
Alexey Bataev
f8d0c355f5 [SLP]Prefer instructions, used outside the block, as the initial main copyable instructions
Instructions, used outside the block, must be considered the first
choice for the main instructions in the copyable nodes, to avoid
use-before-def.

Fixes #171055
2025-12-08 09:46:15 -08:00
Alexey Bataev
a2a3d89e08 [SLP][NFC]Hoist invariant request for user nodes out of the loop, NFC
2025-12-04 06:57:54 -08:00
Alexey Bataev
e502dce8b5
[SLP][NFC]Simplify analysis of the scalars, NFC.
Just an attempt to simplify some checks, remove extra calls and reorder
checks to make code simpler and faster

Reviewers: RKSimon, hiraditya

Reviewed By: hiraditya

Pull Request: https://github.com/llvm/llvm-project/pull/170382
2025-12-04 08:28:38 -05:00
Nikita Popov
042a38f0bf
[Support] Optimize DebugCounter (#170305)
Currently, DebugCounters work by creating a unique counter ID during
registration, and then using that ID to look up the counter information
in the global registry.

However, this means that anything working with counters has to always go
through the global instance. This includes the fast path that checks
whether any counters are enabled.

Instead, we can drop the counter IDs, and make the counter variables use
CounterInfo themselves. We can then directly check whether the specific
counter is active without going through the global registry. This is
both faster for the fast-path where all counters are disabled, and also
faster for the case where only one counter is active (as the fast-path
can now still be used for all the disabled counters).

After this change, disabled counters become essentially free at runtime,
and we should be able to enable them in non-assert builds as well.
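
The shape of the change can be sketched as follows. `CounterInfo` is named in the message above, but the fields, the `shouldExecute` helper, and the skip policy shown here are assumptions made for illustration, not the actual `DebugCounter` implementation.

```cpp
#include <cassert>
#include <cstdint>

// Assumed per-counter state, owned by the counter variable itself rather
// than looked up by ID in a global registry.
struct CounterInfo {
  bool Active = false;    // set when the user enables this counter
  std::int64_t Count = 0; // number of queries seen while active
  std::int64_t Skip = 0;  // assumed policy: reject the first Skip queries
};

// Fast path: an inactive counter only tests its own flag and never
// touches any shared state.
inline bool shouldExecute(CounterInfo &C) {
  if (!C.Active)
    return true;
  return C.Count++ >= C.Skip;
}
```

Because each counter carries its own flag, enabling one counter does not slow down queries on the others.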
2025-12-03 07:55:06 +01:00
Shih-Po Hung
b9bdec3021
[TTI][Vectorize] Migrate masked/gather-scatter/strided/expand-compress costing (NFCI) (#165532)
In #160470, there is a discussion about the possibility of exploring a
general approach for handling memory intrinsics.

API changes:
- Remove getMaskedMemoryOpCost, getGatherScatterOpCost,
getExpandCompressMemoryOpCost, getStridedMemoryOpCost from
Analysis/TargetTransformInfo.
- Add getMemIntrinsicInstrCost.

In BasicTTIImpl, map intrinsic IDs to existing target implementation
until the legacy TTI hooks are retired.
- masked_load/store → getMaskedMemoryOpCost
- masked_/vp_gather/scatter → getGatherScatterOpCost
- masked_expandload/compressstore → getExpandCompressMemoryOpCost
- experimental_vp_strided_{load,store} → getStridedMemoryOpCost
TODO: add support for vp_load_ff.

No functional change intended; costs continue to route to the same
target-specific hooks.
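
The routing listed above amounts to a dispatch table. The sketch below uses hypothetical enums in place of the real intrinsic IDs and hook signatures, which are not reproduced here; only the grouping is taken from the list in this message.

```cpp
#include <cassert>

// Hypothetical identifiers for the intrinsics named in the list above.
enum class MemIntrinsic {
  MaskedLoad,
  MaskedStore,
  MaskedGather,
  MaskedScatter,
  VPGather,
  VPScatter,
  MaskedExpandLoad,
  MaskedCompressStore,
  VPStridedLoad,
  VPStridedStore
};

// Hypothetical tags for the four legacy cost hooks.
enum class LegacyHook {
  MaskedMemoryOp,
  GatherScatterOp,
  ExpandCompressMemoryOp,
  StridedMemoryOp
};

// Maps each intrinsic to the legacy hook that keeps computing its cost.
inline LegacyHook legacyHookFor(MemIntrinsic ID) {
  switch (ID) {
  case MemIntrinsic::MaskedLoad:
  case MemIntrinsic::MaskedStore:
    return LegacyHook::MaskedMemoryOp;
  case MemIntrinsic::MaskedGather:
  case MemIntrinsic::MaskedScatter:
  case MemIntrinsic::VPGather:
  case MemIntrinsic::VPScatter:
    return LegacyHook::GatherScatterOp;
  case MemIntrinsic::MaskedExpandLoad:
  case MemIntrinsic::MaskedCompressStore:
    return LegacyHook::ExpandCompressMemoryOp;
  case MemIntrinsic::VPStridedLoad:
  case MemIntrinsic::VPStridedStore:
    return LegacyHook::StridedMemoryOp;
  }
  return LegacyHook::MaskedMemoryOp; // not reached for valid enumerators
}
```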
2025-11-28 05:14:37 +00:00
Alexey Bataev
54d9d4d868 [SLP]Check if the non-schedulable phi parent node has unique operands
Need to check if the non-schedulable phi parent node has unique
operands, if the incoming node has copyables, and the node is
commutative. Otherwise, there might be issues with the correct
calculation of the dependencies.

Fixes #168589
2025-11-20 10:51:31 -08:00
Alexey Bataev
2c3aa92089 [SLP]Fix insertion point for setting for the nodes
Many of the def-use chain problems in the SLP vectorizer are
related to the fact that some nodes reuse the same instruction as the
insertion point. An insertion point is not an instruction, but the place
between instructions. To set it correctly, it is better to generate a pseudo
instruction immediately after the last instruction and use it as the
insertion point. This resolves the issues in most cases.

Fixes #168512 #168576
2025-11-19 17:15:24 -08:00