llvm-project

Author	SHA1	Message	Date
Florian Hahn	05a2b146fb	[LV] Optimize FindLast recurrences to FindIV (NFCI). (#177870) This patch restructures Find(First|Last)IV handling. Instead of differentiating between FindLast, FindFirstIV and FindLastIV up front, this patch simplifies the logic in IVDescriptor to just identify the FindLast pattern up front. It then adds a new VPlan transformation to optimize FindLast reductions to FindIV reductions if there is a suitable sentinel value. It also merges the Find(Last|First)IV recurrence kinds into a single FindIV kind. This is simpler and more accurate, given selecting the first/last induction of the final IV reduction is directly controlled by the corresponding recurrence kind of the ComputeReductionResult. The new structure also allows further optimizations, like vectorizing FindLastIV with another boolean reduction that tracks if the condition in the loop was ever true, if there is no suitable sentinel value. PR: https://github.com/llvm/llvm-project/pull/177870	2026-02-05 13:57:20 +00:00
Alexey Bataev	46a38488a4	[SLP]Disable modeling disjoint reduction or as bitcast for big endian Big endian targets cannot be modeled as bitcast, need to support it as a reversion/bswap instead, just disabling it for now.	2026-02-03 06:16:25 -08:00
Ryan Buchner	e5b99502d7	[SLP] Avoid adding duplicate VFs into vectorizeStores()::CandidateVFs (#179296) Small compile time improvement: ``` stage1-O3: (-0.01%) stage1-ReleaseThinLTO (-0.00%) stage1-ReleaseLTO-g (-0.01%) stage1-O0-g (-0.00%) stage1-aarch64-O3 (+0.01%) stage1-aarch64-O0-g (-0.02%) stage2-O3 (-0.00%) stage2-O0-g (-0.03%) stage2-clang (+0.00%) ``` Also changes/removes a few comments for clarity.	2026-02-02 11:03:27 -08:00
Ryan Buchner	b936771eea	[SLP][NFC] Refactor vectorizeStores::RangeSizes (#177241) Currently `RangeSizes` is used to allow us to skip trying to vectorize clearly unprofitable trees by caching prior attempts `TreeSizes`. This PR refactors that logic to simplify and improve readability. This will make it easier to handle the strided stores. Switches RangeSizes to use `first` as the location to lookup values from, and `second` as the location to store values to. `first` gets updated by `second` at the appropriate times to match the behavior prior to this change.	2026-01-30 10:25:15 -08:00
Alexey Bataev	b73122d5b7	[SLP]Cast incoming value to a proper type for int nodes, bitcasted to fp Before casting the value to FP type, need to check, if the type was reduced during minbitwidth analysis and need to restore the original source type to generate correct bitcast operation. Fixes #178884	2026-01-30 08:51:03 -08:00
Alexey Bataev	2ea77ed013	[SLP]Support for bswap pattern for bytes-based disjoint or reductions If the reduction forms reversed bitcast, we can represent it as a bitcast + bswap, if the source elements are byte sized Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/178513	2026-01-29 11:28:01 -05:00
Alexey Bataev	f6fe6fc86a	[SLP]Do not vectorize subtrees of the split node, marked as gathers. If the split node was marked as gather/buildvector nodes, the vectorizer should not vectorize its subtrees, which are marked as deleted.	2026-01-28 17:44:38 -08:00
Alexey Bataev	5413a22e79	[SLP] Reordered disjoint or reduction of shl(zext, (0, stride, 2* stride)) modelled as bitcast Added support for reorder reduction of shl(zext)-like construct. Such constructs are modelled currently as shuffle + bitcast. Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/178292	2026-01-28 11:15:54 -05:00
Alexey Bataev	68450ba210	[SLP]Support for tree throttling in SLP graphs with gathered loads Gathered loads forming DAG instead of trees in SLP vectorizer. When doing the throttling analysis for such graphs, need to consider partially matched gathered loads DAG nodes and consider extract and/or gather operations and their costs. The patch adds this analysis and allows cutting off the expensive sub-graphs with gathered loads. Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/177855 Recommit after revert in d733771113339608aff6002d1fa89aaf4a51c502, which was related to a crash in SelectionDAG	2026-01-28 07:19:09 -08:00
Soumik15630m	51845a53fd	[SLP] Fix crash on extractelement with out-of-bounds index.....Fixes … (#176918) …The cost modeling logic was attempting to set a bit in APInt for an out-of-bounds index, causing an assertion failure. This patch ignores OOB indices as they produce poison, which is already handled. Fixes #176780 This is the same test result which produces this bug: [image: https://github.com/user-attachments/assets/80593902-9d15-4e18-850b-a558bca8518e]	2026-01-28 06:08:03 -05:00
Nico Weber	d733771113	Revert "[SLP]Support for tree throttling in SLP graphs with gathered loads" This reverts commit 0666a777ec8138f58ebc7fc41a2fb8097328308a. Makes clang assert, see repro at https://github.com/llvm/llvm-project/pull/177855#issuecomment-3808529832	2026-01-27 21:01:56 -05:00
serge-sans-paille	84cccfc828	[perf] Replace copy-assign by move-assign in llvm/lib/Transforms/* (#178178)	2026-01-27 16:29:35 +00:00
Alexey Bataev	5786ca7bd0	[SLP]Model disjoint or reduction of shl(zext, (0, stride, 2* stride)) as bitcast Patch models the cost and lowering of disjoint or reduction of shl(zext, (0, stride, 2* stride)) as bitcast via modeling as combined ops. Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/177041	2026-01-26 15:59:58 -05:00
Alexey Bataev	0666a777ec	[SLP]Support for tree throttling in SLP graphs with gathered loads Gathered loads forming DAG instead of trees in SLP vectorizer. When doing the throttling analysis for such graphs, need to consider partially matched gathered loads DAG nodes and consider extract and/or gather operations and their costs. The patch adds this analysis and allows cutting off the expensive sub-graphs with gathered loads. Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/177855	2026-01-25 12:22:47 -05:00
Rahul Joshi	358db292cc	[NFC] Fix typo `instrinsic` -> `intrinsic` (#177627)	2026-01-23 13:48:30 -08:00
Alexey Bataev	d64d3735ab	[SLP]Correctly handle vector nodes, coming from same incoming blocks in PHI nodes If multiple nodes are generated from same PHI node for the same block, still need to vectorize vector nodes, even if the value for the incoming block was already emitted. Fixes #177124	2026-01-21 12:59:21 -08:00
Ryan Buchner	9c2124e0f6	[NFC][SLP] Fix typo in assertion (#177079)	2026-01-21 09:39:12 -08:00
Alexey Bataev	3dc5259bc8	[SLP]Do not build bundle for copyables, with parents used in PHI node If the copyables have parents, used in PHI nodes, this causes complex schedulable/non-schedulable dependencies, which require complex processing, but with small profitability. Cut such case early for now to prevent compiler crashes and compile time blow up. Fixes #176658	2026-01-18 13:37:51 -08:00
David Stone	74379c2d44	[llvm][clang] Remove `llvm::OwningArrayRef` (#169126 ) `OwningArrayRef` has several problems. The naming is strange: `ArrayRef` is specifically a non-owning view, so the name means "owning non-owning view". It has a const-correctness bug that is inherent to the interface. `OwningArrayRef<T>` publicly derives from `MutableArrayRef<T>`. This means that the following code compiles: ```c++ void const_incorrect(llvm::OwningArrayRef<int> const a) { a[0] = 5; } ``` It's surprising for a non-reference type to allow modification of its elements even when it's declared `const`. However, the problems from this inheritance (which ultimately stem from the same issue as the weird name) are even worse. The following function compiles without warning but corrupts memory when called: ```c++ void memory_corruption(llvm::OwningArrayRef<int> a) { a.consume_front(); } ``` This happens because `MutableArrayRef::consume_front` modifies the internal data pointer to advance the referenced array forward. That's not an issue for `MutableArrayRef` because it's just a view. It is an issue for `OwningArrayRef` because that pointer is passed as the argument to `delete[]`, so when it's modified by advancing it forward it ceases to be valid to `delete[]`. From there, undefined behavior occurs. It is less convenient than `llvm::SmallVector` for construction. By combining the `size` and the `capacity` together without going through `std::allocator` to get memory, it's not possible to fill in data with the correct value to begin with. Instead, the user must construct an `OwningArrayRef` of the appropriate size, then fill in the data. This has one of two consequences: 1. If `T` is a class type, we have to first default construct all of the elements when we construct `OwningArrayRef` and then in a second pass we can assign to those elements to give what we want. This wastes time and for some classes is not possible. 2. If `T` is a built-in type, the data starts out uninitialized. 
This easily forgotten step means we access uninitialized memory. Using `llvm::SmallVector`, by contrast, has well-known constructors that can fill in the data that we actually want on construction. `OwningArrayRef` has slightly different performance characteristics than `llvm::SmallVector`, but the difference is minimal. The first difference is a theoretical negative for `OwningArrayRef`: by implementing in terms of `new[]` and `delete[]`, the implementation has less room to optimize these calls. However, I say this is theoretical because for clang, at least, the extra freedom of optimization given to `std::allocator` is not yet taken advantage of (see https://github.com/llvm/llvm-project/issues/68365). The second difference is slightly in favor of `OwningArrayRef`: `sizeof(llvm::SmallVector<T>) == sizeof(void *) * 3` on pretty much any implementation, whereas `sizeof(OwningArrayRef) == sizeof(void *) * 2` which seems like a win. However, this is just a misdirection of the accounting costs: array-new sticks bookkeeping information in the allocated storage. There are some cases where this is beneficial to reduce stack usage, but that minor benefit doesn't seem worth the costs. If we actually need that optimization, we'd be better served by writing a `DynamicArray` type that implements a full vector-like feature set (except for operations that change the size of the container) while allocating through `std::allocator` to avoid the pitfalls outlined earlier.	2026-01-17 21:06:25 -07:00
Gabriel Baraldi	72a20b8e29	[SLPVectorizer] Check std::optional coming out of getPointersDiff (#175784) Fixes https://github.com/llvm/llvm-project/issues/175768 There are other unchecked uses of std::optional in this pass but I couldn't figure out a test that triggers them	2026-01-15 09:07:13 -06:00
Alexey Bataev	c322a0c462	[SLP]Do not throttle nodes with split parents, if any of scalars is used in more than one split nodes If the node to throttle is a vector node, which is used in split node, and at least one scalar of such a node is used in many split nodes, such vector node should be throttled. Otherwise there might be wrong def-use chain, which crashes the compiler. Fixes #175967	2026-01-15 03:50:45 -08:00
Graham Hunter	2abd6d6d7a	[LV] Vectorize conditional scalar assignments (#158088) Based on Michael Maitland's previous work: https://github.com/llvm/llvm-project/pull/121222 This PR uses the existing recurrences code instead of introducing a new pass just for CSA autovec. I've also made recipes that are more generic.	2026-01-14 14:59:18 +00:00
Ramkumar Ramachandra	d69335bac9	[LLVM] Clean up code using [not_]equal_to (NFC) (#175824) Use llvm::[not_]equal_to landed in d2a521750 ([ADT] Introduce bind_{front,back}, [not_]equal_to, #175056) across LLVM for cleaner code.	2026-01-13 21:19:39 +00:00
Alexey Bataev	a96cda0e33	[SLP]Update deps for copyables operands, if the user is used several times in node If the user instruction is used several times in the node, and in one case its operand is copyable, but in another is not, need to check all operands to be sure we do not miss scheduling	2026-01-09 15:18:31 -08:00
Alexey Bataev	125a53ce59	Revert "[SLP]Update deps for copyables operands, if the user is used several times in node" This reverts commit 6e1acd061e74f44df6d53d54c78d1e50790456a8 to fix crashes detected in https://lab.llvm.org/buildbot/#/builders/25/builds/14678.	2026-01-08 14:15:25 -08:00
Alexey Bataev	6e1acd061e	[SLP]Update deps for copyables operands, if the user is used several times in node If the user instruction is used several times in the node, and in one case its operand is copyable, but in another is not, need to check all operands to be sure we do not miss scheduling	2026-01-08 12:50:32 -08:00
Alex Bradbury	3ae71d30be	[SLP] Use ConstantInt::getSigned for stride argument to strided load/store intrinsics (#175007) strided-stores-vectorized.ll crashes for RV32 without fixing the relevant logic in vectorizeTree, because the argument can't be represented as a 32-bit unsigned value: ``` llvm::APInt::APInt(unsigned int, uint64_t, bool, bool): Assertion `llvm::isUIntN(BitWidth, val) && "Value is not an N-bit unsigned value"' failed. ``` It is intended to be signed, so we simply use ConstantInt::getSigned instead. This fixes other stride-related instances in the file as well. For further context, this change is part of unblocking rv32gcv llvm-test-suite in CI.	2026-01-08 16:45:02 +00:00
Alexey Bataev	9fb45c5959	[SLP]Do not generate extractelement subnodes with the same indices The compiler should not generate subvectors with the same extractelement instructions, it may cause a crash and lead to inefficient vectorization. Fixes #174773	2026-01-08 07:23:06 -08:00
Alexey Bataev	39456e4226	[SLP]Do not increment dep count for non-schedulable nodes with non-schedulable parents If the node is non-schedulable, all instructions are used outside only and parent is non-schedulable non-phi node, the dependency count should be increased for such nodes Fixes #174599	2026-01-07 10:26:19 -08:00
Ryan Buchner	f180d4bb46	[SLP] Report the correct operand to getArithmeticInstrCost() when duplicated scalars (#174442 ) Before, we were selecting the wrong operand in cases when Scalars contained duplicate values. Stems from #135797. Using: `opt -passes=slp-vectorizer -mtriple=riscv64 -mattr=+v t.ll` ``` target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128" target triple = "riscv64" define void @foo(ptr noalias %A, ptr noalias %B) { entry: %0 = load i32, ptr %B %add = add nsw i32 %0, 1 store i32 %add, ptr %A %arrayidx.1 = getelementptr inbounds nuw i8, ptr %B, i64 4 %1 = load i32, ptr %arrayidx.1 %add.1 = add nsw i32 %1, 1 %arrayidx2.1 = getelementptr inbounds nuw i8, ptr %A, i64 4 store i32 %add.1, ptr %arrayidx2.1 %arrayidx.2 = getelementptr inbounds nuw i8, ptr %B, i64 8 %2 = load i32, ptr %arrayidx.2 %add.2 = add nsw i32 %2, 1 %arrayidx2.2 = getelementptr inbounds nuw i8, ptr %A, i64 8 store i32 %add.2, ptr %arrayidx2.2 %arrayidx.3 = getelementptr inbounds nuw i8, ptr %B, i64 12 %arrayidx2.3 = getelementptr inbounds nuw i8, ptr %A, i64 12 store i32 %add, ptr %arrayidx2.3 %arrayidx.4 = getelementptr inbounds nuw i8, ptr %B, i64 16 %4 = load i32, ptr %arrayidx.4 %add.4 = add nsw i32 %4, 1 %arrayidx2.4 = getelementptr inbounds nuw i8, ptr %A, i64 16 store i32 %add.4, ptr %arrayidx2.4 %arrayidx.5 = getelementptr inbounds nuw i8, ptr %B, i64 20 %5 = load i32, ptr %arrayidx.5 %add.5 = add nsw i32 %5, 1 %arrayidx2.5 = getelementptr inbounds nuw i8, ptr %A, i64 20 store i32 %add.5, ptr %arrayidx2.5 %arrayidx.6 = getelementptr inbounds nuw i8, ptr %B, i64 24 %6 = load i32, ptr %arrayidx.6 %add.6 = add nsw i32 %6, 1 %arrayidx2.6 = getelementptr inbounds nuw i8, ptr %A, i64 24 store i32 %add.6, ptr %arrayidx2.6 %arrayidx.7 = getelementptr inbounds nuw i8, ptr %B, i64 28 %7 = load i32, ptr %arrayidx.7 %add.7 = add nsw i32 %7, 1 %arrayidx2.7 = getelementptr inbounds nuw i8, ptr %A, i64 28 store i32 %add.7, ptr %arrayidx2.7 ret void } ``` The following trace is 
produced, note the wrong operand is used for `Idx > 2` Before: ``` GetScalarCost(), Idx=0 UniqueValues[Idx]: %add = add nsw i32 %0, 1 Op1: %0 = load i32, ptr %B, align 4 GetScalarCost(), Idx=1 UniqueValues[Idx]: %add.1 = add nsw i32 %1, 1 Op1: %1 = load i32, ptr %arrayidx.1, align 4 GetScalarCost(), Idx=2 UniqueValues[Idx]: %add.2 = add nsw i32 %2, 1 Op1: %2 = load i32, ptr %arrayidx.2, align 4 GetScalarCost(), Idx=3 UniqueValues[Idx]: %add.4 = add nsw i32 %3, 1 Op1: %0 = load i32, ptr %B, align 4 GetScalarCost(), Idx=4 UniqueValues[Idx]: %add.5 = add nsw i32 %4, 1 Op1: %3 = load i32, ptr %arrayidx.4, align 4 GetScalarCost(), Idx=5 UniqueValues[Idx]: %add.6 = add nsw i32 %5, 1 Op1: %4 = load i32, ptr %arrayidx.5, align 4 GetScalarCost(), Idx=6 UniqueValues[Idx]: %add.7 = add nsw i32 %6, 1 Op1: %5 = load i32, ptr %arrayidx.6, align 4 ``` After: ``` GetScalarCost(), Idx=0 UniqueValues[Idx]: %add = add nsw i32 %0, 1 Op1: %0 = load i32, ptr %B, align 4 GetScalarCost(), Idx=1 UniqueValues[Idx]: %add.1 = add nsw i32 %1, 1 Op1: %1 = load i32, ptr %arrayidx.1, align 4 GetScalarCost(), Idx=2 UniqueValues[Idx]: %add.2 = add nsw i32 %2, 1 Op1: %2 = load i32, ptr %arrayidx.2, align 4 GetScalarCost(), Idx=3 UniqueValues[Idx]: %add.4 = add nsw i32 %3, 1 Op1: %3 = load i32, ptr %arrayidx.4, align 4 GetScalarCost(), Idx=4 UniqueValues[Idx]: %add.5 = add nsw i32 %4, 1 Op1: %4 = load i32, ptr %arrayidx.5, align 4 GetScalarCost(), Idx=5 UniqueValues[Idx]: %add.6 = add nsw i32 %5, 1 Op1: %5 = load i32, ptr %arrayidx.6, align 4 GetScalarCost(), Idx=6 UniqueValues[Idx]: %add.7 = add nsw i32 %6, 1 Op1: %6 = load i32, ptr %arrayidx.7, align 4 ```	2026-01-05 22:25:25 +00:00
Alexey Bataev	f985e1a113	[SLP]Better copyable vectorization for stores with non-instructions (#174249)	2026-01-03 17:05:55 -05:00
Victor Chernyakin	c438773432	[LLVM][ADT] Migrate users of `make_scope_exit` to CTAD (#174030) This is a followup to #173131, which introduced the CTAD functionality.	2026-01-02 20:42:56 -08:00
Mikhail Gudim	3572e62991	[SLPVectorizer] Widen rt stride loads (#162336) Suppose we are given pointers of the form: `%b + x * %s + y * %c_i` where `%c_i`s are constants and `%s` is a run-time fixed value. If the pointers can be rearranged as follows: ``` %b + 0 * %s + 0 %b + 0 * %s + 1 %b + 0 * %s + 2 ... %b + 0 * %s + w %b + 1 * %s + 0 %b + 1 * %s + 1 %b + 1 * %s + 2 ... %b + 1 * %s + w ... ``` It means that the memory can be accessed with strided loads of width `w` and stride `%s`. This is motivated by the x264 benchmark.	2026-01-02 17:06:11 -05:00
Alexey Bataev	8d75f97662	[SLP]Consider split node as potential reduction root Need to check the first split node as a potential reduction root to prevent compiler crash	2026-01-02 06:42:44 -08:00
Alexey Bataev	a0be4724a9	[SLP] Support for copyables in the reduced values (#153589) Currently reductions can handle only same/alternate instructions, skipping potential support for copyables. Patch adds support for copyables in the reduced values. Recommit after revert in 1febc3f088ef444af378c0a90aaba2195c30472b	2026-01-01 13:31:13 -08:00
Alexey Bataev	1febc3f088	Revert "[SLP] Support for copyables in the reduced values (#153589)" This reverts commit 831bb12a30dbbbf69930c11846a7b62b33e0f0db to fix buildbot https://lab.llvm.org/buildbot/#/builders/224/builds/1205	2026-01-01 08:48:40 -08:00
Alexey Bataev	831bb12a30	[SLP] Support for copyables in the reduced values (#153589) Currently reductions can handle only same/alternate instructions, skipping potential support for copyables. Patch adds support for copyables in the reduced values.	2026-01-01 11:31:28 -05:00
Alexey Bataev	27cf32dafd	[SLP]Fix def-after-use crash for gathered split nodes If the split node is marked as a gather node after non-profitable analysis, need to exclude it from the list of split nodes and include into the list of gather/buildvector nodes Fixes report from https://github.com/llvm/llvm-project/pull/162018#issuecomment-3701928745	2025-12-31 14:12:09 -08:00
Alexey Bataev	55e0b928b5	[SLP]Consider deleted/gathered nodes, when deciding to erase extractelement If any user of the extractelement instruction is part of the node to be deleted/gathered, such extractelement instructions should not be considered for deletion. Fixes #174020	2025-12-31 12:58:42 -08:00
Alexey Bataev	2541b1870e	[SLP]Mark and incompatible for 'xor %a, 0' operations Xor with 0 is incompatible with and, which results in all zero instead of %a https://alive2.llvm.org/ce/z/oEVETS Fixes #174041	2025-12-31 08:30:50 -08:00
Alexey Bataev	1a8f5fa823	[SLP]Exclude non-profitable subtrees. Initial support for SLP tree throttling. Trims non-profitable subtrees, trying to maximize perf gains. Does not support trees with gathered loads yet, since they are not quite trees, but graphs. Analysis should be added later. Reviewers: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/162018 Recommit after revert in 6ec2ec4826b51d7d809fe08b36883a78d7dc0b98 with a fix	2025-12-30 09:32:05 -08:00
Alexey Bataev	6ec2ec4826	Revert "[SLP]Exclude non-profitable subtrees." This reverts commit 79472d366591a39a453c186cf031dda874ddf728 to fix a bug reported in https://github.com/llvm/llvm-project/pull/162018#pullrequestreview-3617073149	2025-12-30 05:59:07 -08:00
Alexey Bataev	79472d3665	[SLP]Exclude non-profitable subtrees. Initial support for SLP tree throttling. Trims non-profitable subtrees, trying to maximize perf gains. Does not support trees with gathered loads yet, since they are not quite trees, but graphs. Analysis should be added later. Reviewers: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/162018	2025-12-29 13:16:37 -05:00
Alexey Bataev	ab450597da	[SLP]Do not swap RHS, if it is used in bool op, used as a second operand in a reduction If the RHS operand is used as a first operand in the bool reduction op, used as a second operand in the reduction ops, still need to use this RHS as RHS, not as LHS https://alive2.llvm.org/ce/z/pmc2YJ Fixes #173796	2025-12-28 13:33:17 -08:00
Alexey Bataev	d9ce80db7a	[SLP]Fix order of bool logical ops, if the right op is used in the first reduction operation If the LHS of the first reduction op is not a first operand, but RHS is, and RHS is the second operand of the first reduction op, still need to emit RHS as a second reduction operand, though without freeze of the LHS operand https://alive2.llvm.org/ce/z/2_JLBu Fixes #173784	2025-12-28 11:52:44 -08:00
Alexey Bataev	42ea774aa6	[SLP]Enable float point math ops as copyables elements. Patch enables support for float point math operations as base instructions for copyable elements. It also fixes some scheduling issues, found during testing Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/169857 Recommit after reverts in 9008922707915a6632fb74ed301bce11d8775e2a and c2441689830fcb2588673dedba98da1219a2fb9e. c2441689830fcb2588673dedba98da1219a2fb9e was caused by other issues, not related to this patch directly	2025-12-26 11:55:58 -08:00
Alexey Bataev	571819cb79	[SLP]Recalculate dependencies for all cleared entries Need to recalculate the dependencies for all cleared items to avoid a crash, if the entry is used in other vector nodes Fixes #173469	2025-12-26 11:17:14 -08:00
Alexey Bataev	a08cc6e0d5	Revert "[SLP]Recalculate dependencies for all cleared entries" This reverts commit 2568ec6cb29da3db5bd7c848ec53a673c1431aea to investigate crashes reported in `2568ec6cb2 (commitcomment-173523022)`.	2025-12-26 06:55:33 -08:00
Alexey Bataev	c244168983	Revert "[SLP]Enable float point math ops as copyables elements." This reverts commit 48be4d07c3ca045fe831cbdf216631202c55cd62 to investigate crashes reported in `2568ec6cb2 (commitcomment-173523022)`.	2025-12-26 06:55:32 -08:00
Alexey Bataev	48be4d07c3	[SLP]Enable float point math ops as copyables elements. Patch enables support for float point math operations as base instructions for copyable elements. It also fixes some scheduling issues, found during testing Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/169857 Recommit after revert in 9008922707915a6632fb74ed301bce11d8775e2a	2025-12-25 12:37:01 -08:00

2444 Commits