llvm-project

Author	SHA1	Message	Date
Han-Kuan Chen	5fc9502f19	[SLP] NFC. ShuffleInstructionBuilder::add V1->getType() is always a FixedVectorType. (#99842) castToScalarTyElem has a cast<VectorType>(V->getType()).	2024-07-24 01:40:24 +08:00
Alexey Bataev	3cb82f49dc	[SLP]Fix PR99899: Use canonical type instead of original vector of ptr. Use adjusted canonical integer type instead of the original ptr type to fix the crash in the TTI. Fixes https://github.com/llvm/llvm-project/issues/99899	2024-07-22 13:05:12 -07:00
Alexey Bataev	f6e01b9ece	[SLP]Do not trunc bv nodes, if the user is vectorized an requires wider type. If at least a single user of the gathered trunc'ed instruction is vectorized and requires a wider type than the trunc node, such gathers/buildvectors should not be optimized for better bitwidth.	2024-07-19 07:28:04 -07:00
Yangyu Chen	007aa6d1b2	[SLP] Increase UsesLimit to 64 (#99467) Since commit 82b800ecb35fb46881aa52000fa40b1b99aa654e addressed the issue #99327, we see some performance regression (13%) on some verilator generated C++ code. This is because the UsesLimit is set to 8, which is too small for the verilator generated code. I have analyzed the need for the UsesLimit from [1] and found that the UsesLimit should be at least 64 to cover most of these cases. Thus, this patch increases the UsesLimit to 64. Link: https://github.com/llvm/llvm-project/issues/99327#issuecomment-2236052879 [1] Signed-off-by: Yangyu Chen <cyy@cyyself.name>	2024-07-19 20:32:28 +08:00
Han-Kuan Chen	39bb244a16	[SLP][REVEC] Make Instruction::Call support vector instructions. (#99317)	2024-07-18 20:49:53 +08:00
Han-Kuan Chen	b634e057dd	[SLP][REVEC] Fix false assumption of the source for castToScalarTyElem. (#99424) The argument V may come from adjustExtracts, which is the vector operand of ExtractElementInst. In addition, it does not exist in getTreeEntry. The vector operand of ExtractElementInst may have a type of <1 x Ty>, ensuring that the number of elements in ScalarTy and VecTy are equal. reference: https://github.com/llvm/llvm-project/issues/99411	2024-07-18 19:54:46 +08:00
Alexey Bataev	82b800ecb3	[SLP][NFC]Limit number of the external uses analysis, NFC. BoUpSLP::buildExternalUses runs through all the users of the vectorized scalars, which may require a significant amount of time if there are too many users. Limit the analysis: if there are too many users, all of them are replaced together rather than individually.	2024-07-17 14:12:22 -07:00
Alexey Bataev	c5c1bd164f	[SLP]Improve minbitwidth analysis for trun'ed gather nodes. If the gather node is trunc'ed, better to trunc scalars and then gather them rather than gather and then trunc. Trunc for scalars is free in most cases. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/99072	2024-07-17 07:41:00 -07:00
Alexey Bataev	05b067b5f9	Revert "[SLP]Improve minbitwidth analysis for trun'ed gather nodes." This reverts commit d3d2f9a4208eedbd2f372c34725ab61c3f4d3aed to fix buildbot https://lab.llvm.org/buildbot/#/builders/92/builds/1880.	2024-07-17 07:31:27 -07:00
Alexey Bataev	d3d2f9a420	[SLP]Improve minbitwidth analysis for trun'ed gather nodes. If the gather node is trunc'ed, better to trunc scalars and then gather them rather than gather and then trunc. Trunc for scalars is free in most cases. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/99072	2024-07-17 07:29:02 -07:00
Alexey Bataev	b05ccaf451	Revert "[SLP]Improve minbitwidth analysis for trun'ed gather nodes." This reverts commit 6425f2d66740b84fc3027b649cd4baf660c384e8 to fix the buildbot issues reported in https://lab.llvm.org/buildbot/#/builders/95/builds/1404.	2024-07-17 05:51:54 -07:00
Han-Kuan Chen	1813ffd6b2	[SLP][REVEC] Make SLP support revectorization (-slp-revec) and add simple test. (#98269) This PR will make SLP support revectorization. Add an option -slp-revec to control the functionality. reference: https://discourse.llvm.org/t/rfc-make-slp-vectorizer-revectorize-vector-instructions/79436	2024-07-17 20:14:12 +08:00
Alexey Bataev	6425f2d667	[SLP]Improve minbitwidth analysis for trun'ed gather nodes. If the gather node is trunc'ed, better to trunc scalars and then gather them rather than gather and then trunc. Trunc for scalars is free in most cases. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/99072	2024-07-17 07:17:25 -04:00
Alexey Bataev	15915c06d5	[SLP]Do not vectorize small (<=2) buildvector/buildvalue sequences with MaxVF==true. If MaxVFOnly for buildvector/buildvalue vectorization is set to true and the total number of elements to vectorize is <= 2, better to try to vectorize reductions at first, which may produce larger tree (reductions have a limit of at least 4 elements to vectorize). Smaller buildvector/buildvalue sequence will be attempted to vectorize later, with MaxVFOnly set to false. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/98957	2024-07-16 12:45:58 -04:00
Alexey Bataev	8ff233f4f1	[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats. The patch enables detection of minnum/maxnum patterns for float point instruction, represented as select/cmp. Also, enables better cost estimation for integer min/max patterns since the compiler starts to estimate the scalars separately. Reviewers: nikic, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/98570	2024-07-16 09:42:08 -07:00
Alexey Bataev	c3540d0b6b	Revert "[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats." This reverts commit c7aac38c29f564bc48f7cfb71d3b3b8b482c873b to fix crashes revealed by the buildbot in https://lab.llvm.org/buildbot/#/builders/168/builds/1104.	2024-07-16 05:59:59 -07:00
Alexey Bataev	c7aac38c29	[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats. The patch enables detection of minnum/maxnum patterns for float point instruction, represented as select/cmp. Also, enables better cost estimation for integer min/max patterns since the compiler starts to estimate the scalars separately. Reviewers: nikic, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/98570	2024-07-16 08:14:27 -04:00
Alexey Bataev	beccecaacd	[SLP]Fix PR98838: do no replace condition of select-based logical op by poison. If the reduction operation is a select-based logical op, the condition should not be replaced by poison; it is better to replace it with a non-poisoning constant to prevent poison propagation in the vector code. Fixes https://github.com/llvm/llvm-project/issues/98838	2024-07-15 07:27:54 -07:00
Alexey Bataev	9e261c5bee	[SLP]Do not salvage debug info from instructions, marked for deletion already. If the instruction was processed already for the deletion, no need to process it second time, it may cause compiler crash.	2024-07-12 08:08:50 -07:00
Alexey Bataev	01a9888694	[SLP][NFC]Add isGather() function and use it instead direct comparison, NFC.	2024-07-11 11:56:32 -07:00
Alexey Bataev	3742c2a83c	[SLP]Use stored signedness after minbitwidth analysis. Need to use the stored signedness info for the root node instead of recalculating it after the vectorization, which may lead to a compiler crash.	2024-07-10 03:58:00 -07:00
Han-Kuan Chen	ac299ed2c7	[SLP] Provide an universal interface for FixedVectorType::get. NFC. (#96845) SLP vectorizes scalar type to vector type. In the future, we will try to make SLP vectorize vector type to vector type. We add a getWidenedType as a helper function. For example, SLP will make the following code %v0 = load i32, ptr %in0, align 4 %v1 = load i32, ptr %in1, align 4 %v2 = load i32, ptr %in2, align 4 %v3 = load i32, ptr %in3, align 4 into a load <4 x i32>. The ScalarTy is i32 and VF is 4. In the future, SLP will make the following code %v0 = load <4 x i32>, ptr %in0, align 4 %v1 = load <4 x i32>, ptr %in1, align 4 %v2 = load <4 x i32>, ptr %in2, align 4 %v3 = load <4 x i32>, ptr %in3, align 4 into a load <16 x i32>. The ScalarTy is <4 x i32> and VF is 4. reference: https://discourse.llvm.org/t/rfc-make-slp-vectorizer-revectorize-vector-instructions/79436	2024-07-10 11:50:35 +08:00
Alexey Bataev	af21bc1917	[SLP]Fix a crash on attempt to revectorize vectorized phi. If the PHI node is vectorized during vectorization of its operands, no need to try to vectorize its operands once again.	2024-07-09 14:11:08 -07:00
Alexey Bataev	822a818786	[SLP][NFC]Add comments for the code, NFC.	2024-07-09 10:06:34 -07:00
Alexey Bataev	a988821123	[SLP]Keep the original order in the reductions. The patch tries to keep the original order of the instruction in the reductions. Previously, two first instructions were switched, giving reverse order. The first step to support of the ordered reductions. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/98025	2024-07-09 12:26:42 -04:00
Alexey Bataev	2cba218ca5	[SLP]Fix PR98133: Inserting PHI after debug-records! The phi-node-to-be-deleted still should be inserted as the first instruction in the block to avoid random compiler crashes. Fixes https://github.com/llvm/llvm-project/issues/98133	2024-07-09 05:44:45 -07:00
Alexey Bataev	f5ee07a1b5	[SLP]Improve instruction reordering mode detection. The "instruction" reordering mode should be selected only if there are compatible instructions in other operands, which can be reordered. Otherwise, better to select splat reordering mode. Metric: size..text. Program: test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test; results 12383340.00, results0 12383324.00, diff -0.0%. Some 4x operations get replaced by 8x. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/97485	2024-07-08 16:01:55 -04:00
Alexey Bataev	385118644c	[SLP]Remove operands upon marking instruction for deletion. If the instruction is marked for deletion, better to drop all its operands and mark them for deletion too (if allowed). It allows to have more vectorizable patterns and generate less useless extractelement instructions. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/97409	2024-07-08 07:56:48 -07:00
Alexey Bataev	4c47b41771	[SLP]Allow matching and shuffling of extractelement vector operands with different VF. Allows better codegen with the free resizing of small VF vector operands and then regular shuffling of the operands of the same size and simplifies the code. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/97414	2024-07-08 09:27:08 -04:00
tcwzxx	c2fe75f99c	Make the logic for checking scatter vectorized nodes of GEP clearer (#97826) There is no functional change. Authored-by: zhizhixu <zhizhixu@tencent.com>	2024-07-08 06:08:04 -04:00
Kazu Hirata	75bc20ff89	[llvm] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#97914)	2024-07-07 08:23:41 +09:00
Jon Roelofs	d3a76b03d8	[llvm][SLPVectorizer] Fix a bad cast assertion (#97621) Fixes: rdar://128092379	2024-07-03 16:25:32 -07:00
Alexey Bataev	873c3f7e78	Revert "[SLP]Remove operands upon marking instruction for deletion." This reverts commit bbd52dd44ceee80e3b6ba6a9b2bd8ee9a9713833 to fix a crash revealed in https://lab.llvm.org/buildbot/#/builders/4/builds/505	2024-07-03 13:05:17 -07:00
Alexey Bataev	bbd52dd44c	[SLP]Remove operands upon marking instruction for deletion. If the instruction is marked for deletion, better to drop all its operands and mark them for deletion too (if allowed). It allows to have more vectorizable patterns and generate less useless extractelement instructions. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/97409	2024-07-03 15:11:18 -04:00
Alexey Bataev	4eecf3c650	[SLP]Reorder buildvector/reduction vectorization and fuse the loops. Currently SLP vectorizer tries at first to find reduction nodes, and then vectorize buildvector sequences. Need to try to vectorize wide buildvector sequences at first and only then try to vectorize reductions, and then smaller buildvector sequences. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/96943	2024-07-03 14:36:30 -04:00
Gabriel Baraldi	380beaec86	Fix potential crash in SLPVectorizer caused by missing check (#95937) I'm not super familiar with this code, but it seems that we were just missing a check. The original code that triggered this did not have uselistorders but llvm-reduce created them and it reproduces the same issue in a way more compact way. Fixes https://github.com/llvm/llvm-project/issues/95016	2024-07-02 08:15:51 -04:00
Youngsuk Kim	2051736f7b	[llvm][Transforms] Avoid 'raw_string_ostream::str' (NFC) Since `raw_string_ostream` doesn't own the string buffer, it is desirable (in terms of memory safety) for users to directly reference the string buffer rather than use `raw_string_ostream::str()`. Work towards TODO comment to remove `raw_string_ostream::str()`.	2024-06-30 09:03:29 -05:00
Alexey Bataev	d70963a762	[SLP]Fix the cost of the adjusted extracts in per-register analysis. Previous patch did not pass the list of the extract indices by reference, so the compiler just ignored them. Pass indices by reference and fix the per-register analysis. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/96808	2024-06-28 14:33:08 -07:00
Alexey Bataev	a9c12e481b	Revert "[SLP]Fix the cost of the adjusted extracts in per-register analysis." This reverts commit 784152056ea40a800a8fd9f4157a428dfb7a6de8 to fix buildbots issues reported in https://lab.llvm.org/buildbot/#/builders/4/builds/315 and https://lab.llvm.org/buildbot/#/builders/35/builds/481	2024-06-28 13:41:51 -07:00
Alexey Bataev	784152056e	[SLP]Fix the cost of the adjusted extracts in per-register analysis. Previous patch did not pass the list of the extract indices by reference, so the compiler just ignored them. Pass indices by reference and fix the per-register analysis. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/96808	2024-06-28 15:49:47 -04:00
Nikita Popov	9df71d7673	[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919) Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.	2024-06-28 08:36:49 +02:00
Alexey Bataev	6f582b7ed3	[SLP][NFC]Remove extra check for VU.	2024-06-26 05:39:37 -07:00
Alexey Bataev	0280f97b36	[SLP]Fix PR95925: extract vectorized index of the potential buildvector sequence. If the vectorized scalar is not the insert value in the buildvector sequence but the index, it should be always extracted.	2024-06-25 14:07:51 -07:00
Alexey Bataev	228c2e1473	[SLP]Fix incorrect promotion of nodes before shuffling. If the base node is signed, but some values are unsigned, still the whole node should be considered signed. Also, an extra bitwidth analysis should be performed, when estimating the minimal bitwidth.	2024-06-25 13:39:28 -07:00
Han-Kuan Chen	de7c1396f2	[SLP] NFC. Refactor and add getAltInstrMask help function. (#94709) Co-authored-by: Alexey Bataev <a.bataev@gmx.com>	2024-06-26 00:42:38 +08:00
Nikita Popov	8263bec533	[SLP] Use poison instead of undef in reorderScalars() (#96619) -1 mask elements are specified to return poison rather than undef nowadays, so update the reorderScalars() implementation to match.	2024-06-25 14:23:40 +02:00
Stephen Tozer	d75f9dd1d2	Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)" Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.	2024-06-24 18:00:22 +01:00
Stephen Tozer	6481dc5761	[IR][NFC] Update IRBuilder to use InsertPosition (#96497) Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.	2024-06-24 17:27:43 +01:00
Simon Pilgrim	f9fc6f6d75	[SLP] Remove dead initialization noticed by static analyser. NFC.	2024-06-21 17:42:01 +01:00
Han-Kuan Chen	be339fd99d	[SLP] NFC. Reduce redundant assignment. (#96149)	2024-06-20 20:09:28 +08:00
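
Note on commit ac299ed2c7: the widening rule its message describes for getWidenedType can be illustrated with a small standalone model. This is only a sketch and not the code from the patch. ToyType, widenType and toString are hypothetical names standing in for LLVM's type classes, and the model assumes the result keeps the element type and multiplies the lane count by VF, which is what the two examples in the commit message suggest.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an LLVM type: an element name plus a lane count.
// NumElts == 1 models a scalar such as i32; NumElts > 1 models <N x Elt>.
// (This toy model cannot tell <1 x i32> apart from i32.)
struct ToyType {
  std::string Elt;
  unsigned NumElts;
};

// Assumed widening rule: keep the element type and produce VF times as many
// lanes as ScalarTy already has. For a true scalar this is just VF lanes.
ToyType widenType(const ToyType &ScalarTy, unsigned VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  return ToyType{ScalarTy.Elt, ScalarTy.NumElts * VF};
}

// Render the toy type in LLVM IR spelling, e.g. "i32" or "<4 x i32>".
std::string toString(const ToyType &T) {
  if (T.NumElts == 1)
    return T.Elt;
  return "<" + std::to_string(T.NumElts) + " x " + T.Elt + ">";
}
```

With ScalarTy = i32 and VF = 4 the model yields <4 x i32>, and with ScalarTy = <4 x i32> and VF = 4 it yields <16 x i32>, matching the two cases in the commit message.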


1795 Commits