llvm-project

Author	SHA1	Message	Date
Sushant Gokhale	c9f01f699c	[SLP][AArch64][NFC] Add more tests for SLP vectorization of div (#113876) Currently, we don't have many tests that show the SLP outcome for integer divisions. This patch adds tests for the same. In certain scenarios, for Neon, vectorization is profitable. An attempt will be made in the future to improve the cost model for the same.	2024-10-28 20:37:41 +05:30
Alexey Bataev	7152bf3bc8	[SLP]Do not create new vector node if scalars fully overlap with the existing one If the list of scalars vectorized as the part of the same vector node, no need to generate vector node again, it will be handled as part of overlapping matching. Fixes #113810	2024-10-28 06:59:41 -07:00
Matthias Braun	054c23d78f	X86: Improve cost model of fp16 conversion (#113195) Improve cost-modeling for x86 __fp16 conversions so the SLPVectorizer transforms the patterns: - Override `X86TTIImpl::getStoreMinimumVF` to report a minimum VF of 4 (an SSE register can hold 4xfloat converted/stored to 4xf16); this is necessary as fp16 stores are neither modeled as trunc-stores nor can we mark direct Xxfp16 stores as legal (as we generally expand fp16 operations). - Add missing cost entries to `X86TTIImpl::getCastInstrCost` for conversion from/to fp16. Note that conversion from f64 to f16 is not supported by an X86 instruction.	2024-10-25 16:22:24 -07:00
Jonas Paulsson	aba39c3974	[System] Precommit of test for #112491 (#113704)	2024-10-25 17:40:00 +02:00
Alexey Bataev	e914421d7f	[SLP]Do correct signedness analysis for externally used scalars If a scalar that is used externally is in the root node, it may have incorrect signedness info because of a conflict with the demanded bits analysis. Need to perform exact signedness analysis and compute it rather than rely on the precomputed value, which might be incorrect for alternate zext/sext nodes. Fixes #113520	2024-10-24 08:59:24 -07:00
Alexey Bataev	d2e7ee77d3	[SLP]Do not check for clustered loads only Since SLP supports "clusterization" of non-load instructions, the restriction of reduced values to loads only should be removed to avoid a compiler crash. Fixes #113516	2024-10-24 08:16:42 -07:00
Alexey Bataev	cb5046da26	[SLP]Do not ignore undefs when trying to replace with "poisonous" shuffles Need to consider undefs correctly, when trying to replace them with potentially poisonous values in shuffles. Such elements should not be silently replaced by poison values, instead complex analysis should be implemented to see if it is safe to do it. Fixes #113425	2024-10-24 07:47:23 -07:00
Alexey Bataev	b65b2b4ab6	[SLP]Expand vector to the whole register size in extracts adjustment Need to expand the number of elements to the whole register to correctly process estimation and avoid compiler crash. Fixes #113462	2024-10-23 12:04:40 -07:00
Alexey Bataev	a3508e0246	[SLP]Small buildvector only graph should contain scalars from same block If the graph is small and has a single buildvector node, all scalar instructions must be from the same basic block to prevent a compiler crash. Fixes #113451	2024-10-23 10:46:38 -07:00
Alexey Bataev	4b1b51ac52	[SLP]Initial non-power-of-2 support (but still whole register) for reductions Enables initial non-power-of-2 support (but still requires number of elements, forming whole registers) for reductions. Enables extra vectorization for MultiSource/Benchmarks/7zip/7zip-benchmark, CINT2006/464.h264ref and CFP2017rate/526.blender_r (checked for SSE2) Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/112361	2024-10-21 12:25:39 -07:00
Alexey Bataev	9e03920cbf	[SLP]Ignore root gather node, when searching for reuses Root gather/buildvector node should be ignored when SLP vectorizer tries to find matching gather nodes, vectorized earlier. This node is definitely the last one in the pipeline and it does not have users. It may cause a compiler crash. Fixes #113143	2024-10-21 09:16:16 -07:00
David Green	17ac10c28f	Revert "[SLP]Initial non-power-of-2 support (but still whole register) for reductions" This reverts commit 7f2e937469a8cec3fe977bf41ad2dfb9b4ce648a as it causes regressions in the tests it modifies, and undoes what was added in #100653 (which itself was a fix for a previous regression).	2024-10-21 13:37:44 +01:00
Alexey Bataev	709abacdc3	[SLP]Check that operand of abs does not overflow before making it part of minbitwidth transformation Need to check that the operand of the abs intrinsic can be safely truncated before making it part of the minbitwidth transformation. Fixes #112577	2024-10-18 13:56:19 -07:00
Alexey Bataev	825f9cb1b3	[SLP][NFC]Add a test with the incorrect casting of the abs argument, NFC	2024-10-18 13:44:57 -07:00
Alexey Bataev	e56e9dd8ad	[SLP]Fix minbitwidth emission and analysis for freeze instruction Need to add minbw emission and analysis for freeze instruction to fix incorrect signedness propagation. Fixes #112460	2024-10-18 13:36:37 -07:00
Alexey Bataev	4c4b93dcb9	[SLP][NFC]Add a test with the incorrect casting of freeze instruction operands, NFC	2024-10-18 13:29:18 -07:00
Alexey Bataev	7f2e937469	[SLP]Initial non-power-of-2 support (but still whole register) for reductions Enables initial non-power-of-2 support (but still requires number of elements, forming whole registers) for reductions. Enables extra vectorization for MultiSource/Benchmarks/7zip/7zip-benchmark, CINT2006/464.h264ref and CFP2017rate/526.blender_r (checked for SSE2) Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/112361	2024-10-18 12:50:11 -07:00
Han-Kuan Chen	12bcea3292	[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#111459) reference: https://github.com/llvm/llvm-project/pull/110457	2024-10-18 20:16:56 +07:00
Alexey Bataev	685bec722f	Revert "[SLP]Initial non-power-of-2 support (but still whole register) for reductions" This reverts commit 8287fa8e596d8fc8655c8df3bc99e068ad9f7d4b to investigate and fix compile time regressions reported by https://llvm-compile-time-tracker.com/compare.php?from=ec78f0da0e9b1b8e2b2323e434ea742e272dd913&to=8287fa8e596d8fc8655c8df3bc99e068ad9f7d4b&stat=instructions:u	2024-10-15 12:59:44 -07:00
Alexey Bataev	8287fa8e59	[SLP]Initial non-power-of-2 support (but still whole register) for reductions Enables initial non-power-of-2 support (but still requires number of elements, forming whole registers) for reductions. Enables extra vectorization for MultiSource/Benchmarks/7zip/7zip-benchmark, CINT2006/464.h264ref and CFP2017rate/526.blender_r (checked for SSE2) Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/112361	2024-10-15 12:10:48 -04:00
Alexey Bataev	ab902ee54a	[SLP][NFC]Replace more unreachable terminators by rets, NFC	2024-10-14 07:50:07 -07:00
Alexey Bataev	91a0fecf19	[SLP][NFC]Replace unreachable instructions by rets, NFC.	2024-10-14 07:00:56 -07:00
Alexey Bataev	f9bc00e4bb	[SLP]Initial support for interleaved loads Adds initial support for interleaved loads, which allows emission of segmented loads for RISCV RVV. Vectorizes extra code for RISCV CFP2006/447.dealII, CFP2006/453.povray, CFP2017rate/510.parest_r, CFP2017rate/511.povray_r, CFP2017rate/526.blender_r, CFP2017rate/538.imagick_r, CINT2006/403.gcc, CINT2006/473.astar, CINT2017rate/502.gcc_r, CINT2017rate/525.x264_r Reviewers: RKSimon, preames Reviewed By: preames Pull Request: https://github.com/llvm/llvm-project/pull/112042	2024-10-14 09:12:33 -04:00
Alexey Bataev	4b5018d231	[SLP]Track repeated reduced value as it might be vectorized Need to track changes with the repeated reduced value, since it might be vectorized in the next attempt for reduction vectorization, to correctly generate the code and avoid compiler crash. Fixes #111887	2024-10-10 13:41:56 -07:00
Alexey Bataev	f020bf1526	[SLP]Initial support for non-power-of-2 (but whole reg) vectorization for stores Allows non-power-of-2 vectorization for stores, but still requires, that vectorized number of elements forms full vector registers. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/111194	2024-10-09 15:22:44 -04:00
Alexey Bataev	9f3c55954e	[SLP]Fix loads sorting for loads from different basic blocks Patch fixes lookup for loads from different basic blocks. Originally, the code checked if the main key (combined with parent basic block) was created, but did not include the key into LoadsMap. When the code looked for the load pointer in LoadsMap, it skipped the check for parent basic block and could mix loads from different basic blocks (but the same underlying pointer). Currently, it does not lead to any issues, since later the code compares parent basic blocks and sorts loads properly. But it increases compile time. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/111521	2024-10-08 16:44:16 -04:00
Alexey Bataev	a65a5feb1a	[SLP]Improve masked loads vectorization, attempting gathered loads If the vector of loads can be vectorized as a masked gather and there are several other masked gather nodes, the compiler can check whether it is possible to gather such nodes into a big consecutive/strided load node, which provides better performance. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/110151	2024-10-08 16:43:10 -04:00
Philip Reames	f11568bcb0	Revert "[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)" This reverts commit 554eaec63908ed20c35c8cc85304a3d44a63c634. Change was not approved when landed.	2024-10-07 11:31:57 -07:00
Luke Lau	20864d2cf6	[ValueTypes][RISCV] Add v1bf16 type (#111112) When trying to add RISC-V fadd reduction cost model tests for bf16, I noticed a crash when the vector was of <1 x bfloat>. It turns out that this was being scalarized because unlike f16/f32/f64, there's no v1bf16 value type, and the existing cost model code assumed that the legalized type would always be a vector. This adds v1bf16 to bring bf16 in line with the other fp types. It also adds some more RISC-V bf16 reduction tests which previously crashed, including tests to ensure that SLP won't emit fadd/fmul reductions for bf16 or f16 w/ zvfhmin after #111000.	2024-10-06 22:20:51 +08:00
Han-Kuan Chen	554eaec639	[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)	2024-10-05 14:58:44 +08:00
Alexey Bataev	f74879cf0c	[SLP]Make PHICompare comparator follow strict weak ordering requirement Reviewers: efriedma-quic Reviewed By: efriedma-quic Pull Request: https://github.com/llvm/llvm-project/pull/110529	2024-10-04 14:23:48 -04:00
Alexey Bataev	c0dfef878e	[SLP][NFC]Add a test with potential non-power-of2 (but whole reg) vectorized stores	2024-10-04 11:22:55 -07:00
Alexey Bataev	d991e05452	[SLP]Fix compiler crash on vectorizing gathered loads with different types Need to check not only parents, but also types for compatible loads, when trying to build the vectorizable sequences. Fixes crash reported in https://github.com/llvm/llvm-project/pull/107461#issuecomment-2392980214	2024-10-04 08:36:57 -07:00
Elvina Yakubova	15ee17c3ce	[SLP] Move more X86 tests to common directory (#111134) Some of the tests from the X86 directory can be generalized to improve coverage for other architectures (cont.)	2024-10-04 13:18:56 +01:00
Alexey Bataev	133c1224de	[SLP]Fix a crash on accessing element with index -1 for reused mask with PoisonMaskElem Need to check if the index from the ReuseShuffleIndices mask is not equal to PoisonMaskElem before trying to access the element by index.	2024-10-03 08:24:05 -07:00
Alexey Bataev	c1b911c579	[SLP]Do correct signedness analysis for clustered nodes Should get the signedness info from the original scalar instructions, if possible, to correctly generate sext/zext instructions. Also, the clustered node must be assigned a gather node user info to correctly estimate its bitwidth/sign.	2024-10-02 12:56:49 -07:00
Alexey Bataev	848cb21ddc	[SLP][NFC]Add a test with the incorrect signedness info for subvector	2024-10-02 12:06:08 -07:00
Alexey Bataev	4197e732a5	[SLP]Add debug counter support Fixes #110725 Reviewers: aeubanks Reviewed By: aeubanks Pull Request: https://github.com/llvm/llvm-project/pull/110734	2024-10-02 11:14:34 -07:00
Alexey Bataev	ec7266617f	Revert "[SLP]Add debug counter support" This reverts commit 67dd9d23474bd570d5befaddad0be8a5559b815b to fix https://lab.llvm.org/buildbot/#/builders/11/builds/6012	2024-10-02 10:33:27 -07:00
Alexey Bataev	67dd9d2347	[SLP]Add debug counter support Fixes #110725 Reviewers: aeubanks Reviewed By: aeubanks Pull Request: https://github.com/llvm/llvm-project/pull/110734	2024-10-02 10:00:48 -07:00
Alexey Bataev	4dede756f2	[SLP]Transform nodes before building externally used values transformNodes function may create new vector nodes, so the reduced values might be vectorized later. Need to build the list of the externally used values after the transformNodes() function call to avoid a compiler crash. Fixes #110787	2024-10-02 06:01:25 -07:00
Haowei Wu	948326163c	Revert "[SLP]Add debug counter support" This reverts commit f3c408d1726f6a921212faf68085f68bf8533f0c. This breaks LLVM test on debug-counter.ll	2024-10-01 16:15:30 -07:00
Haowei Wu	ccbda38b70	Revert "[SLP][NFC]Make a test target specific to avoid failures" This reverts commit 998033b3501e96035e4da7027e0a9dfc81c721ad. This breaks llvm test on debug-counter.ll	2024-10-01 16:14:45 -07:00
Alexey Bataev	998033b350	[SLP][NFC]Make a test target specific to avoid failures	2024-10-01 14:08:37 -07:00
Alexey Bataev	f3c408d172	[SLP]Add debug counter support Fixes #110725 Reviewers: aeubanks Reviewed By: aeubanks Pull Request: https://github.com/llvm/llvm-project/pull/110734	2024-10-01 16:21:00 -04:00
Alexey Bataev	b16e694948	[SLP]Try to keep operand of external casts as scalars, if profitable If the cost of original scalar instruction + cast is better than the extractelement from the vector cast instruction, better to keep original scalar instructions, where possible Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/110537	2024-10-01 13:35:42 -04:00
Alexey Bataev	0dab02258a	[SLP][NFC]Add a test with external cast and extracted operand, NFC	2024-10-01 09:23:02 -07:00
Han-Kuan Chen	cc01112660	[SLP][REVEC] getTypeSizeInBits should apply to scalar type instead of FixedVectorType. (#110610) reference: https://github.com/llvm/llvm-project/issues/109835	2024-10-01 19:15:58 +08:00
Han-Kuan Chen	061762933b	[SLP][REVEC] Fix cost model for getBuildVectorCost with FixedVectorType ScalarTy. (#110073) BoUpSLP::gather always uses CreateInsertVector for FixedVectorType ScalarTy.	2024-09-30 21:51:12 +08:00
Alexey Bataev	f49344e19d	[SLP]Check if number of elements forms a full register Need to check if the number of elements forms a full register before trying per-register permutations, to avoid a compiler crash.	2024-09-27 12:54:56 -07:00


1975 Commits