llvm-project

Author	SHA1	Message	Date
Craig Topper	7ec4f6094e	[InstCombine] Infer disjoint flag on Or instructions. (#72912) The disjoint flag was recently added to IR in #72583. We already set it when we turn an add into an or. This patch sets it on Ors that weren't converted from an Add.	2023-12-02 14:11:12 -08:00
Craig Topper	03d4a9d94d	[InstCombine] Set disjoint flag when turning Add into Or. (#72702) The disjoint flag was recently added to IR in #72583.	2023-11-27 12:54:11 -08:00
Yingwei Zheng	6d667d4b26	[InstCombine] Combine const GEP chains This patch reverts rGae739aefd7473517d3f08b5c8d08a66c7f469198 to address performance regressions reported by our [CI](https://github.com/dtcxzyw/llvm-ci/issues/137) after rG2ec1d0f427c7822540352c0c14d057e7bfe4f77b. For example: ``` define ptr @const_gep_chain(ptr %p, i64 %a) { %p1 = getelementptr inbounds i8, ptr %p, i64 %a %p2 = getelementptr inbounds i8, ptr %p1, i64 1 %p3 = getelementptr inbounds i8, ptr %p2, i64 2 %p4 = getelementptr inbounds i8, ptr %p3, i64 3 ret ptr %p4 } ``` The last three GEPs will not be folded since rG2ec1d0f427c7822540352c0c14d057e7bfe4f77b. I think it is appropriate to remove this code because there is no compile-time regression reported in our benchmarks. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149240	2023-05-02 00:28:39 +08:00
Nikita Popov	2ec1d0f427	[InstCombine] Don't reassociate GEPs for loop invariance Since D146813, LICM will reassociate GEPs to expose hoisting opportunities itself. Don't perform this transform in InstCombine, where it is fragile because it depends on an optional LoopInfo analysis.	2023-04-18 12:17:07 +02:00
Florian Hahn	962c306a11	[LV] Don't consider pointer as uniform if it is also stored. Update isVectorizedMemAccessUse to also check if the pointer is stored. This prevents LV from incorrectly considering a pointer as uniform if it is used as both pointer and stored value by the same StoreInst. Fixes #61396.	2023-03-17 16:26:16 +00:00
Florian Hahn	a4bb037418	[LV] Add test where pointer is incorrectly marked as uniform. Test for #61396.	2023-03-17 14:24:13 +00:00
Florian Hahn	565b98e793	[LV] Convert consecutive-ptr-uniforms.ll to use opaque pointers (NFC).	2023-03-17 14:07:11 +00:00
Sanjay Patel	ef6f23535d	Revert "[InstCombine] use loop info when running the pass after loop vectorization" This reverts commit 43ae4b62b2671cf73e691c0b53324cd39405cd51. This was intended to be practically NFC in terms of the overall opt pipeline, but there is experimental data showing that code changes occurred here: https://llvm-compile-time-tracker.com/compare.php?from=772aa05452f8ff90a47168e6801cda2acb5a1873&to=43ae4b62b2671cf73e691c0b53324cd39405cd51&stat=size-text	2023-03-11 17:28:56 -05:00
Sanjay Patel	43ae4b62b2	[InstCombine] use loop info when running the pass after loop vectorization This is the follow-up to D144199 and suggestion from D144045. We make use of loop info explicit via InstCombine pass parameter rather than semi-arbitrary via caching. The only InstCombine transform that uses LoopInfo currently is a GEP fold in visitGEPOfGEP(), so that shows up as a failure in the dedicated test for the fold as well as several LoopVectorizer tests that run extra passes. I don't see any pass manager regression tests that actually check for pass options, but this is intended to be NFC for the pass pipeline behavior - we only try to use loop info where it would have been used before via caching. Differential Revision: https://reviews.llvm.org/D144274	2023-03-11 14:20:30 -05:00
Nikita Popov	9ed2f14c87	[AsmParser] Remove typed pointer auto-detection IR is now always parsed in opaque pointer mode, unless -opaque-pointers=0 is explicitly given. There is no automatic detection of typed pointers anymore. The -opaque-pointers=0 option is added to any remaining IR tests that haven't been migrated yet. Differential Revision: https://reviews.llvm.org/D141912	2023-01-18 09:58:32 +01:00
Sebastian Peryt	99c9b37d11	[NFC][1/n] Remove -enable-new-pm=0 flags from lit tests This is the first patch in a series intended to remove the -enable-new-pm=0 flag from lit tests. This is part of a bigger effort to completely remove legacy code related to the legacy pass manager in favor of the new pass manager, which is now the default. In this patch, the flag has been removed only from tests where no significant change was required because the checks have been duplicated for both PMs. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D134150	2022-09-19 09:57:37 -07:00
Nikita Popov	36cbdaa163	[InstCombine] Fix inbounds preservation when swapping GEPs (PR44206) When reassociating GEPs, we can only keep inbounds if both original GEPs were inbounds, and their offsets have the same sign. For the sake of simplicity, I only handle the case where both offsets are non-negative here. It would probably be fine to just not preserve inbounds at all here, but as I don't see a compile-time impact for adding the isKnownNonNegative() calls I went with this more conservative approach. Fixes https://github.com/llvm/llvm-project/issues/44206. Differential Revision: https://reviews.llvm.org/D126687	2022-05-31 15:45:02 +02:00
Nikita Popov	ff20ee32d8	[LoopVectorize] Remove incorrect nuw flag from test (NFC) nuw does not make sense for reverse iteration.	2022-05-10 12:17:09 +02:00
Florian Hahn	95b2aa511e	[VPlan] Set VPlan header block name to vector.body. This brings the VPlan block naming in line with the naming of the generated basic blocks.	2022-04-02 19:34:32 +01:00
Sander de Smalen	28a4deab92	[LV] Fix incorrectly marking a pointer indvar as 'scalar'. collectLoopScalars should only add non-uniform nodes to the list if they are used by a load/store instruction that is marked as CM_Scalarize. Before this patch, the LV incorrectly marked pointer induction variables as 'scalar' when they were required to be widened by something else, such as a compare instruction, and weren't used by a node marked as 'CM_Scalarize'. This case is covered by sve-widen-phi.ll. This change also allows removing some code where the LV tried to widen the PHI nodes with a stepvector, even though it was marked as 'scalarAfterVectorization'. Now that this code is more careful about marking instructions that need widening as 'scalar', this code has become redundant. Differential Revision: https://reviews.llvm.org/D114373	2021-11-28 09:49:28 +00:00
Simon Pilgrim	10c982e0b3	Revert rG1c9bec727ab5c53fa060560dc8d346a911142170: [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) Reverted (manually due to merge conflicts) while regressions reported on PR51540 are investigated. As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450	2021-08-23 21:09:26 +01:00
Simon Pilgrim	1c9bec727a	[InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450	2021-07-22 10:58:51 +01:00
Florian Hahn	23c2f2e6b2	[LV] Mark increment of main vector loop induction variable as NUW. This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail. If the tail is not folded, we know that End - Start >= Step (either statically or through the minimum iteration checks). We also know that both Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV + %Step == %End. Hence we must exit the loop before %IV + %Step unsigned overflows and we can mark the induction increment as NUW. This should make SCEV return more precise bounds for the created vector loops, used by later optimizations, like late unrolling. At the moment quite a few tests still need to be updated, but before doing so I'd like to get initial feedback to make sure I am not missing anything. Note that this could probably be further improved by using information from the original IV. An attempt at modeling the assumption in Alive2: https://alive2.llvm.org/ce/z/H_DL_g Part of a set of fixes required for PR50412. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D103255	2021-06-07 10:47:52 +01:00
Juneyoung Lee	4a8e6ed2f7	[SLP,LV] Use poison constant vector for shufflevector/initial insertelement This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef. This is part of the effort to use a poison vector instead of undef to represent a "don't care" vector. The goal is to make nice shufflevector optimizations valid that are currently incorrect due to the tricky interaction between undef and poison (see https://bugs.llvm.org/show_bug.cgi?id=44185). Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D94061	2021-01-06 11:22:50 +09:00
Juneyoung Lee	278aa65cc4	[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93793	2020-12-30 04:21:04 +09:00
Arthur Eubanks	dc93a8d1e2	[test] Fix Transforms/LoopVectorize under NPM The -enable-new-pm=1 translation caused loop-vectorize to run on all functions, then instcombine, rather than all passes on one function then the next. This caused the output of -debug-only and -print-after to be interleaved in an unexpected way.	2020-12-07 21:48:21 -08:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00
Daniel Neilson	9e4bbe801a	[LV] Preserve inbounds on created GEPs Summary: This is a fix for PR23997. The loop vectorizer is not preserving the inbounds property of GEPs that it creates. This is inhibiting some optimizations. This patch preserves the inbounds property in the case where a load/store is being fed by an inbounds GEP. Reviewers: mkuper, javed.absar, hsaito Reviewed By: hsaito Subscribers: dcaballe, hsaito, llvm-commits Differential Revision: https://reviews.llvm.org/D46191 llvm-svn: 331269	2018-05-01 15:35:08 +00:00
Sebastian Pop	d870aea03e	[InstCombine] reassociate loop invariant GEP chains to enable LICM This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib. do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1 In this example %idx.ext1 is loop invariant. It will be moved above the use of the loop induction variable %idx.ext so that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sunk down in the GEP chain. This patch will produce the following output: do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1 %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext llvm-svn: 328539	2018-03-26 16:19:31 +00:00
Matthew Simpson	c86b2134c7	[LV] Consider users that are memory accesses in uniforms expansion step When expanding the set of uniform instructions beyond the seed instructions (e.g., consecutive pointers), we mark a new instruction uniform if all its loop-varying users are uniform. We should also allow users that are consecutive or interleaved memory accesses. This fixes cases where we have an instruction that is used as the pointer operand of a consecutive access but also used by a non-memory instruction that later becomes uniform as part of the expansion. llvm-svn: 297179	2017-03-07 18:47:30 +00:00
Matthew Simpson	a4964f291a	Reapply "[LV] Enable vectorization of loops with conditional stores by default" This patch reapplies r289863. The original patch was reverted because it exposed a bug causing the loop vectorizer to crash in the Python runtime on PPC. The underlying issue was fixed with r289958. llvm-svn: 289975	2016-12-16 19:12:02 +00:00
Chandler Carruth	48b4e614d8	Revert r289863: [LV] Enable vectorization of loops with conditional stores by default This uncovers a crasher in the loop vectorizer on PPC when building the Python runtime. I'll send the testcase to the review thread for the original commit. llvm-svn: 289934	2016-12-16 11:31:39 +00:00
Matthew Simpson	6a98bcfe33	[LV] Enable vectorization of loops with conditional stores by default This patch sets the default value of the "-enable-cond-stores-vec" command line option to "true". Differential Revision: https://reviews.llvm.org/D27814 llvm-svn: 289863	2016-12-15 20:11:05 +00:00
Matthew Simpson	c62266d680	[LV] Sink scalar operands of predicated instructions When we predicate an instruction (div, rem, store) we place the instruction in its own basic block within the vectorized loop. If a predicated instruction has scalar operands, it's possible to recursively sink these scalar expressions into the predicated block so that they might avoid execution. This patch sinks as much scalar computation as possible into predicated blocks. We previously were able to sink such operands only if they were extractelement instructions. Differential Revision: https://reviews.llvm.org/D25632 llvm-svn: 285097	2016-10-25 18:59:45 +00:00
Matthew Simpson	b25e87fca5	[LV] Process pointer IVs with PHINodes in collectLoopUniforms This patch moves the processing of pointer induction variables in collectLoopUniforms from the consecutive pointer phase of the analysis to the phi node phase. Previously, if a pointer induction variable was used by both a scalarized non-memory instruction as well as a vectorized memory instruction, we would incorrectly identify the pointer as uniform. Pointer induction variables should be treated the same as other phi nodes. That is, they are uniform if all users of the induction variable and induction variable update are uniform. Differential Revision: https://reviews.llvm.org/D24511 llvm-svn: 281485	2016-09-14 14:47:40 +00:00
Matthew Simpson	bfe5e1817b	[LV] Ensure proper handling of multi-use case when collecting uniforms The test case included in r280979 wasn't checking what it was supposed to be checking for the predicated store case. Fixing the test revealed that the multi-use case (when a pointer is used by both vectorized and scalarized memory accesses) wasn't being handled properly. We can't skip over non-consecutive-like pointers since they may have looked consecutive-like with a different memory access. llvm-svn: 280992	2016-09-08 21:38:26 +00:00
Matthew Simpson	408a3abcfe	[LV] Don't mark pointers used by scalarized memory accesses uniform Previously, all consecutive pointers were marked uniform after vectorization. However, if a consecutive pointer is used by a memory access that is eventually scalarized, the pointer won't remain uniform after all. An example is predicated stores. Even though a predicated store may be consecutive, it will still be scalarized, making its pointer operand non-uniform. This patch updates the logic in collectLoopUniforms to consider the cases where a memory access may be scalarized. If a memory access may be scalarized, its pointer operand is not marked uniform. The determination of whether a given memory instruction will be scalarized or not has been moved into a common function that is used by the vectorizer, cost model, and legality analysis. Differential Revision: https://reviews.llvm.org/D24271 llvm-svn: 280979	2016-09-08 19:11:07 +00:00

33 Commits