llvm-project

Author	SHA1	Message	Date
Florian Hahn	8f781b96e2	Revert "[VPlan] Mark recurrence recipes as not having side-effects." This reverts commit 02369b75fdd7b5fc5d9b47f1b60587c225918511. At the moment, live-outs used only for the resume values in the scalar loop are not modeled in VPlan yet. This means first-order recurrence recipes could be removed, when a scalar epilogue is required and the only use of a FOR is outside the loop. Keep treating recurrence recipes as having side-effects for now, to avoid them being removed. Fixes #62954.	2023-06-06 11:35:26 +02:00
Florian Hahn	f47084ecfb	[LV] Use force-vector-width for X86 recurrence test. This makes sure that all tests that can be vectorized in the file are vectorized.	2023-06-06 11:27:35 +02:00
Florian Hahn	4c51a45e80	[LV] Add test for #62954.	2023-06-06 11:20:22 +02:00
Florian Hahn	3b912e269a	[LV] Bail out on loop-variant steps when rewriting SCEV exprs. If the step is not loop-invariant, we cannot create a modified AddRec, as the start needs to be loop-invariant. Mark those cases as CannotAnalyze and bail out, to fix a crash.	2023-06-01 16:14:02 +01:00
Florian Hahn	572cfa3fde	[LV] Use SCEV for uniformity analysis across VF This patch uses SCEV to check if a value is uniform across a given VF. The basic idea is to construct SCEVs where the AddRecs of the loop are adjusted to reflect the version in the vectorized loop (Step multiplied by VF). We construct a SCEV for the value of the vector lane 0 (offset 0) and compare it to the expressions for lanes 1 to the last vector lane (VF - 1). If they are equal, consider the expression uniform. While re-writing expressions, we also need to catch expressions for which we cannot determine uniformity (e.g. SCEVUnknown). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D148841	2023-05-31 16:01:00 +01:00
Florian Hahn	8098f2577e	[LV] Use Legal::isUniform to detect uniform pointers. Update collectLoopUniforms to identify uniform pointers using Legal::isUniform. This is more powerful and brings pointer classification here in sync with setCostBasedWideningDecision which uses isUniformMemOp. The existing mismatch in reasoning can cause crashes due to D134460, which is fixed by this patch. Fixes https://github.com/llvm/llvm-project/issues/60831. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150991	2023-05-30 16:42:55 +01:00
Florian Hahn	fcc135a8d6	[LV] Remove dead CHECK lines after 280656eae95a9cbf. Those check lines were left over after adding new run lines in 280656eae95a9cbf.	2023-05-29 19:23:52 +01:00
Florian Hahn	280656eae9	[LV] Add check line with VF=4 to uniformity test. Extend test coverage for D148841.	2023-05-28 20:01:04 +01:00
Nikita Popov	d2502eb091	[KnownBits] Add support for nuw/nsw on shifts Implement precise nuw/nsw support in the KnownBits implementation, replacing the rather crude handling in ValueTracking. Differential Revision: https://reviews.llvm.org/D151208	2023-05-25 10:17:10 +02:00
Florian Hahn	299f0ff60e	[VPlan] Print IR flags for VPRecipeWithIRFlags. Now that IR flags are modeled as part of VPRecipeWithIRFlags, include the flags when printing recipes. Depends on D150027. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150029	2023-05-23 20:36:16 +01:00
Dinar Temirbulatov	1ff828c6c8	[AArch64][LV] Disable maximising bandwidth for streaming compatible sve We noticed some runtime performance improvements by disabling maximising bandwidth for streaming compatible sve. Differential Revision: https://reviews.llvm.org/D150336	2023-05-23 12:58:19 +00:00
David Sherwood	c7dbe326df	[AArch64][LoopVectorize] Enable tail-folding of simple loops on neoverse-v1 This patch enables the tail-folding of simple loops by default when targeting the neoverse-v1 CPU. Simple loops exclude those with recurrences or reductions or loops that are reversed. New tests have been added here: Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll In terms of SPEC2017 only one benchmark is really affected when building with "-Ofast -mcpu=neoverse-v1 -flto", which is (+ faster, - slower): 525.x264: +7.0% Differential Revision: https://reviews.llvm.org/D130618	2023-05-18 10:35:57 +00:00
Florian Hahn	01efcec6db	[LV] Add extra uniformity tests with UDIV and UREM. Extra tests for D148841.	2023-05-18 11:35:17 +01:00
Nikita Popov	745cfa3449	[InstCombine] Compute known bits for multi-use add/sub We were failing to set the known bits for add/sub in the multi-use case, resulting in odd behavioral differences depending on the number of uses. Noticed while adding a consistency assertion. The test changes are essentially a revert to the state before d6498ab. These changes are not really desirable, but if we don't want them, that needs to be handled as part of the heuristic for demanded constant shrinking, not by artificially suppressing the known bits in one specific case.	2023-05-17 17:50:00 +02:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
David Sherwood	7beb2ca8fa	[AArch64][NFC] Refactor the tail-folding option This patch does simple refactoring of the tail-folding option in preparation for enabling tail-folding by default for neoverse-v1. It adds a default tail-folding option field to the AArch64Subtarget class that can be set on a per-CPU. Differential Revision: https://reviews.llvm.org/D149659	2023-05-17 08:39:40 +00:00
Florian Hahn	6c35d423c8	[VPlan] Add tests to print exact and flags on calls (NFC). Adds missing test coverage for D150029.	2023-05-16 21:18:31 +01:00
Douglas Yung	da3c06a482	Revert "[LV] Add test case for #51677." This reverts commit 77df976a1219c0c6fd102358c15e71747aab4443. Test is failing on many build bots including: https://lab.llvm.org/buildbot/#/builders/247/builds/4488 https://lab.llvm.org/buildbot/#/builders/139/builds/40608 https://lab.llvm.org/buildbot/#/builders/216/builds/21169 https://lab.llvm.org/buildbot/#/builders/65/builds/9673 https://lab.llvm.org/buildbot/#/builders/119/builds/13302 https://lab.llvm.org/buildbot/#/builders/121/builds/30459 https://lab.llvm.org/buildbot/#/builders/230/builds/12967 https://lab.llvm.org/buildbot/#/builders/57/builds/26781 https://lab.llvm.org/buildbot/#/builders/214/builds/7458 https://lab.llvm.org/buildbot/#/builders/93/builds/14892 https://lab.llvm.org/buildbot/#/builders/231/builds/11764	2023-05-14 12:22:11 -07:00
Ricky Zhou	77df976a12	[LV] Add test case for #51677.	2023-05-14 16:53:08 +01:00
Florian Hahn	3d4eed0133	[LV] Reuse SCEV expansion results for epilogue vectorization. When generating code for the epilogue vector loop, we need to re-use the expansion results for induction steps generated for the main vector loop, as the pre-header of the epilogue vector loop may not dominate the vector preheader of the epilogue. This fixes a reported crash. Note that this is a workaround which should be removed soon once induction resume value creation is handled in VPlan directly.	2023-05-11 22:00:07 +01:00
Florian Hahn	236a0e82df	[LV] Use VPValue to get expanded value for SCEV step expressions. Update skeleton creation logic to use SCEV expansion results from expanding the pre-header. This avoids another set of SCEV expansions that may happen after the CFG has been modified. Fixes #58811. Depends on D147964. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147965	2023-05-11 16:49:19 +01:00
Hongtao Yu	9272d0f079	[PseudoProbe] Clean up dwarf discriminator and avoid duplicating factor. A pseudo probe is created with dwarf line information shared with its nearest instruction. If the instruction comes with a dwarf discriminator, it will be shared with the probe as well. This can confuse the later FS-AFDO discriminator assignment pass. To fix this, I'm cleaning up the discriminator fields for probes when they are inserted. I also notice another possibility to change the discriminator field of pseudo probes in the pipeline before the FS discriminator assignment pass. That is the loop unroller, which assigns duplication factor to instruction being vectorized. I'm disabling that for pseudo probe intrinsics specifically, also for callsites with probes. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D148569	2023-05-10 11:26:23 -07:00
Florian Hahn	faa8f582b9	[VPlan] Add printing test with fast-math flags. Add missing test coverage for D150029.	2023-05-09 22:43:03 +01:00
Noah Goldstein	7770b0abfd	[KnownBits] Improve `KnownBits::rem(X, Y)` in cases where we can deduce low-bits of output The first `cttz(Y)` bits in `X` are translated 1-1 in the output. Alive2 Links: https://alive2.llvm.org/ce/z/Qc47p7 https://alive2.llvm.org/ce/z/19ut5H Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D149421	2023-05-07 19:11:53 -05:00
Florian Hahn	e3afe0b89d	[VPlan] Add VPWidenCastRecipe, split off from VPWidenRecipe (NFCI). To generate cast instructions, the result type is needed. To allow creating widened casts without underlying instruction, introduce a new VPWidenCastRecipe that also holds the result type. This functionality will be used in a follow-up patch to implement truncateToMinimalBitwidths as VPlan-to-VPlan transform. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149081	2023-05-05 13:20:16 +01:00
Florian Hahn	b85a402dd8	[VPlan] Introduce new entry block to VPlan for early SCEV expansion. This patch adds a new preheader block to the VPlan to place SCEV expansions like the trip count. This preheader block is disconnected at the moment, as the bypass blocks of the skeleton are not yet modeled in VPlan. The preheader block is executed before skeleton creation, so the SCEV expansion results can be used during skeleton creation. At the moment, the trip count expression and induction steps are expanded in the new preheader. The remainder of SCEV expansions will be moved gradually in the future. D147965 will update skeleton creation to use the steps expanded in the pre-header to fix #58811. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147964	2023-05-04 14:00:13 +01:00
Florian Hahn	79692750d2	[LV] Use VPValue for SCEV expansion in fixupIVUsers. The step is already expanded in the VPlan. Use this expansion instead. This is a step towards modeling fixing up IV users in VPlan. It also fixes a crash caused by SCEV-expanding the Step expression in fixupIVUsers, where the IR is in an incomplete state. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147963	2023-05-04 09:25:59 +01:00
Philip Reames	cb3cb417a0	[LV] Refresh some auto-gen tests to reduce diff [nfc]	2023-05-01 13:55:11 -07:00
Philip Reames	30cdb2ac7e	[LAA] Add command line flag to disable unit stride speculation This is purely so that we can expose and work through downstream codegen issues. My intention is to see if we can get this disabled by default, but that requires fixing a bunch of downstream issues first.	2023-05-01 10:49:51 -07:00
Yingwei Zheng	6d667d4b26	[InstCombine] Combine const GEP chains This patch reverts rGae739aefd7473517d3f08b5c8d08a66c7f469198 to address performance regressions reported by our [CI](https://github.com/dtcxzyw/llvm-ci/issues/137) after rG2ec1d0f427c7822540352c0c14d057e7bfe4f77b. For example: ``` define ptr @const_gep_chain(ptr %p, i64 %a) { %p1 = getelementptr inbounds i8, ptr %p, i64 %a %p2 = getelementptr inbounds i8, ptr %p1, i64 1 %p3 = getelementptr inbounds i8, ptr %p2, i64 2 %p4 = getelementptr inbounds i8, ptr %p3, i64 3 ret ptr %p4 } ``` The last three GEPs will not be folded since rG2ec1d0f427c7822540352c0c14d057e7bfe4f77b. I think it is appropriate to remove this code because there is no compile-time regression reported in our benchmarks. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149240	2023-05-02 00:28:39 +08:00
Noah Goldstein	d840391401	[ValueTracking] Add logic for `isKnownNonZero(smin/smax X, Y)` For `smin` if either `X` or `Y` is negative, the result is non-zero. For `smax` if either `X` or `Y` is strictly positive, the result is non-zero. For both if `X != 0` and `Y != 0` the result is non-zero. Alive2 Link: https://alive2.llvm.org/ce/z/7yvbgN https://alive2.llvm.org/ce/z/zizbvq Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149417	2023-04-30 10:06:46 -05:00
Noah Goldstein	883daa7ac4	[ValueTracking] Add logic for `isKnownNonZero(umax X, Y)` `(umax X, Y) != 0` -> `X != 0 || Y != 0` Alive2 Link: https://alive2.llvm.org/ce/z/_Z9AUT Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149415	2023-04-30 10:06:46 -05:00
Mel Chen	de01dba7f2	[LV] Add tests for integer min max with index reduction pattern. (NFC) The test cases for signed max with index include strict and non-strict max. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D146718	2023-04-28 05:13:23 -07:00
Florian Hahn	07e5f57df4	[LV] Add tests for #60831. Also contains an extra test mentioned in D144434.	2023-04-28 10:42:01 +01:00
ManuelJBrito	8b56da5e9f	[IR] Change shufflevector undef mask to poison With this patch an undefined mask in a shufflevector will be printed as poison. This change is done to support the new shufflevector semantics for undefined mask elements. Differential Revision: https://reviews.llvm.org/D149210	2023-04-27 14:41:10 +01:00
Florian Hahn	883eb88cae	[LV] Add extra uniformity tests with LSHR and AND. Extra tests for D148841 based on the tests added in 95539186c82604f783.	2023-04-26 19:51:35 +01:00
Nikita Popov	1745341296	[LoopVectorize] Preserve SCEV As far as I can tell, LoopVectorize preserves SCEV, mainly by dint of forgetting the loop being vectorized. We should mark it as preserved in the pass manager. This is a very small compile-time improvement. Differential Revision: https://reviews.llvm.org/D149147	2023-04-26 09:43:54 +02:00
Vasileios Porpodas	95539186c8	[LV][NFC] Precommit test for a follow-up patch that introduces uniformity for a specific VF. Differential Revision: https://reviews.llvm.org/D147734	2023-04-25 14:12:22 -07:00
Paul Osmialowski	9cf1881f8f	[SCEV] Do not plant SCEV checks unnecessarily The vectorisation analysis collects strides for loop invariant pointers, which is wrong because they are not strided. We don't need to generate SCEV checks (which are costly performancewise) for such pointers, we just need to do the appropriate aliasing checks. This patch fixes the problem by changing getStrideFromPointer() to treat loop invariant pointers as having no stride. Originally proposed by David Sherwood with further suggestions from Florian Hahn. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D146958	2023-04-25 21:47:14 +01:00
Paul Osmialowski	20d0f80dd3	[test] A test case for D146958 This commit introduces a test for the change introduced by the `[SCEV] Do not plant SCEV checks unnecessarily` commit, D146958. Reviewed By: fhahn, david-arm Differential Revision: https://reviews.llvm.org/D146974	2023-04-25 21:36:20 +01:00
Nikita Popov	d9dcd52d16	[LoopVectorize] Convert test to opaque pointers (NFC)	2023-04-25 15:28:01 +02:00
David Green	1869a9c225	[LV] Use the known trip count when costing non-tail folded VFs Now that we store the ScalarCost in the VectorizationFactor it is possible to use it to get a slightly more accurate cost in isMoreProfitable between two vector factors. This extends the logic added in D101726 to non-tail-folded cases, using the costs of `VecCost * (TripCount / VF) + ScalarCost * (TripCount % VF)` to compare VFs where the TripCount is known and we are not folding the tail. This shouldn't alter very much as small trip counts are usually not vectorized, but does seem to help in the testcase where 4 * VF4 is chosen as profitable compared to 2 * VF8 + 4 * scalar. Differential Revision: https://reviews.llvm.org/D147720	2023-04-24 22:02:30 +01:00
Jay Foad	593e25ffae	[Vectorize] Fix vectorization, scalarization and folding of llvm.is.fpclass llvm.is.fpclass is different from other vectorizable intrinsics in that it is overloaded on an argument type, not on the return type. Differential Revision: https://reviews.llvm.org/D148905	2023-04-24 13:42:08 +01:00
Jay Foad	3237497d01	[Vectorize] Pre-commit tests for D148905 Differential Revision: https://reviews.llvm.org/D149050	2023-04-24 13:42:08 +01:00
Nikita Popov	d003c01c30	[LV][IndVars] Move test to correct directory and regenerate (NFC) For some reason, an IndVarSimplify test was in the LoopVectorize directory.	2023-04-21 18:03:41 +02:00
Florian Hahn	6b8d19d2b5	Recommit "[VPlan] Switch to checking sinking legality for recurrences in VPlan." This reverts the revert commit 3d8ed8b5192a59104bfbd5bf7ac84d035ee0a4a5. The new version of the patch adds a set to avoid duplicating work in isFixedOrderRecurrence, which was previously done through the removed SinkAfter map. Original commit message: Building on D142885 and D142589, retire the SinkAfter map from the recurrence handling code. It is replaced by checking whether it is possible to sink all users of a recurrence directly in VPlan. This results in simpler code overall and allows to handle additional cases (see the improvements in @test_crash). Depends on D142885. Depends on D142589. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142886	2023-04-20 09:31:16 +01:00
Nikita Popov	2ec1d0f427	[InstCombine] Don't reassociate GEPs for loop invariance Since D146813, LICM will reassociate GEPs to expose hoisting opportunities itself. Don't perform this transform in InstCombine, where it is fragile because it depends on an optional LoopInfo analysis.	2023-04-18 12:17:07 +02:00
Nikita Popov	39ca70392b	[LV] Regenerate test checks (NFC)	2023-04-18 11:52:36 +02:00
Graham Hunter	d8c49d2ac9	[LV][AArch64] Autogenerate checks for scalable-strict-fadd.ll (NFC) Precommit for D145163.	2023-04-18 10:25:05 +01:00
Manoj Gupta	3d8ed8b519	Revert "[VPlan] Switch to checking sinking legality for recurrences in VPlan." This reverts commit 7fc0b3049df532fce726d1ff6869a9f6e3183780. Causes a clang hang when building xz utils, github issue #62187.	2023-04-17 12:19:36 -07:00
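
The cost comparison described in 1869a9c225 can be illustrated with a small sketch. This is not the LoopVectorize implementation: the function name and all cost numbers below are hypothetical, chosen only to show how the formula `VecCost * (TripCount / VF) + ScalarCost * (TripCount % VF)` from the commit message behaves when the trip count is known and the tail is not folded.

```python
def known_trip_count_cost(vec_cost, scalar_cost, trip_count, vf):
    """Total cost of a loop with a known trip count and no tail folding.

    Hypothetical helper mirroring the formula quoted in the commit message:
    full vector iterations are charged vec_cost each, and the remaining
    iterations run in the scalar loop at scalar_cost each.
    """
    vector_iterations = trip_count // vf   # TripCount / VF
    scalar_iterations = trip_count % vf    # TripCount % VF
    return vec_cost * vector_iterations + scalar_cost * scalar_iterations


# Assumed, made-up costs: VF=8 is cheaper per lane (18/8) than VF=4 (10/4),
# but with a trip count of 20 it leaves 4 iterations for the scalar loop.
TRIP_COUNT = 20
SCALAR_COST = 4
cost_vf4 = known_trip_count_cost(vec_cost=10, scalar_cost=SCALAR_COST,
                                 trip_count=TRIP_COUNT, vf=4)
cost_vf8 = known_trip_count_cost(vec_cost=18, scalar_cost=SCALAR_COST,
                                 trip_count=TRIP_COUNT, vf=8)

print(f"VF=4 total cost: {cost_vf4}")
print(f"VF=8 total cost: {cost_vf8}")
```

With these assumed numbers the narrower factor wins, because the wider one pays for a scalar remainder. A per-lane cost comparison alone would not capture that.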

2078 Commits