llvm-project

Author	SHA1	Message	Date
Min-Yih Hsu	f6315a9572	[AArch64][LoopIdiom] Disable LoopIdiomTransform when NoImplicitFloat is present (#87677) This behavior is aligned with both LoopVectorizer and SLPVectorizer.	2024-04-08 09:10:23 -07:00
paperchalice	29bf32efbb	[NewPM][AArch64] Add AArch64PassRegistry.def (#85215) PR #83567 ports `SelectionDAGISel` to the new pass manager, then each backend should provide `<Target>DagToDagISel()` in new pass manager style. Then each target should provide `<Target>PassRegistry.def` to register backend passes in `registerPassBuilderCallbacks` to reduce duplicate code. This PR adds `AArch64PassRegistry.def` to AArch64 backend and boilerplate code in `registerPassBuilderCallbacks`.	2024-03-21 10:57:51 +08:00
paperchalice	44a81af510	[AArch64] Run LoopSimplifyPass in byte-compare-index.ll (#86053) Make this test case work on both new and legacy pass manager. See also #85215	2024-03-21 10:26:58 +08:00
Nikita Popov	07292b7203	[LIR][SCEVExpander] Restore original flags when aborting transform (#82362) SCEVExpanderCleaner will currently remove instructions created by SCEVExpander, but not restore poison generating flags that it may have dropped. As such, running LIR can currently spuriously drop flags without performing any transforms. Fix this by keeping track of original instruction flags in SCEVExpander. Fixes https://github.com/llvm/llvm-project/issues/82337.	2024-02-21 10:13:41 +01:00
Nikita Popov	fcd6549e58	[LIR] Add test for #82337 (NFC)	2024-02-20 14:42:40 +01:00
Nikita Popov	bec7181d5b	[SCEVExpander] Don't use recursive expansion for ptr IV inc Similar to the non-ptr case, directly create the getelementptr instruction. Going through expandAddToGEP() no longer makes sense with opaque pointers, where generating the necessary instruction is trivial. This avoids recursive expansion of (the SCEV of) StepV while the IR is in an inconsistent state, in particular with an incomplete IV phi node, which utilities may not be prepared to deal with. Fixes https://github.com/llvm/llvm-project/issues/80954.	2024-02-07 11:27:26 +01:00
Nikita Popov	2d69827c5c	[Transforms] Convert tests to opaque pointers (NFC)	2024-02-05 11:57:34 +01:00
paperchalice	e390c229a4	[Pass] Add hyphen to some pass names (#74287) Here is the list of the renamed passes: - `callbrprepare` -> `callbr-prepare` - `dwarfehprepare` -> `dwarf-eh-prepare` - `flattencfg` -> `flatten-cfg` - `loweratomic` -> `lower-atomic` - `lowerinvoke` -> `lower-invoke` - `lowerswitch` -> `lower-switch` - `winehprepare` -> `win-eh-prepare` - `targetir` -> `target-ir` - `targetlibinfo` -> `target-lib-info` Legacy passes are not affected.	2024-01-25 16:05:54 +08:00
David Sherwood	fca6992be1	[AArch64] Fix a minor issue with AArch64LoopIdiomTransform (#78136) I found another case where in the end block we could have a PHI that we deal with incorrectly. The two incoming values are unique - one of them is the induction variable and another one is a value defined outside the loop, e.g. `%final_val = phi i32 [ %inc, %while.body ], [ %d, %while.cond ]`. We won't correctly select between the two values in the new end block that we create and so we will get the wrong result.	2024-01-17 14:30:06 +00:00
David Sherwood	ccaf9e0bc0	[AArch64] Enable AArch64 loop idiom transform pass (#77480) Following on from https://github.com/llvm/llvm-project/pull/72273 which added the new AArch64 loop idiom transformation pass, this patch enables the pass by default for AArch64.	2024-01-10 10:03:14 +00:00
David Sherwood	2c651e6c38	[AArch64] Fix regression introduced by c7148467fc08eefaaae876c7d11d629c849f42cf (#77467)	2024-01-09 13:22:28 +00:00
David Sherwood	c7148467fc	[AArch64] Add an AArch64 pass for loop idiom transformations (#72273) We have added a new pass that looks for loops such as the following: ``` while (i != max_len) if (a[i] != b[i]) break; ... use index i ... ``` Although similar to a memcmp, this is slightly different because instead of returning the difference between the values of the first non-matching pair of bytes, it returns the index of the first mismatch. As such, we are not able to lower this to a memcmp call. The new pass can now spot such idioms and transform them into a specialised predicated loop that gives a significant performance improvement for AArch64. It is intended as a stop-gap solution until this can be handled by the vectoriser, which doesn't currently deal with early exits. This specialised loop makes use of a generic intrinsic that counts the trailing zero elements in a predicate vector. This was added in https://reviews.llvm.org/D159283 and for SVE we end up with brkb & incp instructions. Although we have added this pass only for AArch64, it was written in a generic way so that in theory it could be used by other targets. Currently the pass requires scalable vector support and needs to know the minimum page size for the target, however it's possible to make it work for fixed-width vectors too. Also, the llvm.experimental.cttz.elts intrinsic used by the pass has generic lowering, but can be made efficient for targets with instructions similar to SVE's brkb, cntp and incp. Original version of patch was posted on Phabricator: https://reviews.llvm.org/D158291 Patch co-authored by Kerry McLaughlin (@kmclaughlin-arm) and David Sherwood (@david-arm) See the original discussion on Discourse: https://discourse.llvm.org/t/aarch64-target-specific-loop-idiom-recognition/72383	2024-01-09 11:29:28 +00:00
Yingwei Zheng	2c2de4b20e	[ValueTracking] Remove SPF support from `computeKnownBitsFromOperator` (#76630) This patch removes redundant SPF support (`5350e1b509`) from `computeKnownBitsFromOperator` as we always canonicalize a SPF into an intrinsic call. Compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=3dc0638cfc19e140daff7bf1281648daca8212fa&to=8771ef0749fb2ba4304dc68d418c88ec5769346f&stat=instructions:u \|stage1-O3\|stage1-ReleaseThinLTO\|stage1-ReleaseLTO-g\|stage1-O0-g\|stage2-O3\|stage2-O0-g\|stage2-clang\| \|--\|--\|--\|--\|--\|--\|--\| -0.01%\|-0.01%\|+0.01%\|+0.00%\|+0.01%\|+0.04%\|-0.01%\|	2023-12-31 04:38:18 +08:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
Jeremy Morse	d2d9dc8eb4	[DebugInfo][RemoveDIs] Make debugify pass convert to/from RemoveDIs mode (#73251) Debugify is extremely useful as a testing and debugging tool, and a good number of LLVM-IR transform tests use it. We need it to support "new" non-instruction debug-info to get test coverage, but it's not important enough to completely convert right now (and it'd be a large undertaking). Thus: convert to/from dbg.value/DPValue mode on entry and exit of the pass, which gives us the functionality without any further work. The cost is compile-time, but again this is only happening during tests. Tested by: the large set of debugify tests enabled here. Note the InstCombine test (cast-mul-select.ll) that hasn't been fully enabled: this is because there's a debug-info sinking piece of code there that hasn't been instrumented.	2023-11-29 13:19:50 +00:00
Philip Reames	f8742b8d6a	[SCEV] Teach SCEVExpander to use zext nneg when possible (#70815) zext nneg was recently added to the IR in #67982. Teaching SCEVExpander to emit nneg when possible is valuable since SCEV may have proved non-trivial facts about loop bounds which would otherwise be lost when materializing the value.	2023-10-31 09:33:07 -07:00
Philip Reames	6485978120	Refresh a couple of auto-gen tests [nfc] Reducing spurious diff in an upcoming review.	2023-10-31 07:46:01 -07:00
Nikita Popov	97f1db2fdd	[LoopIdiom] Use tryZExtValue() instead of getZExtValue() To avoid an assertion for large BECounts. I also suspect that this code is missing an overflow check. Fixes https://github.com/llvm/llvm-project/issues/70008.	2023-10-24 11:05:42 +02:00
Jeremy Morse	1ce1732f82	[DebugInfo] Use getStableDebugLoc to pick IRBuilder DebugLocs When IRBuilder is given an insertion position and there is debug-info, it sets the DebugLoc of newly inserted instructions to the DebugLoc of the insertion position. Unfortunately, that means if you insert in front of a debug intrinsic, your "real" instructions get potentially-misleading source locations from the debug intrinsics. Worse, if you compile -gmlt to get source locations but no variable locations, you'll get different source locations to a normal -g build, which is silly. Rectify this with the getStableDebugLoc method, which skips over any debug intrinsics to find the next "real" instruction. This is the source location that you would get if you compile with -gmlt, and it remains stable in the presence of debug intrinsics. The changed tests show a few locations where this has been happening, for example selecting line-zero locations for instrumentation on a perfectly valid call site. Differential Revision: https://reviews.llvm.org/D159485	2023-09-11 19:00:44 +01:00
Nikita Popov	69bd66b3ce	[Tests] Remove some and/or constant expressions in tests (NFC) In preparation for their removal in D158081.	2023-08-21 12:05:32 +02:00
Nikita Popov	174300a283	[LoopIdiom] Regenerate test checks (NFC)	2023-07-21 10:12:05 +02:00
William S. Moses	3eb6fefb97	[LoopIdiom] Preserve alias information for memset_pattern TBAA/NoAlias/AliasScope and other information is currently preserved when upgrading to a memcpy/memset. However, this is missing when upgrading to the macOS memset_pattern function. This adds the same alias information preservation to memset_pattern Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D152934	2023-06-14 16:14:53 -04:00
luxufan	e9ddb584e8	[LoopIdiom] Freeze BitPos if !isGuaranteedNotToBeUndefOrPoison Fixes: https://github.com/llvm/llvm-project/issues/62873 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151690	2023-06-07 14:50:22 +08:00
Nikita Popov	d5c56c5162	[SCEVExpander] Remember phi nodes inserted by LCSSA construction SCEVExpander keeps track of all instructions it inserted. However, it currently misses some phi nodes created during LCSSA construction. Fix this by collecting these into another argument. This also removes the IRBuilder argument, which was added for essentially the same purpose, but only handles the root LCSSA nodes, not those inserted by SSAUpdater. This was reported as a regression on D149344, but the reduced test case also reproduces without it. Differential Revision: https://reviews.llvm.org/D150681	2023-05-25 09:34:19 +02:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
OCHyams	72776850ed	Revert "[DebugInfo] Print empty MDTuples wrapped in MetadataAsValue inline" This reverts commit 1e6fe677f8aa98518e05218affa16e468819f5ed (D140900). Buildbot: https://lab.llvm.org/buildbot/#/builders/196/builds/29937	2023-04-25 14:37:25 +01:00
OCHyams	1e6fe677f8	[DebugInfo] Print empty MDTuples wrapped in MetadataAsValue inline This improves the readability of debugging intrinsics. Instead of: call void @llvm.dbg.value(metadata !2, ...) !2 = !{} We will see: call void @llvm.dbg.value(metadata !{}, ...) !2 = !{} Note that we still get a numbered metadata entry for the node even if it's not used elsewhere. This is to avoid adding more context to the print functions. This is already legal IR - LLVM can parse and understand it - so there is no need to update the parser. The next patches in this stack will make such empty metadata operands more common and semantically important. Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D140900	2023-04-25 14:13:47 +01:00
Craig Topper	8bba57b1f1	[LoopIdiomRecognize] Remove NUW flag from SCEV in getTripCount. Based on the conversation in D147355. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148170	2023-04-13 11:58:10 -07:00
Tim Northover	150595ab4b	LoopIdiom: avoid patterned memset if constant is not relocatable. The pattern we're using for the memset_pattern* call gets put into a static global variable initialized, which means it has to be representable with relocations on the target. Most `ConstantExpr` instances do not satisfy this constraint, so avoid all of them for now.	2023-01-12 18:53:07 +00:00
Nikita Popov	7a752e8108	[LoopIdiom] Convert tests to opaque pointers (NFC) The differences here are due to SCEVExpander producing GEPs with explicit offset calculation, a known difference with opaque pointers.	2023-01-06 11:36:37 +01:00
Nikita Popov	89f1876b61	[LoopIdiom] Name instructions in test (NFC)	2023-01-06 11:07:57 +01:00
Nikita Popov	055fb7795a	[Transforms] Convert some tests to opaque pointers (NFC) These are all tests where conversion worked automatically, and required no manual fixup.	2023-01-05 12:43:45 +01:00
Roman Lebedev	45fcdaf6b6	[NFC] Port all LoopIdiom tests to `-passes=` syntax	2022-12-08 02:38:46 +03:00
Roman Lebedev	48c6b2729e	[NFC] Port all LoopIdiom tests to `-passes=` syntax	2022-12-07 23:15:16 +03:00
Arthur Eubanks	f3a928e233	[opt] Don't translate legacy -analysis flag to require<analysis> Tests relying on this should explicitly use -passes='require<analysis>,foo'.	2022-10-07 14:54:34 -07:00
Simon Pilgrim	37dc4373aa	[LoopIdiom] Add non-LZCNT target test coverage	2022-09-19 18:13:11 +01:00
Simon Pilgrim	6b4d409f69	[CostModel][X86] Add CostKinds handling for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF instructions This was achieved with the 'cost-tables vs llvm-mca' script D103695	2022-09-19 17:37:58 +01:00
Simon Pilgrim	95c2c9c5c5	[LoopIdiom][X86] Add non-LZCNT test coverage to 'rshift until zero' idiom tests	2022-09-16 17:23:54 +01:00
Eli Friedman	abdf0da800	[LoopIdiom] Fix bailout for aliasing in memcpy transform. Commit dd5991cc modified the aliasing checks here to allow transforming a memcpy where the source and destination point into the same object. However, the change accidentally made the code skip the alias check for other operations in the loop. Instead of completely skipping the alias check, just skip the check for whether the memcpy aliases itself. Differential Revision: https://reviews.llvm.org/D126486	2022-05-31 17:24:23 -07:00
Dávid Bolvanský	260679b000	[NFCI] Regenerate LoopIdiomRecognize test checks	2022-04-04 00:21:26 +02:00
Stephen Long	e02f4976ac	[LoopIdiom] Merge TBAA of adjacent stores when creating memset Factor in the TBAA of adjacent stores instead of just the head store when merging stores into a memset. We were seeing GVN remove a load that had a TBAA that matched the 2nd store because GVN determined it didn't match the TBAA of the memset. The memset had the TBAA of only the first store. i.e. Loading the field pi_ of shared_count after memset to create an array of shared_ptr template<class T> class shared_ptr { T p; shared_count refcount; }; class shared_count { sp_counted_base pi_; }; Differential Revision: https://reviews.llvm.org/D122205	2022-03-30 16:54:49 -07:00
Nikita Popov	d9715a7266	[SCEV] Don't try to reuse expressions with offset SCEVs ExprValueMap currently tracks not only which IR Values correspond to a given SCEV expression, but additionally stores that it may be expanded in the form X+Offset. In theory, this allows reusing existing IR Values in more cases. In practice, this doesn't seem to be particularly useful (the test changes are rather underwhelming) and adds a good bit of complexity. Per https://github.com/llvm/llvm-project/issues/53905, we have an invalidation issue with these offseted expressions. Differential Revision: https://reviews.llvm.org/D120311	2022-02-25 09:16:48 +01:00
William S. Moses	8cb9c73609	[LoopIdiom] Keep TBAA when creating memcpy/memmove When upgrading a loop of load/store to a memcpy, the existing pass does not keep existing aliasing information. This patch allows existing aliasing information to be kept. Reviewed By: jeroen.dobbelaere Differential Revision: https://reviews.llvm.org/D108221	2022-01-31 16:28:13 -05:00
Florian Hahn	782c0dd1a1	[IRBuilder] Migrate and-folding to value-based FoldAnd. Similar to the migration of or-folding to FoldOr, there are a few cases where the fold in IRBuilder::CreateAnd triggered directly. Those have been updated. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117431	2022-01-20 10:22:21 +00:00
Alex Bradbury	33d008b169	[RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental Agreed policy is that RISC-V extensions that have not yet been ratified should be marked as experimental, and enabling them requires the use of the -menable-experimental-extensions flag when using clang alongside the version number. These extensions have now been ratified, so this is no longer necessary, and the target feature names can be renamed to no longer be prefixed with "experimental-". Differential Revision: https://reviews.llvm.org/D117131	2022-01-12 19:33:44 +00:00
eopXD	bc17d32a5f	[LoopIdiom] Let LIR fold memset pointer / stride SCEV regarding loop guards Expression guarded in loop entry can be folded prior to comparison. This patch proceeds D107353 and makes LIR able to deal with nested for-loop. Reviewed By: qianzhen, bmahjour Differential Revision: https://reviews.llvm.org/D108112	2021-12-13 09:36:58 -08:00
Roman Lebedev	b291597112	Revert rest of `IRBuilderBase`'s short-circuiting folds Upon further investigation and discussion, this is actually the opposite direction from what we should be taking, and this direction wouldn't solve the motivational problem anyway. Additionally, some more (polly) tests have escaped being updated. So, let's just take a step back here. This reverts commit f3190dedeef9da2109ea57e4cb372f295ff53b88. This reverts commit 749581d21f2b3f53e4fca4eb8728c942d646893b. This reverts commit f3df87d57e096143670e0fd396e81d43393a2dd2. This reverts commit ab1dbcecd6f0969976fafd62af34730436ad5944.	2021-10-28 02:15:14 +03:00
Roman Lebedev	42712698fd	Revert "[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`" Clang OpenMP codegen tests are failing. This reverts commit 288f1f8abe5835180a0021f142043ee261ab3846. This reverts commit cb90e5356ac1594e95fed8e208d6e0e9b6a87db1.	2021-10-27 22:21:37 +03:00
Roman Lebedev	cb90e5356a	[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x` There's precedent for that in `CreateOr()`/`CreateAnd()`. The motivation here is to avoid bloating the run-time check's IR in `SCEVExpander::generateOverflowCheck()`. Refs. https://reviews.llvm.org/D109368#3089809	2021-10-27 21:34:38 +03:00
Roman Lebedev	749581d21f	[IR] `IRBuilderBase::CreateAnd()`: fix short-circuiting for constant on LHS Refs. https://reviews.llvm.org/D109368#3089809	2021-10-27 18:01:06 +03:00

267 Commits