llvm-project

Author	SHA1	Message	Date
Roman Lebedev	1578c670ff	[NFC][SROA] Variably-indexed load: add test variation w/ upper half of alloca being zeros This is the actual pattern i'm looking at.	2022-12-23 20:16:41 +03:00
Roman Lebedev	85bb24ab52	[NFC][SROA] Rewrite widen-load-of-small-alloca tests to just store result, not call some function	2022-12-23 02:43:22 +03:00
Roman Lebedev	967ba1a86d	[NFC][SROA] More tests for variable indexed promotion	2022-12-22 03:12:53 +03:00
Roman Lebedev	4d255f9e33	[NFC][SROA] More tests for variable indexed promotion	2022-12-22 01:36:42 +03:00
Roman Lebedev	eb7c515d66	[NFC][SROA] More tests for promotion with variable index Also, delete the InstCombine test, it's not going to be relevant.	2022-12-22 01:08:50 +03:00
Roman Lebedev	cd7428cac7	[NFC][SROA] Add tests for alloca promotion in presence of variably-indexed load	2022-12-21 23:16:50 +03:00
Jeremy Morse	2b1d45b227	[NFC] Add --check-globals to an autogen test cmdline In c6d7e80ec4c17 this test was converted from hand written to autogenerated, during which the relevant metadata CHECKs were dropped. In D85172 the intention of the CHECK lines is to ensure that for two dbg.declares with different inlining scopes, attached to the same alloca, two sets of dbg.values will be generated with the same set of inlining scopes. Without metadata checks, a single DILocation can match the !dbg CHECKs.	2022-12-21 16:49:05 +00:00
Joshua Cranmer	e6b02214c6	[IR] Add a target extension type to LLVM. Target-extension types represent types that need to be preserved through optimization, but otherwise are not introspectable by target-independent optimizations. This patch doesn't add any uses of these types by an existing backend, it only provides basic infrastructure such that these types would work correctly. Reviewed By: nikic, barannikov88 Differential Revision: https://reviews.llvm.org/D135202	2022-12-20 11:02:11 -05:00
Roman Lebedev	96d3c82645	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 3)" While the PPC little-endian miscompile did get addressed by https://reviews.llvm.org/D140046 the PPC big-endian bots are still unhappy. https://lab.llvm.org/buildbot/#/builders/93/builds/12560 This reverts commit 7bd358bcb4e358b4351c69e02ef76939e08acdc7.	2022-12-16 22:58:41 +03:00
Roman Lebedev	cfd594f8bb	[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 3) * This is a recommit of 3c4d2a03968ccf5889bacffe02d6fa2443b0260f, * which was reverted in 25f01d593ce296078f57e872778b77d074ae5888, because it exposed a miscompile in PPC backend, which was resolved in https://reviews.llvm.org/D140089 / cb3f415cd2019df7d14683842198bc4b7a492bc5. * which was a recommit of cf624b23bc5d5a6161706d1663def49380ff816a, * which was reverted in 5cfc22cafe3f2465e0bb324f8daba82ffcabd0df, because the cut-off on the number of vector elements was not low enough, and it triggered both SDAG SDNode operand number assertions, and caused compile time explosions in some cases. Let's try with something really REALLY conservative first, just to get somewhere, and try to bump it later. FIXME: should this respect TTI reg width * num vec regs? Original commit message: Now, there's a big caveat here - these bytes are abstract bytes, not the i8 we have in LLVM, so strictly speaking this is not exactly legal, see e.g. https://github.com/AliveToolkit/alive2/issues/860 ^ the "bytes" "could" have been a pointer, and loading it as an integer inserts an implicit ptrtoint. But at the same time, InstCombine's `InstCombinerImpl::SimplifyAnyMemTransfer()` would expand a memtransfer of 1/2/4/8 bytes into integer-typed load+store, so this isn't exactly a new problem. Note that in memory, poison is byte-wise, so we really can't widen elements, but SROA seems to be inconsistent here. Fixes #59116.	2022-12-16 19:27:38 +03:00
Roman Lebedev	89a6106ce5	[SROA] Rewrite store-into-selected-address into predicated stores Same basic idea as with unfolding loads into predicated loads, but we obviously can't have speculative stores.	2022-12-10 21:07:03 +03:00
Roman Lebedev	6bd3a02e2d	[NFC][SROA] Add tests with store-into-select-of-addrs	2022-12-10 21:07:03 +03:00
Roman Lebedev	4f7e5d2206	[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node, take 2 Currently, SROA is CFG-preserving. Not doing so does not affect any pipeline test. (???) Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call. By design, we can't really SROA alloca if their address escapes somehow, but we have logic to deal with `load` of `select`/`PHI`, where at least one of the possible addresses prevents promotion, by speculating the `load`s and `select`ing between loaded values. As one would expect, that requires ensuring that the speculation is actually legal. Even ignoring complexity bailouts, that logic does not deal with everything, e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`. There can also be cases where the load is genuinely non-speculatable. So if we can't prove that the load can be speculated, unfold the select, produce two-entry phi node, and perform predicated load. Now, that transformation must obviously update Dominator Tree, since we require it later on. Doing so is trivial. Additionally, we don't want to do this for the final SROA invocation (D136806). In the end, this ends up having negative (!) compile-time cost: https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u Though indeed, this only deals with `select`s, `PHI`s are still using speculation. Should we update some more analysis? Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138238 This reverts commit 739611870d3b06605afe25cc07833f6a62de9545, and recommits 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5 with a fixed assertion - we should check that DTU is there, not just assert false...	2022-12-08 20:19:55 +03:00
Roman Lebedev	739611870d	Revert "[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node" The assertion about not modifying the CFG seems to not hold, will recommit in a bit. https://lab.llvm.org/buildbot#builders/139/builds/32412 This reverts commit 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5. This reverts commit 4f90f4ada33718f9025d0870a4fe3fe88276b3da.	2022-12-08 19:51:15 +03:00
Roman Lebedev	03e6d9d9d1	[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node Currently, SROA is CFG-preserving. Not doing so does not affect any pipeline test. (???) Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call. By design, we can't really SROA alloca if their address escapes somehow, but we have logic to deal with `load` of `select`/`PHI`, where at least one of the possible addresses prevents promotion, by speculating the `load`s and `select`ing between loaded values. As one would expect, that requires ensuring that the speculation is actually legal. Even ignoring complexity bailouts, that logic does not deal with everything, e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`. There can also be cases where the load is genuinely non-speculatable. So if we can't prove that the load can be speculated, unfold the select, produce two-entry phi node, and perform predicated load. Now, that transformation must obviously update Dominator Tree, since we require it later on. Doing so is trivial. Additionally, we don't want to do this for the final SROA invocation (D136806). In the end, this ends up having negative (!) compile-time cost: https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u Though indeed, this only deals with `select`s, `PHI`s are still using speculation. Should we update some more analysis? Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138238	2022-12-08 16:51:32 +03:00
Bjorn Pettersson	3528e63d89	[test] Remove duplicate RUN lines in Transform tests	2022-12-08 11:47:16 +01:00
Roman Lebedev	c857c49c0c	[NFC] Port all SROA tests to `-passes=` syntax	2022-12-08 02:38:50 +03:00
Matt Arsenault	27387896cf	SROA: Simplify addrspacecasted allocas with volatile accesses If the alloca is accessed through an addrspacecasted pointer, allow the normal changes on the alloca. Cast back to the original use address space instead of the new alloca's natural address space.	2022-12-02 15:20:56 -05:00
Roman Lebedev	b731e5d1db	[NFC][SROA] A few more tests for D138238	2022-12-01 23:17:14 +03:00
Roman Lebedev	c6d7e80ec4	[NFC][SROA] Ensure that all check lines in SROA tests are autogenerated	2022-12-01 01:18:23 +03:00
Bjorn Pettersson	0676acb6fd	[test] Switch to use -passes syntax in a bunch of test cases Should cover most of the tests for GVN, GVNHoist, GVNSink, GlobalOpt, GlobalSplit, InstCombine, Reassociate, SROA and TailCallElim that had not been updated earlier.	2022-11-29 13:29:02 +01:00
Roman Lebedev	25f01d593c	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 2)" TableGen is still getting miscompiled on PPC buildbots. Sent a mail with request for help. This reverts commit 3c4d2a03968ccf5889bacffe02d6fa2443b0260f.	2022-11-27 00:00:06 +03:00
Roman Lebedev	3c4d2a0396	[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 2) This is a recommit of cf624b23bc5d5a6161706d1663def49380ff816a, which was reverted in 5cfc22cafe3f2465e0bb324f8daba82ffcabd0df, because the cut-off on the number of vector elements was not low enough, and it triggered both SDAG SDNode operand number assertions, and caused compile time explosions in some cases. Let's try with something really REALLY conservative first, just to get somewhere, and try to bump it (to 64/128) later. FIXME: should this respect TTI reg width * num vec regs? Original commit message: Now, there's a big caveat here - these bytes are abstract bytes, not the i8 we have in LLVM, so strictly speaking this is not exactly legal, see e.g. https://github.com/AliveToolkit/alive2/issues/860 ^ the "bytes" "could" have been a pointer, and loading it as an integer inserts an implicit ptrtoint. But at the same time, InstCombine's `InstCombinerImpl::SimplifyAnyMemTransfer()` would expand a memtransfer of 1/2/4/8 bytes into integer-typed load+store, so this isn't exactly a new problem. Note that in memory, poison is byte-wise, so we really can't widen elements, but SROA seems to be inconsistent here. Fixes #59116.	2022-11-26 23:19:15 +03:00
Benjamin Kramer	5cfc22cafe	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes" This reverts commit cf624b23bc5d5a6161706d1663def49380ff816a. It triggers crashes in clang, see the comments on github on the original change.	2022-11-23 13:11:16 +01:00
Roman Lebedev	655d857325	[SROA] `isVectorPromotionViable()`: avoid allowing overly large vectors Otherwise, `compiler-rt/test/asan/TestCases/pr33372.cpp` fails with an assertion: ``` clang-16: /repositories/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:11988: void llvm::SelectionDAG::createOperands(llvm::SDNode *, ArrayRef<llvm::SDValue>): Assertion `SDNode::getMaxNumOperands() >= Vals.size() && "too many operands to fit into SDNode"' failed. ``` I'm not sure if this should be even more conservative, or if we have a named constant for this in middle-end.	2022-11-23 03:23:08 +03:00
Roman Lebedev	cf624b23bc	[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes Now, there's a big caveat here - these bytes are abstract bytes, not the i8 we have in LLVM, so strictly speaking this is not exactly legal, see e.g. https://github.com/AliveToolkit/alive2/issues/860 ^ the "bytes" "could" have been a pointer, and loading it as an integer inserts an implicit ptrtoint. But at the same time, InstCombine's `InstCombinerImpl::SimplifyAnyMemTransfer()` would expand a memtransfer of 1/2/4/8 bytes into integer-typed load+store, so this isn't exactly a new problem. Note that in memory, poison is byte-wise, so we really can't widen elements, but SROA seems to be inconsistent here. Fixes #59116.	2022-11-23 02:38:25 +03:00
Roman Lebedev	6a77477d53	[NFC][SROA] Autogenerate check lines in some tests being affected by upcoming change	2022-11-23 02:38:25 +03:00
Roman Lebedev	13e605051f	[NFC][SROA] Add some more tests for vector promotion	2022-11-23 02:38:25 +03:00
Roman Lebedev	529eafd9be	[SROA] `isVectorPromotionViable()`: integer-ify non-pointer non-common types This rectifies a FIXME that dates all the way back to 2014 about not doing so due to the backend issues. Presumably sufficient amount of time has passed and all the known issues have been addressed, or at least we will find out if there are some left...	2022-11-23 00:23:00 +03:00
Roman Lebedev	4e18d51ac5	[SROA] `isVectorPromotionViable()`: pointer-ness is sticky As it has been established previously by precedent, if we see a pointer type, then that is the type we must use. Essentially, we don't want to introduce `inttoptr`'s.	2022-11-23 00:23:00 +03:00
Roman Lebedev	950f248630	[NFC][SROA] Add more tests with non-speculatable `load`s of `select`s	2022-11-18 00:41:30 +03:00
Roman Lebedev	be1f994311	[Analysis] `isSafeToLoadUnconditionally()`: `lifetime` intrinsics can be ignored In practice this means that we can speculate more loads in SROA. This e.g. comes up in https://godbolt.org/z/G8716s6sj, although we are missing second half of the puzzle to optimize that.	2022-11-17 20:48:27 +03:00
Roman Lebedev	6a3561d2d3	[NFC][SROA] Add test for select speculation failures	2022-11-17 20:48:27 +03:00
Arthur Eubanks	6219ec07c6	[SROA] Don't speculate phis with different load user types Fixes an SROA crash. Fallout from opaque pointers since with typed pointers we'd bail out at the bitcast. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D136119	2022-10-18 08:44:13 -07:00
Nikita Popov	2813bc5d24	[SROA] Regenerate test checks (NFC)	2022-10-05 10:31:03 +02:00
Douglas Yung	91e0423595	Revert "[SROA] Create additional vector type candidates based on store and load slices" This reverts commit de3445e0ef15c420955ad720fccf08473f460443. This is causing GHI #57796 and #57821.	2022-09-23 12:24:07 -07:00
Douglas Yung	0a7f4e03a9	Revert "[SROA] Check typeSizeEqualsStoreSize in isVectorPromotionViable" This reverts commit 3f08d248c44c744deda38423409b86720822739e. The commit this change is fixing is being reverted due to GHI #57796 and #57821, so revert this commit as well.	2022-09-23 12:24:07 -07:00
Bjorn Pettersson	3f08d248c4	[SROA] Check typeSizeEqualsStoreSize in isVectorPromotionViable Commit de3445e0ef15c4209 (https://reviews.llvm.org/D132096) made changes to isVectorPromotionViable basically doing // Create Vector with size of V, and each element of type Ty ... uint64_t ElementSize = DL.getTypeStoreSizeInBits(Ty).getFixedSize(); uint64_t VectorSize = DL.getTypeSizeInBits(V).getFixedSize(); ... VectorType *VTy = VectorType::get(Ty, VectorSize / ElementSize, false); Not quite sure why it uses the TypeStoreSize for the ElementSize, but the new vector would only match in size with the old vector in situations when the TypeStoreSize equals the TypeSize for Ty. Therefore this patch adds a typeSizeEqualsStoreSize check as yet another condition for allowing the new type as a promotion candidate. Without this fix the new @test15 test would fail with an assert like this: opt: ../lib/Transforms/Scalar/SROA.cpp:1966: auto isVectorPromotionViable(llvm::sroa::Partition &, const llvm::DataLayout &) ::(anonymous class)::operator()(llvm::VectorType *, llvm::VectorType *) const: Assertion `DL.getTypeSizeInBits(RHSTy).getFixedSize() == DL.getTypeSizeInBits(LHSTy).getFixedSize() && "Cannot have vector types of different sizes!"' failed. ... #8 isVectorPromotionViable(...)::$_10::operator()... #9 llvm::SROAPass::rewritePartition(...) #10 llvm::SROAPass::splitAlloca(...) #11 llvm::SROAPass::runOnAlloca(...) #12 llvm::SROAPass::runImpl(...) #13 llvm::SROAPass::run(...) Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D134032	2022-09-21 09:45:05 +02:00
A-Wadhwani	de3445e0ef	[SROA] Create additional vector type candidates based on store and load slices This patch adds additional vector types to be considered when doing promotion in SROA, based on the types of the store and load slices. This provides more promotion opportunities, by potentially using an optimal "intermediate" vector type. For example, the following code would currently not be promoted to a vector, since `__m128i` is a `<2 x i64>` vector. ``` __m128i packfoo0(int a, int b, int c, int d) { int r[4] = {a, b, c, d}; __m128i rm; std::memcpy(&rm, r, sizeof(rm)); return rm; } ``` ``` packfoo0(int, int, int, int): mov dword ptr [rsp - 24], edi mov dword ptr [rsp - 20], esi mov dword ptr [rsp - 16], edx mov dword ptr [rsp - 12], ecx movaps xmm0, xmmword ptr [rsp - 24] ret ``` By also considering the types of the elements, we could find that the `<4 x i32>` type would be valid for promotion, hence removing the memory accesses for this function. In other words, we can explore other new vector types, with the same size but different element types based on the load and store instructions from the Slices, which can provide us more promotion opportunities. Additionally, the step for removing duplicate elements from the `CandidateTys` vector was not using an equality comparator, which has been fixed. Differential Revision: https://reviews.llvm.org/D132096	2022-09-12 09:55:37 -07:00
Vang Thao	257251247a	[SROA] Try harder to find a vector promotion viable type when rewriting We are seeing significant performance loss when an alloca fails to get promoted to register. I have observed that this is due to the common type found when attempting to rewrite partition users being unviable for promotion, whereas if we had continued looking for a type, we would have found a subtype in the original allocated type that would have enabled promotion. Thus first check if the initial common type found is promotion viable and if not then continue looking instead of stopping with the initial common type found. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D128073	2022-08-08 11:04:01 -07:00
Nikita Popov	60a32157a5	[Tests] Remove unnecessary bitcasts from opaque pointer tests (NFC) Previously left these behind due to the required instruction renumbering, drop them now. This more accurately represents opaque pointer input IR. Also drop duplicate opaque pointer check lines in one SROA test.	2022-06-22 14:15:46 +02:00
Nikita Popov	74e652786b	[SROA] Migrate tests to opaque pointers (NFC) Tests were updated with this script: https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34 However, in this case a lot of fixup was required, due to many minor, but ultimately immaterial differences in results. In particular, the GEP representation changes slightly in many cases, either because we now use an i8 GEP, or because we now leave a GEP alone, using its original index types and (lack of) inbounds. basictest-opaque-ptrs.ll has been dropped, because it was an opaque pointers duplicate of basictest.ll.	2022-06-21 12:54:52 +02:00
Nikita Popov	ab088de873	[SROA] Regenerate test checks (NFC)	2022-06-21 12:24:11 +02:00
Nuno Lopes	5a132499fb	[NFC] Remove straight UB from SROA tests Including 'br undef', store/load to undef pointers. Plus some cosmetics: select undef, insertvalue undef -> poison. Recommit c1b6103 with fix.	2022-06-13 08:59:07 +01:00
Kazushi (Jam) Marukawa	a43c55dcd7	Revert "[NFC] Remove 'br i1 undef' from SROA tests" Transforms/SROA/vector-promotion-different-size.ll causes errors. This reverts commit c1b610307df22d12687bde26919e45752c33ab0b.	2022-06-13 12:32:25 +09:00
Nuno Lopes	c1b610307d	[NFC] Remove 'br i1 undef' from SROA tests	2022-06-12 15:29:59 +01:00
Dmitry Vassiliev	7759680e2f	[SROA] Avoid postponing rewriting load/store by ignoring lifetime intrinsics in partition's promotability checking This patch fixes a bug that generates unnecessary packing/unpacking structure code because of incorrect handling of lifetime intrinsics. For example, a partition of an alloca may contain many slices: ``` Partition [0, 4): Slice0: [0, 4) used by: load i32 addr; Slice1: [0, 4) used by: store i32 v, addr; Slice2: [0, 16) used by lifetime.start(16, addr); ``` When SROA determines if the partition can be promoted, lifetime.start is currently treated as a whole alloca load/store, so Slice0 and Slice1 cannot be promoted at this attempt, but the packing/unpacking code for Slice0 and Slice1 has been generated. After the lifetime.start/end intrinsics are rewritten, SROA tries again with Slice0 and Slice1 and finally promotes them, but the redundant packing/unpacking code remains in the IR. This patch changes promotability checking to ignore lifetime intrinsics (they will be rewritten to correct sizes later), so we can promote the real users (load/store) at the first attempt with optimal code. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D124967	2022-05-17 11:25:59 +02:00
Dmitry Vassiliev	9983b978f7	[SROA] Precommit test for D124967	2022-05-17 11:23:31 +02:00
David Green	9727c77d58	[NFC] Rename Instrinsic to Intrinsic	2022-04-25 18:13:23 +01:00
Dávid Bolvanský	872f7000fc	Revert "[NFCI] Regenerate SROA/LoopVectorize test checks" This reverts commit 14e3450fb57305aa9ff3e9e60687b458e43835c9.	2022-04-04 01:15:30 +02:00


292 Commits