llvm-project

Author	SHA1	Message	Date
Nikita Popov	61e0822efa	[llvm][clang] Remove uses of isOpaquePointerTy() (NFC) This now always returns true (for pointer types).	2023-07-14 10:27:58 +02:00
eopXD	c8eb535aed	[1/11][IR] Permit load/store/alloca for struct of the same scalable vector type This patch-set aims to simplify the existing RVV segment load/store intrinsics to use a type that represents a tuple of vectors instead. To achieve this, first we need to relax the current limitation for an aggregate type to be a target of load/store/alloca when the aggregate type contains homogeneous scalable vector types. Then to adjust the prolog of an LLVM function during lowering to clang. Finally we re-define the RVV segment load/store intrinsics to use the tuple types. The pull request under the RVV intrinsic specification is riscv-non-isa/rvv-intrinsic-doc#198 --- This is the 1st patch of the patch-set. This patch originated from D98169. This patch allows aggregate type (StructType) that contains homogeneous scalable vector types to be a target of load/store/alloca. The RFC of this patch was posted in LLVM Discourse. https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527 The main changes in this patch are: Extend `StructLayout::StructSize` from `uint64_t` to `TypeSize` to accommodate an expression of scalable size. Allow `StructType::isSized` to also return true for homogeneous scalable vector types. Let `Type::isScalableTy` return true when `Type` is `StructType` and contains scalable vectors. Extra description is added in the LLVM Language Reference Manual on the relaxation of this patch. Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Co-Authored-by: eop Chen <eop.chen@sifive.com> Reviewed By: craig.topper, nikic Differential Revision: https://reviews.llvm.org/D146872	2023-05-19 09:39:36 -07:00
Nikita Popov	53500e333d	Reapply [SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating This exposed another miscompile in GVN, which was fixed by 20e9b31f88149a1d5ef78c0be50051e345098e41. ----- After D141386, violation of nonnull, range and align metadata results in poison rather than immediate undefined behavior, which means that these are now safe to retain when speculating. We only need to remove UB-implying metadata like noundef. This is done by adding a dropUBImplyingAttrsAndMetadata() helper, which lists the metadata which is known safe to retain on speculation. Differential Revision: https://reviews.llvm.org/D146629	2023-04-20 14:17:15 +02:00
OCHyams	571eaead17	Reapply "[Assignment Tracking] Fix fragment error for some DSE-shortened stores" This reverts commit 6db6ab4815a44bfcaabfcdd84a0ff458394f6f52 which reverts D148536. Build issues addressed in D148698.	2023-04-19 13:36:47 +01:00
OCHyams	6db6ab4815	Revert "[Assignment Tracking] Fix fragment error for some DSE-shortened stores" This reverts commit fca3e8e024f0015604d21e6f76f3e199345679c5. Buildbot: https://lab.llvm.org/buildbot/#/builders/121/builds/29766	2023-04-19 10:03:33 +01:00
OCHyams	fca3e8e024	[Assignment Tracking] Fix fragment error for some DSE-shortened stores `shortenAssignment` inserts dbg.assigns with fragments describing the dead part of a shortened store after each dbg.assign linked to the store. Without this patch it doesn't take into account that the dead part of a shortened store may be outside the bounds of a variable of a linked dbg.assign. It also doesn't correctly account for a non-zero offset in the address modifying `DIExpression` of the dbg.assign (which is possible for fragments now even though whole variables currently cannot have a non-zero offset in their alloca). Fix this by moving the dead slice into variable-space and performing an intersect of that adjusted slice with the existing fragment. This fixes a verifier error reported when building fuchsia with assignment tracking enabled: https://ci.chromium.org/ui/p/fuchsia/builders/ci/clang_toolchain.ci.core.x64-release/b8784000953022145169/overview Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D148536	2023-04-19 09:32:09 +01:00
Krasimir Georgiev	bf7f6b4436	Revert "Reapply [SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating" This reverts commit 6f7e5c0f1ac6cc3349a2e1479ac4208465b272c6. Seems to expose a miscompile in rust, possibly exposing a bug in LLVM somewhere. Investigation thread over at: https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/LLVM.20D146629.20breakage	2023-04-19 08:28:48 +00:00
Nikita Popov	6f7e5c0f1a	Reapply [SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating This exposed a miscompile in GVN, which was fixed by D148129. ----- After D141386, violation of nonnull, range and align metadata results in poison rather than immediate undefined behavior, which means that these are now safe to retain when speculating. We only need to remove UB-implying metadata like noundef. This is done by adding a dropUBImplyingAttrsAndMetadata() helper, which lists the metadata which is known safe to retain on speculation. Differential Revision: https://reviews.llvm.org/D146629	2023-04-17 14:15:14 +02:00
DianQK	2832d7941f	[SROA] Remove UB-implying metadata when promoting speculative instruction. After D138238 introduced the then/else blocks, we should remove UB-implying metadata for the promoted speculative instruction. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148456	2023-04-16 22:35:52 +08:00
Kazu Hirata	1ca496bd61	Remove redundant initialization of std::optional (NFC)	2023-04-16 00:40:05 -07:00
OCHyams	9106960724	[Assignment Tracking][SROA] Don't un-poison dbg.assigns using multiple loc ops Some dbg.assigns using poison become un-poisoned in SROA. The reason this happens at all is because dbg.assigns linked to memory intrinsics use poison to indicate they can't describe the stored value, but the value becomes available after some optimisations. This needs reworking eventually, but for now we need to ensure that when it does occur we don't create invalid expressions. D147312 prevented this occurring when the dbg.assign uses DIArgLists, but that wasn't a complete fix. We also need to ensure we avoid un-poisoning when the existing expression uses more than one location operand (DW_OP_arg, n). Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D148020	2023-04-11 18:18:11 +01:00
OCHyams	086635d6b9	[Assignment Tracking][SROA] Fix fragment when slice size equals variable size Correctly handle the case of splitting an alloca which backs contiguous distinct variables, where a slice's size equals the size of a backed variable. We need to ensure that we don't generate fragments expressions with fragments of the same size as the variable as this is a verifier error. Prior to this patch a fragment expression would be created in this situation. e.g. splitting an alloca i64 with two adjacent 32-bit variables into two 32-bit allocas, the new dbg.assign expressions would contain (DW_OP_LLVM_fragment, 0, 32) and (DW_OP_LLVM_fragment, 32, 32) even though those fragments cover each variable entirely. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D147696	2023-04-06 15:29:18 +01:00
OCHyams	76740fb40e	[Assignment Tracking][SROA] Handle createFragmentExpression failure createFragmentExpression will fail if it determines that the expression cannot be split over fragments. Handle this case in SROA. Similarly to D147312 this should be a rare occurrence as the `dbg.assign` will usually reference the `Value` being stored without modifying it with a `DIExpression`. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D147431	2023-04-05 11:20:32 +01:00
OCHyams	01c02cded1	[Assignment Tracking][SROA] Handle DIArgList in migrateDebugInfo If the to-be-split dbg.assign has a `DIArgList` and a new `Value` has been requested then use a kill-location for the new dbg.assign. We can't simply replace the value component (a `DIArgList`) with the new `Value` as that would leave the `DIExpression` in an invalid state (`DW_OP_LLVM_arg` operands with no arglist). Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D147312	2023-03-31 12:38:47 +01:00
Han Zhu	68a07c156e	[SROA] Fix bug where CandidateTys is appended while being iterated Fix a crash when compiling Skia. See https://reviews.llvm.org/D143225#4180342 for more details	2023-03-09 01:20:08 -08:00
Han Zhu	f9c2a341b9	[SROA] Create additional vector type candidates based on store and load slices Second try at A-Wadhwani's https://reviews.llvm.org/D132096, which was reverted. The original patch had three issues: * https://reviews.llvm.org/D134032, which bjope kindly fixed. That patch is merged into this one. * [GHI #57796](https://github.com/llvm/llvm-project/issues/57796). Fixed and added a test. * [GHI #57821](https://github.com/llvm/llvm-project/issues/57821). I believe this is an undefined behavior which is not the fault of the original patch. Please see the issue for more details. Original diff summary: This patch adds additional vector types to be considered when doing promotion in SROA, based on the types of the store and load slices. This provides more promotion opportunities, by potentially using an optimal "intermediate" vector type. For example, the following code would currently not be promoted to a vector, since `__m128i` is a `<2 x i64>` vector. ``` __m128i packfoo0(int a, int b, int c, int d) { int r[4] = {a, b, c, d}; __m128i rm; std::memcpy(&rm, r, sizeof(rm)); return rm; } ``` ``` packfoo0(int, int, int, int): mov dword ptr [rsp - 24], edi mov dword ptr [rsp - 20], esi mov dword ptr [rsp - 16], edx mov dword ptr [rsp - 12], ecx movaps xmm0, xmmword ptr [rsp - 24] ret ``` By also considering the types of the elements, we could find that the `<4 x i32>` type would be valid for promotion, hence removing the memory accesses for this function. In other words, we can explore other new vector types, with the same size but different element types based on the load and store instructions from the Slices, which can provide us more promotion opportunities. Additionally, the step for removing duplicate elements from the `CandidateTys` vector was not using an equality comparator, which has been fixed. Differential Revision: https://reviews.llvm.org/D143225	2023-03-08 12:01:31 -08:00
Han Zhu	d888496e3c	[SROA] Fix bug where RankVectorTypes is used in std::unique `RankVectorTypes` is not an equivalence relation so when it is used in `std::unique`, the behavior is undefined. Create `RankVectorTypesEq` and use that instead.	2023-03-07 14:10:58 -08:00
Arthur Eubanks	edd021368e	[SROA] Make order of analysis fetching more predictable For pipeline tests.	2023-03-06 09:01:29 -08:00
J. Ryan Stinnett	0bbe6040be	[DebugInfo] Remove `dbg.addr` from Transforms Part of `dbg.addr` removal Discussed in https://discourse.llvm.org/t/what-is-the-status-of-dbg-addr/62898 Differential Revision: https://reviews.llvm.org/D144797	2023-03-02 09:29:43 +00:00
OCHyams	620a529760	[Assignment Tracking] Choose better passes for RemoveRedundantDbgInstrs call Enabling assignment tracking without this patch, a significant amount of additional compiler run time comes from the RemoveRedundantDbgInstrs call in InstCombine. This patch reduces compiler run time by choosing better places to call RemoveRedundantDbgInstrs. In non-assignment-tracking builds, RemoveRedundantDbgInstrs is called by InstCombine if LowerDbgDeclare makes a change (i.e. it is _sometimes_ called). In assignment tracking builds LowerDbgDeclare doesn't do anything. We still need to clean up redundant intrinsics to avoid a large performance hit due to the number of instructions, so the current approach is to have InstCombine _always_ call RemoveRedundantDbgInstrs. Instrumenting the compiler to run RemoveRedundantDbgInstrs after every pass and dump the numbers and building CTMark/tramp3d-v4 indicates that SROA and LoopVectorize give us a bigger bang (number removed) for buck (times pass is run). The compile time tracker reports that this patch reduces the number of instructions retired building CTMark projects by an average of 1.1%. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D144483	2023-02-22 16:28:06 +00:00
Sander de Smalen	462227f115	[SROA] NFC: Look at TypeStoreSize scalable property, rather than at type directly. Some places in the code have checks for isa<ScalableVectorType> and use that to bail out of the code. It's also possible to look directly at the allocated type-size and check if the size is scalable. This means it's possible to also support other scalable types that are not vectors (i.e. TargetExtType). This is split out from D136861.	2023-02-15 09:01:14 +00:00
OCHyams	295f5fafcb	[Assignment Tracking] Fix migrateDebuginfo in SROA Without this patch, migrateDebugInfo doesn't understand how to handle existing fragments that are smaller than the to-be-split store. This can occur if, e.g., a vector store (1 dbg.assign) is split (many dbg.assigns - 1 fragment for each scalar) and later those stores are re-vectorized (many dbg.assigns), and then SROA runs on that. The approach taken in this patch is to drop intrinsics with fragments outside of the slice. For example, starting with: store <2 x float> %v, ptr %dest !DIAssignID !1 call void @llvm.dbg.assign(..., DIExpression(DW_OP_LLVM_fragment, 0, 32), !1, ...) call void @llvm.dbg.assign(..., DIExpression(DW_OP_LLVM_fragment, 32, 32), !1, ...) When visiting the slice of bits 0 to 31 we get: store float %v.extract.0, ptr %dest !DIAssignID !2 call void @llvm.dbg.assign(..., DIExpression(DW_OP_LLVM_fragment, 0, 32), !2, ...) The other dbg.assign associated with the currently-split store is dropped for this split part. And visiting bits 32 to 63 we get the following: store float %v.extract.1, ptr %adjusted.dest !DIAssignID !3 call void @llvm.dbg.assign(..., DIExpression(DW_OP_LLVM_fragment, 32, 32), !3, ...) I've added two tests that cover this case. Implementing this meant re-writing the fragment-calculation part of migrateDebugInfo to work with the absolute offset of the new slice in terms of the base alloca (instead of the offset of the slice into the new alloca), the fragment (if any) of the variable associated with the base alloca, and the fragment associated with the split store. Because we need the offset into the base alloca for the variables being split, some careful wiring is required for memory intrinsics due to the fact that memory intrinsics can be split when either the source or dest allocas are split. In the case where the source alloca drives the splitting, we need to be careful to pass migrateDebugInfo the information in relation to the dest alloca. Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D143146	2023-02-10 18:10:11 +00:00
OCHyams	bb059e85d6	[Assignment Tracking][SROA] Delete dbg.assigns linked to rewritten stores AggLoadStoreRewriter splits aggregate loads and stores into scalars (before the alloca is split up). The new stores and debug intrinsics are already wired up correctly - we just need to also delete the dbg.assign that is linked to the split to-be-deleted store too. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D142882	2023-02-10 09:57:05 +00:00
Nikita Popov	e7fbb381c8	[SROA] Break typed pointer support This removes typed pointer support in a prominent place in the optimization pipeline, to ensure that any non-trivial consumers of tip-of-tree LLVM are aware that this is no longer a supported configuration.	2023-01-26 11:42:12 +01:00
Nikita Popov	d49b842ea2	[SROA] Use copyMetadataForLoad() helper Instead of copying just nonnull metadata, use the generic helper to copy metadata to the new load. This helper is specifically designed for the case where the load type may change, so it's safe to use in this context.	2023-01-20 15:24:10 +01:00
Guillaume Chatelet	8fd5558b29	[NFC] Use TypeSize::getFixedValue() instead of TypeSize::getFixedSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.	2023-01-11 16:49:38 +00:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
Roman Lebedev	2cb393590e	Reland "[NFC][SROA] `speculateSelectInstLoads()`: play nice with typed pointers for now" This reverts commit bf88ba0f8718c1e89e28e977839ad0a6186d44fe, relands 9f27f4536e19e93349b0662338408efe6d1cb2fd, but without a bug: we REALLY should not be defaulting to address space 0 when address space is not specified...	2022-12-22 00:47:40 +03:00
Max Kazantsev	bf88ba0f87	Revert "[NFC][SROA] `speculateSelectInstLoads()`: play nice with typed pointers for now" This reverts commit 9f27f4536e19e93349b0662338408efe6d1cb2fd. Supposed to be NFC, but broke buildbots (test addrspacecast.ll is failing).	2022-12-21 11:21:56 +07:00
Roman Lebedev	9f27f4536e	[NFC][SROA] `speculateSelectInstLoads()`: play nice with typed pointers for now As requested in https://reviews.llvm.org/D138238#inline-1356685	2022-12-21 05:17:02 +03:00
Joshua Cranmer	e6b02214c6	[IR] Add a target extension type to LLVM. Target-extension types represent types that need to be preserved through optimization, but otherwise are not introspectable by target-independent optimizations. This patch doesn't add any uses of these types by an existing backend, it only provides basic infrastructure such that these types would work correctly. Reviewed By: nikic, barannikov88 Differential Revision: https://reviews.llvm.org/D135202	2022-12-20 11:02:11 -05:00
Roman Lebedev	96d3c82645	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 3)" While the PPC little-endian miscompile did get addressed by https://reviews.llvm.org/D140046 the PPC big-endian bots are still unhappy. https://lab.llvm.org/buildbot/#/builders/93/builds/12560 This reverts commit 7bd358bcb4e358b4351c69e02ef76939e08acdc7.	2022-12-16 22:58:41 +03:00
Roman Lebedev	cfd594f8bb	[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 3) * This is a recommit of 3c4d2a03968ccf5889bacffe02d6fa2443b0260f, * which was reverted in 25f01d593ce296078f57e872778b77d074ae5888, because it exposed a miscompile in PPC backend, which was resolved in https://reviews.llvm.org/D140089 / cb3f415cd2019df7d14683842198bc4b7a492bc5. * which was a recommit of cf624b23bc5d5a6161706d1663def49380ff816a, * which was reverted in 5cfc22cafe3f2465e0bb324f8daba82ffcabd0df, because the cut-off on the number of vector elements was not low enough, and it triggered both SDAG SDNode operand number assertions, and caused compile time explosions in some cases. Let's try with something really REALLY conservative first, just to get somewhere, and try to bump it later. FIXME: should this respect TTI reg width * num vec regs? Original commit message: Now, there's a big caveat here - these bytes are abstract bytes, not the i8 we have in LLVM, so strictly speaking this is not exactly legal, see e.g. https://github.com/AliveToolkit/alive2/issues/860 ^ the "bytes" "could" have been a pointer, and loading it as an integer inserts an implicit ptrtoint. But at the same time, InstCombine's `InstCombinerImpl::SimplifyAnyMemTransfer()` would expand a memtransfer of 1/2/4/8 bytes into integer-typed load+store, so this isn't exactly a new problem. Note that in memory, poison is byte-wise, so we really can't widen elements, but SROA seems to be inconsistent here. Fixes #59116.	2022-12-16 19:27:38 +03:00
OCHyams	f354716b05	Reapply [Assignment Tracking][13/*] Account for assignment tracking in SROA The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Split dbg.assign intrinsics into fragments similarly to what SROA already does for dbg.declares, except that there's many more intrinsics to split. The function migrateDebugInfo generates new dbg.assigns intrinsic for each part of a split store. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D133296	2022-12-13 12:52:45 +00:00
Krasimir Georgiev	5e6351d927	Revert "[Assignment Tracking][13/*] Account for assignment tracking in SROA" This reverts commit 3bfba672afd52dfd5bde54dc8b67ec96275f9e15. Temporary revert since this potentially causes https://github.com/llvm/llvm-project/issues/59490.	2022-12-13 09:54:26 +00:00
Roman Lebedev	fd21361a79	[NFC][SROA] `rewriteMemOpOfSelect()`: play nice with typed pointers for now `89a6106ce5 (commitcomment-92824429)`	2022-12-12 21:16:14 +03:00
OCHyams	3bfba672af	[Assignment Tracking][13/*] Account for assignment tracking in SROA The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Split dbg.assign intrinsics into fragments similarly to what SROA already does for dbg.declares, except that there's many more intrinsics to split. The function migrateDebugInfo generates new dbg.assigns intrinsic for each part of a split store. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D133296	2022-12-12 09:24:06 +00:00
Roman Lebedev	89a6106ce5	[SROA] Rewrite store-into-selected-address into predicated stores Same basic idea as with unfolding loads into predicated loads, but we obviously can't have speculative stores.	2022-12-10 21:07:03 +03:00
Roman Lebedev	4f7e5d2206	[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node, take 2 Currently, SROA is CFG-preserving. Not doing so does not affect any pipeline test. (???) Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call. By design, we can't really SROA alloca if their address escapes somehow, but we have logic to deal with `load` of `select`/`PHI`, where at least one of the possible addresses prevents promotion, by speculating the `load`s and `select`ing between loaded values. As one would expect, that requires ensuring that the speculation is actually legal. Even ignoring complexity bailouts, that logic does not deal with everything, e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`. There can also be cases where the load is genuinely non-speculatable. So if we can't prove that the load can be speculated, unfold the select, produce two-entry phi node, and perform predicated load. Now, that transformation must obviously update Dominator Tree, since we require it later on. Doing so is trivial. Additionally, we don't want to do this for the final SROA invocation (D136806). In the end, this ends up having negative (!) compile-time cost: https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u Though indeed, this only deals with `select`s, `PHI`s are still using speculation. Should we update some more analysis? Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138238 This reverts commit 739611870d3b06605afe25cc07833f6a62de9545, and recommits 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5 with a fixed assertion - we should check that DTU is there, not just assert false...	2022-12-08 20:19:55 +03:00
Roman Lebedev	739611870d	Revert "[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node" The assertion about not modifying the CFG seems to not hold, will recommit in a bit. https://lab.llvm.org/buildbot#builders/139/builds/32412 This reverts commit 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5. This reverts commit 4f90f4ada33718f9025d0870a4fe3fe88276b3da.	2022-12-08 19:51:15 +03:00
Roman Lebedev	4f90f4ada3	[NFCI][SROA] `rewriteSelectInstLoads()`: add forgotten false into assertion	2022-12-08 19:40:35 +03:00
Roman Lebedev	03e6d9d9d1	[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node Currently, SROA is CFG-preserving. Not doing so does not affect any pipeline test. (???) Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call. By design, we can't really SROA alloca if their address escapes somehow, but we have logic to deal with `load` of `select`/`PHI`, where at least one of the possible addresses prevents promotion, by speculating the `load`s and `select`ing between loaded values. As one would expect, that requires ensuring that the speculation is actually legal. Even ignoring complexity bailouts, that logic does not deal with everything, e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`. There can also be cases where the load is genuinely non-speculatable. So if we can't prove that the load can be speculated, unfold the select, produce two-entry phi node, and perform predicated load. Now, that transformation must obviously update Dominator Tree, since we require it later on. Doing so is trivial. Additionally, we don't want to do this for the final SROA invocation (D136806). In the end, this ends up having negative (!) compile-time cost: https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u Though indeed, this only deals with `select`s, `PHI`s are still using speculation. Should we update some more analysis? Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138238	2022-12-08 16:51:32 +03:00
Matt Arsenault	27387896cf	SROA: Simplify addrspacecasted allocas with volatile accesses If the alloca is accessed through an addrspacecasted pointer, allow the normal changes on the alloca. Cast back to the original use address space instead of the new alloca's natural address space.	2022-12-02 15:20:56 -05:00
OCHyams	9517806064	Revert "[Assignment Tracking][13/*] Account for assignment tracking in SROA" This reverts commit e16d59973ffec77eeef73409570bdf04a69c2405. Buildbot failure: https://lab.llvm.org/buildbot/#/builders/236/builds/1205	2022-11-28 16:07:34 +00:00
OCHyams	5e0b29bf23	Revert "[Assignment Tracking][SROA] Follow-up for failing test" This reverts commit 285d46ef4b60c0919c00661199c1b010996cc2c1. Failing buildbot: https://lab.llvm.org/buildbot/#/builders/236/builds/1205	2022-11-28 16:07:34 +00:00
OCHyams	285d46ef4b	[Assignment Tracking][SROA] Follow-up for failing test Follow-up for D133296 / e16d59973ffec77eeef73409570bdf04a69c2405 Buildbot example: https://lab.llvm.org/buildbot/#/builders/6/builds/16989	2022-11-28 12:16:37 +00:00
OCHyams	e16d59973f	[Assignment Tracking][13/*] Account for assignment tracking in SROA The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Split dbg.assign intrinsics into fragments similarly to what SROA already does for dbg.declares, except that there's many more intrinsics to split. The function migrateDebugInfo generates new dbg.assigns intrinsic for each part of a split store. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D133296	2022-11-28 11:31:59 +00:00
Roman Lebedev	25f01d593c	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 2)" TableGen is still getting miscompiled on PPC buildbots. Sent a mail with request for help. This reverts commit 3c4d2a03968ccf5889bacffe02d6fa2443b0260f.	2022-11-27 00:00:06 +03:00
Roman Lebedev	3c4d2a0396	[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 2) This is a recommit of cf624b23bc5d5a6161706d1663def49380ff816a, which was reverted in 5cfc22cafe3f2465e0bb324f8daba82ffcabd0df, because the cut-off on the number of vector elements was not low enough, and it triggered both SDAG SDNode operand number assertions, and caused compile time explosions in some cases. Let's try with something really REALLY conservative first, just to get somewhere, and try to bump it (to 64/128) later. FIXME: should this respect TTI reg width * num vec regs? Original commit message: Now, there's a big caveat here - these bytes are abstract bytes, not the i8 we have in LLVM, so strictly speaking this is not exactly legal, see e.g. https://github.com/AliveToolkit/alive2/issues/860 ^ the "bytes" "could" have been a pointer, and loading it as an integer inserts an implicit ptrtoint. But at the same time, InstCombine's `InstCombinerImpl::SimplifyAnyMemTransfer()` would expand a memtransfer of 1/2/4/8 bytes into integer-typed load+store, so this isn't exactly a new problem. Note that in memory, poison is byte-wise, so we really can't widen elements, but SROA seems to be inconsistent here. Fixes #59116.	2022-11-26 23:19:15 +03:00
Benjamin Kramer	5cfc22cafe	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes" This reverts commit cf624b23bc5d5a6161706d1663def49380ff816a. It triggers crashes in clang, see the comments on github on the original change.	2022-11-23 13:11:16 +01:00


543 Commits