llvm-project

Author	SHA1	Message	Date
Florian Hahn	b9cd48f96a	Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709 )" This reverts commit df75183d70e029352a49c93f275db703c81a65c1. Revert for now as this appears to cause failures on some buildbots, e.g.: https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio	2024-03-27 21:22:15 +00:00
Julian Nagele	df75183d70	[TBAA] Add verifier for tbaa.struct metadata (#86709 ) Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: https://github.com/llvm/llvm-project/pull/86709	2024-03-27 10:30:27 +01:00
Arthur Eubanks	eae4f56cb4	[SROA] Fix phi gep unfolding with an alloca not in entry block Fixes a crash reported in #83494.	2024-03-07 07:23:48 +00:00
Jeffrey Byrnes	1e828f838c	[SROA]: Only defer trying partial sized ptr or ptr vector types Change-Id: Ic77f87290905addadd5819dff2d0c62f031022ab	2024-03-05 08:52:07 -08:00
Arthur Eubanks	8848258f7b	[SROA] Unfold gep of index phi (round 2) (#83494 ) If a gep has only one phi as one of its operands and the remaining indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi ((gep ptr, idx1), (gep ptr, idx2))`. Take care not to unfold recursive phis. Followup to #80983. This was initially #83087. The initial PR did not handle allocas in the entry block that weren't at the beginning of the function, causing GEPs to be inserted after the first chunk of allocas but potentially before an alloca not at the beginning. Insert GEPs at the end of the entry block instead since constants/arguments/static allocas can all be used there.	2024-03-04 14:21:26 -08:00
Arthur Eubanks	de8e2b7b86	[test][SROA] Regenerate vector-promotion.ll	2024-02-29 18:53:25 +00:00
Fangrui Song	43b7dfcc1d	Revert "[SROA] Unfold gep of index phi (#83087 )" This reverts commit 2eb63982e88b9ed8336158d35884b1a1d04a0f78. This caused verifier error ``` Instruction does not dominate all uses! ``` for some projects using Halide. The verifier error happens inside `Halide::Internal::CodeGen_LLVM::optimize_module` and looks like a genuine SROA issue.	2024-02-28 15:56:43 -08:00
Arthur Eubanks	2eb63982e8	[SROA] Unfold gep of index phi (#83087 ) If a gep has only one phi as one of its operands and the remaining indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi ((gep ptr, idx1), (gep ptr, idx2))`. Take care not to unfold recursive phis. Followup to #80983.	2024-02-28 10:53:47 -08:00
Stephen Tozer	d128448efd	Revert "Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 )"" Reverted due to some test failures on some buildbots. https://lab.llvm.org/buildbot/#/builders/67/builds/14669 This reverts commit aa436493ab7ad4cf323b0189c15c59ac9dc293c7.	2024-02-27 10:17:24 +00:00
Stephen Tozer	aa436493ab	Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 )" Fixes the prior issue in which the symbol for a cl-arg was unavailable to some binaries. This reverts commit dc06d75ab27b4dcae2940fc386fadd06f70faffe.	2024-02-27 09:59:08 +00:00
Stephen Tozer	dc06d75ab2	Revert "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 )" Reverted due to failures on buildbots, where a new cl flag was placed in the wrong file, resulting in link errors. https://lab.llvm.org/buildbot/#/builders/198/builds/8548 This reverts commit 0b398256b3f72204ad1f7c625efe4990204e898a.	2024-02-26 18:49:18 +00:00
Stephen Tozer	0b398256b3	[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 ) This patch adds support for printing the proposed non-instruction debug info ("RemoveDIs") out to textual IR. This patch does not add any bitcode support, parsing support, or documentation. Printing of the new format is controlled by a flag added in this patch, `--write-experimental-debuginfo`, which defaults to false. The new format will be printed iff this flag is true, so whether we use the IR format is completely independent of whether we use non-instruction debug info during LLVM passes (which is controlled by the `--try-experimental-debuginfo-iterators` flag). Even with the flag disabled, some existing tests need to be updated, as this patch causes debug intrinsic declarations to be changed in a round trip, such that they always appear at the end of a module and have no attributes (this has no functional change on the module). The design of this new IR format was proposed previously on Discourse, and any further discussion about the design can still be contributed there: https://discourse.llvm.org/t/rfc-debuginfo-proposed-changes-to-the-textual-ir-representation-for-debug-values/73491	2024-02-26 18:22:05 +00:00
Florian Hahn	dc85719d5b	[TBAA] Use !tbaa for first accessed field if it is an exact match in offset and size. (#81313 ) Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on https://github.com/llvm/llvm-project/pull/81289. PR: https://github.com/llvm/llvm-project/pull/81313	2024-02-16 19:23:14 +00:00
Florian Hahn	53c0e809fa	[SROA] Use !tbaa instead of !tbaa.struct if op matches field. (#81289 ) If a split memory access introduced by SROA accesses precisely a single field of the original operation's !tbaa.struct, use the !tbaa tag for the accessed field directly instead of the full !tbaa.struct. InstCombine already had a similar logic. Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on https://github.com/llvm/llvm-project/pull/81285.	2024-02-16 13:45:01 +00:00
Florian Hahn	2a9b86cc10	[SROA] Extend !tbaa.struct test coverage with multiple missing cases. Add tests to cover missing cases for https://github.com/llvm/llvm-project/pull/81289 and https://github.com/llvm/llvm-project/pull/81313.	2024-02-15 21:21:18 +00:00
Florian Hahn	2884d04839	[SROA] Add additional tests for splitting up ops with !tbaa.struct.	2024-02-09 17:16:21 +00:00
Nikita Popov	2f8e37d201	[SROA] Unfold gep of index select (#80983 ) SROA currently supports converting a gep of select into select of gep if the select is in the pointer operand. This patch expands support to selects in an index operand. This is intended to address the regression reported in https://github.com/llvm/llvm-project/pull/68882#issuecomment-1924909922.	2024-02-09 09:36:05 +01:00
Jeremy Morse	a643ab852a	[DebugInfo][RemoveDIs] Final omnibus test fixing for RemoveDIs (#81125 ) With this, I get a clean test suite running under RemoveDIs, the non-intrinsic representation of debug-info, including under asan. We've previously established that we generate identical binaries for some large projects, so this is just edge-case cleanup. The changes: * CodeGenPrepare fixups need to apply to dbg.assigns as well as dbg.values (a dbg.assign is a dbg.value). * Pin a test for constant-deletion to intrinsic debug-info: this very rare scenario uses a different kill-location sigil in dbg.value mode to RemoveDIs mode, which generates spurious test differences. * Suppress a memory leak in a unit test: the code for dealing with trailing debug-info in a block is necessarily fiddly, leading to this leak when testing it. Developer-facing interfaces for moving instructions around always deal with this behind the scenes. * SROA, when replacing some vector-loads, needs to insert the replacement loads ahead of any debug-info records so that their values remain dominated by a definition. Set the head-bit indicating our insertion should come before debug-info.	2024-02-08 11:49:04 +00:00
Nikita Popov	fb581adbdd	[SROA] Add tests with gep of index select (NFC)	2024-02-07 13:01:10 +01:00
Nikita Popov	2d69827c5c	[Transforms] Convert tests to opaque pointers (NFC)	2024-02-05 11:57:34 +01:00
Jeffrey Byrnes	f709fbb1bb	[SROA] Only try additional vector type candidates when needed (#77678 ) `f9c2a341b9` causes regressions when we have a slice with integer vector type that is the same size as the partition, and a ptr load/store slice that is not the size of the element type. Ref `vector-promotion.ll:ptrLoadStoreTys`. Before the patch, we would only consider `<4 x i32>` as a candidate type for vector promotion, and would find that it is a viable type for all the slices. After the patch, we now add `<2 x ptr>` as a candidate type due to the slice with user `store ptr %val0, ptr %obj, align 8` -- and flag that we `HaveVecPtrTy`. The pre-existing behavior of this flag results in removing the viable `<4 x i32>` and keeping only the unviable `<2 x ptr>`, which results in a failure to promote. The end result is failing to promote an alloca that was previously promoted -- this does not appear to be the intent of that patch, which has the goal of increasing promotions by providing more promotion opportunities. This PR preserves this behavior via a simple reorganization of the implementation: first try the slice types with the same size as the partition, then, if there is no promotable type, try the `LoadStoreTys`.	2024-01-23 17:22:49 -08:00
Jeffrey Byrnes	766e645d8d	[SROA] NFC: Precommit test for pull/77678 Change-Id: I6b2346301f9bd840a0adceba4a0d03e9932af245	2024-01-23 16:37:35 -08:00
Nikita Popov	54067c5fbe	[SROA] Use memcpy if type size does not match store size The original memcpy also copies the padding, so make sure that this is still the case after splitting. Fixes https://github.com/llvm/llvm-project/issues/64081.	2023-12-22 10:19:22 +01:00
Nikita Popov	3f199cb14c	[SROA] Add test for #64081 (NFC)	2023-12-22 10:19:21 +01:00
Jeremy Morse	d2d9dc8eb4	[DebugInfo][RemoveDIs] Make debugify pass convert to/from RemoveDIs mode (#73251 ) Debugify is extremely useful as a testing and debugging tool, and a good number of LLVM-IR transform tests use it. We need it to support "new" non-instruction debug-info to get test coverage, but it's not important enough to completely convert right now (and it'd be a large undertaking). Thus: convert to/from dbg.value/DPValue mode on entry and exit of the pass, which gives us the functionality without any further work. The cost is compile-time, but again this is only happening during tests. Tested by: the large set of debugify tests enabled here. Note the InstCombine test (cast-mul-select.ll) that hasn't been fully enabled: this is because there's a debug-info sinking piece of code there that hasn't been instrumented.	2023-11-29 13:19:50 +00:00
Jeremy Morse	80d3a4c39f	[DebugInfo][RemoveDIs] Add local-utility plumbing for DPValues (#72276 ) This patch re-implements a variety of debug-info maintenance functions to use DPValues instead of DbgValueInst's: supporting the "new" non-intrinsic representation of debug-info. As per [0], we need to have parallel implementations of various utilities for a time, and these are the most fundamental utilities used throughout the compiler. I've added --try-experimental-debuginfo-iterators to a variety of RUN lines: this is a flag that turns on "new debug-info" if it's built into LLVM, and not otherwise. This should ensure that we have the same behaviour for the same IR inputs, but using a different internal representation. For the most part these changes affect SROA/Mem2Reg promotion of dbg.declares into dbg.value intrinsics (now DPValues), we're leaving dbg.declares as instructions until later in the day. There's also some salvaging changes made. I believe the tests that I've added cover almost all the code being updated here. The only thing I'm not confident about is SimplifyCFG, which calls rewriteDebugUsers down a variety of code paths. Those changes can't immediately get full coverage as an additional patch is needed that updates handling of Unreachable instructions, will upload that shortly. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939/9	2023-11-20 16:56:31 +00:00
Antonio Frighetto	ebbb9cdbb3	[SROA] Allow `llvm.launder.invariant.group` intrinsic to be splittable Let `llvm.launder.invariant.group` intrinsic as well as instructions operating on memory addresses, whose invariance may be broken by the intrinsic, to be rewritten. Fixes: https://github.com/llvm/llvm-project/issues/72035.	2023-11-16 17:56:34 +01:00
Nikita Popov	ed86e740ef	Revert "[SROA] Limit the number of allowed slices when trying to split allocas" This reverts commit e13e808283f7fd9e873ae922dd1ef61aeaa0eb4a. This causes performance regressions on GPU targets, see https://github.com/llvm/llvm-project/issues/69785. Revert the change for now.	2023-11-09 16:38:52 +01:00
Nikita Popov	57a554800b	[SROA] Don't shrink volatile load past end For volatile atomic, this may result in verifier errors, if the new alloca type is not legal for atomic accesses. I've opted to disable this special case for volatile accesses in general, as changing the size of the volatile access seems dubious in any case. Fixes https://github.com/llvm/llvm-project/issues/64721.	2023-09-20 14:12:31 +02:00
Paul Walker	c7d65e4466	[IR] Enable load/store/alloca for arrays of scalable vectors. Differential Revision: https://reviews.llvm.org/D158517	2023-09-14 13:49:01 +00:00
Jeremy Morse	1ce1732f82	[DebugInfo] Use getStableDebugLoc to pick IRBuilder DebugLocs When IRBuilder is given an insertion position and there is debug-info, it sets the DebugLoc of newly inserted instructions to the DebugLoc of the insertion position. Unfortunately, that means if you insert in front of a debug intrinsics, your "real" instructions get potentially-misleading source locations from the debug intrinsics. Worse, if you compile -gmlt to get source locations but no variable locations, you'll get different source locations to a normal -g build, which is silly. Rectify this with the getStableDebugLoc method, which skips over any debug intrinsics to find the next "real" instruction. This is the source location that you would get if you compile with -gmlt, and it remains stable in the presence of debug intrinsics. The changed tests show a few locations where this has been happening, for example selecting line-zero locations for instrumentation on a perfectly valid call site. Differential Revision: https://reviews.llvm.org/D159485	2023-09-11 19:00:44 +01:00
Dhruv Chawla	e13e808283	[SROA] Limit the number of allowed slices when trying to split allocas This patch adds a hidden CLI option "--sroa-max-alloca-slices", which is an integer that controls the maximum number of alloca slices SROA can consider before bailing out. This is useful because it may not be profitable to split memcpys into (possibly tens of) thousands of loads/stores. This also prevents an issue with exponential compile time explosion in passes like DSE and MemCpyOpt caused by excessive alloca splitting. Fixes https://github.com/rust-lang/rust/issues/88580. Differential Revision: https://reviews.llvm.org/D159354	2023-09-09 11:00:47 +05:30
Zhongyunde	c570531c3d	[SROA] Skip uses of allocas where the type is scalable When visiting load and store instructions in SROA skip scalable vectors. This is relevant in the implementation of the 'arm_sve_vector_bits' attribute that is used to define VLS types, similar to D85725. Fix https://gcc.godbolt.org/z/o561P9zj4 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D158631	2023-08-24 18:44:09 +08:00
eopXD	39a41c8905	[CGCall][RISCV] Handle function calls with parameter of RVV tuple type This was an oversight in D146872, where function calls with tuple type was not covered. This commit fixes this. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157953	2023-08-22 23:41:23 -07:00
Krzysztof Drewniak	faa2c678aa	[AMDGPU] Add buffer intrinsics that take resources as pointers In order to enable the LLVM frontend to better analyze buffer operations (and to potentially enable more precise analyses on the backend), define versions of the raw and structured buffer intrinsics that use `ptr addrspace(8)` instead of `<4 x i32>` to represent their rsrc arguments. The new intrinsics are named by replacing `buffer.` with `buffer.ptr`. One advantage to these intrinsic definitions is that, instead of specifying that a buffer load/store will read/write some memory, we can indicate that the memory read or written will be based on the pointer argument. This means that, for example, a read from a `noalias` buffer can be pulled out of a loop that is modifying a distinct buffer. In the future, we will define custom PseudoSourceValues that will allow us to package up the (buffer, index, offset) triples that buffer intrinsics contain and allow for more precise backend analysis. This work also enables creating address space 7, which represents manipulation of raw buffers using native LLVM load and store instructions. Where tests simply used a buffer intrinsic while testing some other code path (such as the tests for VGPR spills), they have been updated to use the new intrinsic form. Tests that are "about" buffer intrinsics (for instance, those that ensure that they codegen as expected) have been duplicated, either within existing files or into new ones. Depends on D145441 Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D147547	2023-06-05 16:59:07 +00:00
eopXD	c8eb535aed	[1/11][IR] Permit load/store/alloca for struct of the same scalable vector type This patch-set aims to simplify the existing RVV segment load/store intrinsics to use a type that represents a tuple of vectors instead. To achieve this, first we need to relax the current limitation for an aggregate type to be a target of load/store/alloca when the aggregate type contains homogeneous scalable vector types. Then to adjust the prolog of an LLVM function during lowering to clang. Finally we re-define the RVV segment load/store intrinsics to use the tuple types. The pull request under the RVV intrinsic specification is riscv-non-isa/rvv-intrinsic-doc#198 --- This is the 1st patch of the patch-set. This patch is originated from D98169. This patch allows aggregate type (StructType) that contains homogeneous scalable vector types to be a target of load/store/alloca. The RFC of this patch was posted in LLVM Discourse. https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527 The main changes in this patch are: Extend `StructLayout::StructSize` from `uint64_t` to `TypeSize` to accommodate an expression of scalable size. Allow `StructType:isSized` to also return true for homogeneous scalable vector types. Let `Type::isScalableTy` return true when `Type` is `StructType` and contains scalable vectors Extra description is added in the LLVM Language Reference Manual on the relaxation of this patch. Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Co-Authored-by: eop Chen <eop.chen@sifive.com> Reviewed By: craig.topper, nikic Differential Revision: https://reviews.llvm.org/D146872	2023-05-19 09:39:36 -07:00
ManuelJBrito	8b56da5e9f	[IR] Change shufflevector undef mask to poison With this patch an undefined mask in a shufflevector will be printed as poison. This change is done to support the new shufflevector semantics for undefined mask elements. Differential Revision: https://reviews.llvm.org/D149210	2023-04-27 14:41:10 +01:00
DianQK	2832d7941f	[SROA] Remove UB-implying metadata when promoting speculative instruction. After D138238 introduced the then/else blocks, we should remove UB-implying metadata for the promoted speculative instruction. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148456	2023-04-16 22:35:52 +08:00
Nikita Popov	fc6e91fe81	[Local] Handle size mismatch between pointer/int in copyRangeMetadata() SROA may convert a wide integer load into a narrow pointer load, make sure we don't crash. It would not be legal to transfer the metadata in this case.	2023-03-31 12:20:34 +02:00
Han Zhu	f9c2a341b9	[SROA] Create additional vector type candidates based on store and load slices Second try at A-Wadhwani's https://reviews.llvm.org/D132096, which was reverted. The original patch had three issues: * https://reviews.llvm.org/D134032, which bjope kindly fixed. That patch is merged into this one. * [GHI #57796](https://github.com/llvm/llvm-project/issues/57796). Fixed and added a test. * [GHI #57821](https://github.com/llvm/llvm-project/issues/57821). I believe this is an undefined behavior which is not the fault of the original patch. Please see the issue for more details. Original diff summary: This patch adds additional vector types to be considered when doing promotion in SROA, based on the types of the store and load slices. This provides more promotion opportunities, by potentially using an optimal "intermediate" vector type. For example, the following code would currently not be promoted to a vector, since `__m128i` is a `<2 x i64>` vector. ``` __m128i packfoo0(int a, int b, int c, int d) { int r[4] = {a, b, c, d}; __m128i rm; std::memcpy(&rm, r, sizeof(rm)); return rm; } ``` ``` packfoo0(int, int, int, int): mov dword ptr [rsp - 24], edi mov dword ptr [rsp - 20], esi mov dword ptr [rsp - 16], edx mov dword ptr [rsp - 12], ecx movaps xmm0, xmmword ptr [rsp - 24] ret ``` By also considering the types of the elements, we could find that the `<4 x i32>` type would be valid for promotion, hence removing the memory accesses for this function. In other words, we can explore other new vector types, with the same size but different element types based on the load and store instructions from the Slices, which can provide us more promotion opportunities. Additionally, the step for removing duplicate elements from the `CandidateTys` vector was not using an equality comparator, which has been fixed. Differential Revision: https://reviews.llvm.org/D143225	2023-03-08 12:01:31 -08:00
Han Zhu	d888496e3c	[SROA] Fix bug where RankVectorTypes is used in std::unique `RankVectorTypes` is not an equivalence relation so when it is used in `std::unique`, the behavior is undefined. Create `RankVectorTypesEq` and use that instead.	2023-03-07 14:10:58 -08:00
J. Ryan Stinnett	0bbe6040be	[DebugInfo] Remove `dbg.addr` from Transforms Part of `dbg.addr` removal Discussed in https://discourse.llvm.org/t/what-is-the-status-of-dbg-addr/62898 Differential Revision: https://reviews.llvm.org/D144797	2023-03-02 09:29:43 +00:00
Sander de Smalen	0d94b63604	[IR] Add LLVM IR support for target("aarch64.svcount") type. The C and C++ Language Extensions for AArch64 SME2 [1] adds a new type called `svcount_t` which describes a predicate. This is not a predicate vector mask, but rather a description of a predicate vector mask that can be expanded into a mask using explicit instructions. The type is a scalable opaque type. To implement `svcount_t` type this patch uses the existing Target Extension Type mechanism, but adds further support so that this type can be a scalable type. AArch64 CodeGen support will follow in a separate patch. [1] https://github.com/ARM-software/acle/pull/217 Reviewed By: jcranmer-intel, nikic Differential Revision: https://reviews.llvm.org/D136861	2023-03-01 08:17:53 +00:00
Han Zhu	66b8d2bb71	[SROA] Pre-commit vector-promotion.ll tests for D143225	2023-02-08 10:30:48 -08:00
Nikita Popov	e6241cbdcb	[Mem2Reg] Only convert !nonnull to assume if !noundef present After D141386 !nonnull violation returns poison rather than resulting in immediate undefined behavior. However, converting it into an assume would result in IUB. As such, we can only perform this transform if !noundef is also present.	2023-01-20 16:38:26 +01:00
Nikita Popov	a4898b437d	[Local] Preserve range metadata if the type did not change In copyRangeMetadata() and by extension copyLoadMetadata(), handle the trivial case where the type did not change, in which case we can simply preserve the range metadata as is.	2023-01-20 15:28:32 +01:00
Nikita Popov	d49b842ea2	[SROA] Use copyMetadataForLoad() helper Instead of copying just nonnull metadata, use the generic helper to copy metadata to the new load. This helper is specifically designed for the case where the load type may change, so it's safe to use in this context.	2023-01-20 15:24:10 +01:00
Nikita Popov	269cfd3156	[SROA] Add additional metadata preservation tests (NFC)	2023-01-20 15:07:33 +01:00
Nikita Popov	72dc033fa6	[SROA] Check TBAA metadata in tests (NFC) By switching to --check-globals. Also make sure that the !tbaa.struct metadata mapping is preserved.	2023-01-19 17:15:56 +01:00
Roman Lebedev	1578c670ff	[NFC][SROA] Variably-indexed load: add test variation w/ upper half of alloca being zeros This is the actual pattern i'm looking at.	2022-12-23 20:16:41 +03:00
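
The well-formedness rule described in the df75183d70 entry above (operands in groups of three, each group being two integers followed by a TBAA node, with non-overlapping offset/size regions) can be illustrated with a small standalone sketch. This is not the verifier code from that commit: the `Operand` type and the `isWellFormedTBAAStruct` function are hypothetical stand-ins that model metadata operands without depending on LLVM, and the sketch assumes regions may appear in any order.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical operand model (not an LLVM class): either an integer
// constant or a reference to a TBAA tag node.
struct Operand {
  enum Kind { Integer, TBAATag };
  Kind K;
  std::uint64_t Value; // Only meaningful when K == Integer.
};

// Returns true if the operand list follows the shape described in the
// commit message: (offset, size, tag) triples with non-overlapping regions.
bool isWellFormedTBAAStruct(const std::vector<Operand> &Ops) {
  // Operands must come in groups of three.
  if (Ops.size() % 3 != 0)
    return false;

  std::vector<std::pair<std::uint64_t, std::uint64_t>> Regions; // (offset, size)
  for (std::size_t I = 0; I < Ops.size(); I += 3) {
    // Each group is: integer offset, integer size, TBAA tag.
    if (Ops[I].K != Operand::Integer || Ops[I + 1].K != Operand::Integer ||
        Ops[I + 2].K != Operand::TBAATag)
      return false;
    Regions.emplace_back(Ops[I].Value, Ops[I + 1].Value);
  }

  // Sort by offset so that any overlap shows up between neighbours.
  std::sort(Regions.begin(), Regions.end());
  for (std::size_t I = 1; I < Regions.size(); ++I) {
    // The previous region ends at offset + size; arithmetic overflow is
    // deliberately ignored in this sketch.
    if (Regions[I - 1].first + Regions[I - 1].second > Regions[I].first)
      return false;
  }
  return true;
}
```

Sorting before the overlap check is a choice made for this sketch only; whether the actual verifier requires the fields to be listed in offset order is not stated in the commit message.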


341 Commits
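
The d888496e3c entry above points out that `std::unique` needs an equivalence relation as its predicate, so a ranking ("less than") comparator that is suitable for sorting must not be reused for deduplication. A minimal sketch of the sort-then-unique pattern with two separate predicates follows. The `VecTy` struct and the `rankLess`, `rankEq`, and `sortAndDedup` names are hypothetical and only model the idea; they are not the SROA implementation.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a fixed-width vector type (not an LLVM class).
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
};

// Strict ordering used for sorting: fewer elements ranks first.
bool rankLess(const VecTy &A, const VecTy &B) { return A.NumElts < B.NumElts; }

// Equivalence used for deduplication: same element count means same rank.
// Unlike rankLess, this is reflexive and symmetric, as std::unique requires.
bool rankEq(const VecTy &A, const VecTy &B) { return A.NumElts == B.NumElts; }

// Sort candidates by rank, then drop neighbours that share a rank.
std::vector<VecTy> sortAndDedup(std::vector<VecTy> Tys) {
  std::sort(Tys.begin(), Tys.end(), rankLess);
  Tys.erase(std::unique(Tys.begin(), Tys.end(), rankEq), Tys.end());
  return Tys;
}
```

Passing `rankLess` to `std::unique` would compile, but it would violate the algorithm's requirements, which is the undefined behavior the commit message refers to.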