llvm-project

Author	SHA1	Message	Date
Florian Hahn	e77378cc14	[Matrix] Adjust lifetime.ends during multiply fusion. (#84914 ) At the moment, loads introduced by multiply fusion may be placed after an object's lifetime has been terminated by lifetime.end. This introduces reads to dead objects. To avoid this, first collect all lifetime.end calls in the function. During fusion, we deal with any lifetime.end calls that may alias any of the loads. Such lifetime.end calls are either moved when possible (both the lifetime.end and the store are in the same block) or deleted. PR: https://github.com/llvm/llvm-project/pull/84914	2024-03-16 20:41:36 +01:00
Florian Hahn	d96d917f38	[Matrix] Add tests showing mis-compile with lifetime.end and fusion. Add a set of tests showing miscompiles due to multiply fusion introducing loads to dead objects after lifetime.end.	2024-03-12 13:33:55 +00:00
Florian Hahn	dbe4143f23	[Matrix] Fix dimensions when hoisting transpose across add. (#81507 ) Row and column arguments for matrix_transpose indicate the shape of the operand. When hoisting the transpose to the result of the add, the add operates on the original operand's shape, and so does the hoisted transpose. This patch also adds an assert that the shape for the original add and the transpose match, as well as the shape of the new add matches the cached shape for it. The assert could potentially be moved to updateShapeAndReplaceAllUsesWith.	2024-02-12 18:45:13 +00:00
Florian Hahn	673e5e34b4	[Matrix] Add dedicated tests for transpose lifting. Add extra test coverage for transpose lifting using -matrix-print-after-transpose-opt. The added tests show a mis-compile.	2024-02-12 16:19:31 +00:00
Alexey Bataev	7bc079c852	[TTI]Fallback to SingleSrcPermute shuffle kind, if no direct estimation for extract subvector. Many targets do not have cost for extractsubvector shuffle kind, but have the costs for single source permute. If there is no cost estimation for extractsubvector, it is better to switch to single source permute for better cost estimation. Reviewers: RKSimon, davemgreen, arsenm Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/79837	2024-02-12 07:09:49 -05:00
Florian Hahn	f89fe08d77	[Matrix] Convert column-vector ops feeding dot product to row-vectors. (#72647 ) Generalize the logic used to convert column-vector ops to row-vectors to support converting chains of operations. A potential next step is to further generalize this to convert column-vector ops to row-vector ops in general, not just for operands of dot products. Dot-product handling would then be driven by the general conversion, rather than the other way around. PR: https://github.com/llvm/llvm-project/pull/72647	2024-02-06 13:47:31 +00:00
Nikita Popov	90ba33099c	[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882 ) This patch canonicalizes getelementptr instructions with constant indices to use the `i8` source element type. This makes it easier for optimizations to recognize that two GEPs are identical, because they don't need to see past many different ways to express the same offset. This is a first step towards https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699. This is limited to constant GEPs only for now, as they have a clear canonical form, while we're not yet sure how exactly to deal with variable indices. The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives two representative examples of the kind of optimization improvement we expect from this change. In the first test SimplifyCFG can now realize that all switch branches are actually the same. In the second test it can convert it into simple arithmetic. These are representative of common optimization failures we see in Rust. Fixes https://github.com/llvm/llvm-project/issues/69841.	2024-01-24 15:25:29 +01:00
Nikita Popov	a5f3415533	[InstCombine] Replace non-demanded undef vector with poison If an operand (esp to shufflevector or insertelement) is not demanded, canonicalize it from undef to poison.	2023-12-18 16:12:37 +01:00
Nikita Popov	d0605e21af	[InstCombine] Canonicalize splat shuffles to use poison operand If the splat shuffle is represented using an undef RHS, replace it with poison.	2023-12-18 15:57:49 +01:00
Dmitriy Smirnov	e13bed4c5f	[PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP This patch tries to canonicalise add + gep to gep + gep. Co-authored-by: Paul Walker <paul.walker@arm.com> Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D155688	2023-10-06 12:29:06 +01:00
David Green	2a859b2014	[AArch64] Change the cost of vector insert/extract to 2 The cost of vector instructions has always been high under AArch64, in order to add a high cost for inserts/extracts, shuffles and scalarization. This is a conservative approach to limit the scope of unusual SLP vectorization where the codegen ends up being quite poor, but has always been higher than the correct costs would be for any specific core. This relaxes that, reducing the vector insert/extract cost from 3 to 2. It is a generalization of D142359 to all AArch64 cpus. The ScalarizationOverhead is also overridden for integer vector at the same time, to remove the effect of lane 0 being considered free for integer vectors (something that should only be true for float when scalarizing). The lower insert/extract cost will reduce the cost of insert, extracts, shuffling and scalarization. The adjustments of ScalarizationOverhead will increase the cost on integer, especially for small vectors. The end result will be lower cost for float and long-integer types, some higher cost for some smaller vectors. This, along with the raw insert/extract cost being lower, will generally mean more vectorization from the Loop and SLP vectorizer. We may end up regretting this, as that vectorization is not always profitable. In all the benchmarking I have done this is generally an improvement in the overall performance, and I've attempted to address the places where it wasn't with other costmodel adjustments. Differential Revision: https://reviews.llvm.org/D155459	2023-07-28 21:26:50 +01:00
Nikita Popov	bc39a7a5e4	[LowerMatrixIntrinsics] Fix test expectations (NFC) Some of the test expectations were incorrectly changed in 23c21759458014fc4d7cbea45b6fbe7349a0a4fd. Regenerate the tests.	2023-07-18 11:21:11 +02:00
Nuno Lopes	23c2175945	[LowerMatrixIntrinsics] Use poison instead of undef as placeholder [NFC] These values don't propagate to the output; they are always replaced with a subsequent shuffle or insertelement. Tested equivalence with Alive2, e.g., https://alive2.llvm.org/ce/z/fj4s78.	2023-07-18 09:54:41 +01:00
Florian Hahn	c10a7772bd	[Matrix] Convert binop operand of dot product to a row vector. The dot product lowering will use the left operand as row vector. If the operand is a binary op, convert it to operate on a row vector instead of a column vector. Depends on D148428. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D148429	2023-06-07 20:45:08 +01:00
Florian Hahn	ebbcbb2af5	[Matrix] Remove redundant transpose with dot product lowering. Extend dot-product handling to skip transposes of the first operand. As this is a vector, the conversion between column and row vector via the transpose isn't needed. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D148428	2023-05-14 22:07:38 +01:00
Florian Hahn	0e8717f711	[Matrix] Add shape verification. At the moment, lower-matrix-intrinsics accepts mis-matches between shapes for operations. See shape-verification.ll for an example where @llvm.matrix.column.major.load specifies 6x1 and then the use (@llvm.matrix.multiply) specifies the operand to have 1x6. This patch adds verification for shapes to check if shapes match. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D147438	2023-05-13 09:41:27 +01:00
ManuelJBrito	8b56da5e9f	[IR] Change shufflevector undef mask to poison With this patch an undefined mask in a shufflevector will be printed as poison. This change is done to support the new shufflevector semantics for undefined mask elements. Differential Revision: https://reviews.llvm.org/D149210	2023-04-27 14:41:10 +01:00
Florian Hahn	f10153fe91	[Matrix] Handle integer types when distributing transposes across adds. The current code did not properly account for integer matrixes. Check if the operands are floating point or integer matrixes and use FAdd/Add accordingly. This is already done for other cases, like multiplies. Fixes #62281.	2023-04-21 16:35:11 +01:00
Florian Hahn	a25b962a7f	[Matrix] Split off transpose + dot product tests.	2023-04-15 14:06:47 +01:00
Florian Hahn	98e50881e9	[Matrix] Refine cost estimate for dot-product. Adjust lowerDotProduct cost estimate to include the cost benefits of: * emitting a wide load * emitting a wide multiply. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D147330	2023-04-14 11:35:01 +01:00
Florian Hahn	677b0d33e3	[Matrix] Add dot product tests with builtin loads with variable strides Extra tests for D147330.	2023-04-14 10:40:47 +01:00
Florian Hahn	e6ab86a887	[Matrix] Fix IsSupported check in lowerDotProduct. The check incorrectly checks the RHS while LHS is transformed later. Update to check LHS, which fixes a crash in the newly added test cases.	2023-04-13 19:00:30 +01:00
Florian Hahn	78148eba49	[Matrix] Fix crash during dot product lowering. Perform dot-product lowering before instruction fusion to avoid crash in newly added test. Also update lowerDotProduct to properly mark optimized matmul as fused.	2023-04-12 15:08:39 +01:00
Florian Hahn	04681243b4	[Matrix] Limit dot lowering to column major matrixes. Limit dot product lowering to column major matrixes for now. This simplifies the code and reasoning for upcoming planned improvements. Support for row-major matrixes can be added later as an extension.	2023-04-05 15:49:06 +01:00
Florian Hahn	17fc38889a	[Matrix] Add dotproduct tests with row-major default layout.	2023-04-05 15:19:11 +01:00
Florian Hahn	2f21659ee9	[Matrix] Add test variants where 2nd operand of dotprod is add/sub.	2023-04-05 15:04:05 +01:00
Florian Hahn	c0dbe85790	[Matrix] Fix shapes in dot product tests. The shape arguments for the @llvm.matrix.column.major.load were incorrect. Flip them so they are in sync with the shape of the multiplications.	2023-04-03 12:50:05 +01:00
Vir Narula	e7281c6f61	[Matrix] Add special case dot product lowering Add special case to matrix lowering for dot products. Normal matrix lowering is optimized for either row-major or column-major, which results in many `shufflevector` instructions being generated for one vector. We work around this in our special case. We can also use vector-reduce adds instead of sequential adds to sum the result of the element-wise multiplication, which takes advantage of SIMD instructions. Reviewed By: fhahn, thegameg Differential Revision: https://reviews.llvm.org/D131125	2023-03-31 12:40:20 +01:00
Florian Hahn	16a008bbde	[Matrix] Update most dot tests using vXi64 to vXi32. Update dot-product-int.ll tests to use mostly i32 instead of i64; there's no mul.2d instruction, so vector versions of v2i64 cannot be lowered efficiently.	2023-03-31 12:32:41 +01:00
Florian Hahn	22ebb49b9f	[Matrix] Extend test coverage for dot product lowering. Extra tests: * result is used by instruction * constant vector operands * multiply fed by other math instructions * extra test with larger stride	2023-03-25 21:30:20 +00:00
Florian Hahn	18353d221d	[Matrix] Split up dot product tests into integer and float variants. To avoid the individual files getting too big with further additions.	2023-03-25 21:23:01 +00:00
Jannik Silvanus	a4753f5dc0	[IR] Avoid creation of GEPs into vectors (in one place) The method DataLayout::getGEPIndexForOffset(Type &ElemTy, APInt &Offset) allows to generate GEP indices for a given byte-based offset. This allows to generate "natural" GEPs using the given type structure if the byte offset happens to match a nested element object. With opaque pointers and a general move towards byte-based GEPs [1], this function may be questionable in the future. This patch avoids creation of GEPs into vectors in routines that use DataLayout::getGEPIndexForOffset by not returning indices in that case. The reason is that A) GEPs into vectors have been discouraged for a long time [2], and B) that GEPs into vectors are currently broken if the element type is overaligned [1]. This is also demonstrated by a lit test where previously InstCombine replaced valid loads by poison. Note that the result of InstCombine on that test is still* invalid, because padding bytes are assumed. Moreover, GEPs into vectors may be outright forbidden in the future [1]. [1]: https://discourse.llvm.org/t/67497 [2]: https://llvm.org/docs/GetElementPtr.html The test case is new. It will be precommitted if this patch is accepted. Differential Revision: https://reviews.llvm.org/D142146	2023-01-23 13:25:39 +01:00
Francis Visoiu Mistrih	da09b35334	[Matrix] Optimize matrix transposes around additions First, sink the transposes to the operands to simplify redundant ones. Then, lift them to reduce the number of realized transposes. ``` (A + B)^T -> A^T + B^T -> (A + B)^T ``` See tests for more examples. Differential Revision: https://reviews.llvm.org/D133657	2023-01-11 15:21:59 -08:00
Paul Walker	eae26b6640	[IRBuilder] Use canonical i64 type for insertelement index used by vector splats. Instcombine prefers this canonical form (see getPreferredVectorIndex), as does IRBuilder when passing the index as an integer so we may as well use the preferred form from creation. NOTE: All test changes are mechanical with nothing else expected beyond a change of index type from i32 to i64. Differential Revision: https://reviews.llvm.org/D140983	2023-01-11 14:08:06 +00:00
Matt Arsenault	256d5ad3e8	LowerMatrixIntrinsics: Convert tests to opaque pointers store-align-volatile.ll needed manually updated check lines for a -NEXT check after a deleted bitcast. Also avoided breaking the example C++ comment in remarks-inlining.ll	2022-11-27 21:42:25 -05:00
Nikita Popov	304f1d59ca	[IR] Switch everything to use memory attribute This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly attributes are dropped. The readnone, readonly and writeonly attributes are restricted to parameters only. The old attributes are auto-upgraded both in bitcode and IR. The bitcode upgrade is a policy requirement that has to be retained indefinitely. The IR upgrade is mainly there so it's not necessary to update all tests using memory attributes in this patch, which is already large enough. We could drop that part after migrating tests, or retain it longer term, to make it easier to import IR from older LLVM versions. High-level Function/CallBase APIs like doesNotAccessMemory() or setDoesNotAccessMemory() are mapped transparently to the memory attribute. Code that directly manipulates attributes (e.g. via AttributeList) on the other hand needs to switch to working with the memory attribute instead. Differential Revision: https://reviews.llvm.org/D135780	2022-11-04 10:21:38 +01:00
Arthur Eubanks	c384b20b55	[opt] Remove temporary legacy pass name translations And update corresponding tests.	2022-10-07 11:09:46 -07:00
Francis Visoiu Mistrih	0fcc99ade4	[Matrix] Add tests for addition transpose optimizations Tests before transpose optimizations around additions. Differential Revision: https://reviews.llvm.org/D133656	2022-09-26 13:27:03 -07:00
Francis Visoiu Mistrih	81bdb4068d	[Matrix] Simplify matmuls with scalars If one of the operands is a transposed splat, the transpose can be removed. This is useful to simplify when transposes are distributed to operands of a matmul: * k^T -> k * (A * k)^t -> A^t * k Differential Revision: https://reviews.llvm.org/D130177	2022-09-02 15:50:25 -07:00
Vir Narula	625877b0ef	[Matrix] Add tests dot product with varied strides Add more tests with varied strides. Changes to lowering upcoming in https://reviews.llvm.org/D131125 Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D131444	2022-08-11 19:09:21 +01:00
Nuno Lopes	022bd92c78	[LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]	2022-07-03 12:32:19 +01:00
Nuno Lopes	7c4f45f87a	Revert [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] This reverts commits 47e6f98f84ac3 and 3e701bcd2a6aee2	2022-07-01 23:53:41 +01:00
Nuno Lopes	3e701bcd2a	attempt to fix aarch64 build bot	2022-07-01 23:43:48 +01:00
Nuno Lopes	47e6f98f84	[LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]	2022-07-01 23:31:31 +01:00
Florian Hahn	7c0089d735	[Matrix] Check if iterator is at beginning of BB in optimizeTranspose. If an instruction at the beginning of a block is erased, this may trigger crash due to dereferencing an invalid iterator. Check if II is at the end before dereferencing it. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D127736	2022-06-14 21:37:02 +01:00
Vir Narula	210c851327	[Matrix] Add dot product tests LLVM LIT tests for our upcoming dot product lowering change Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D126942	2022-06-03 20:02:42 +01:00
Johannes Doerfert	a81fff8afd	Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes" This reverts commit c5f789050daab25aad6770790987e2b7c0395936 and reapplies 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 with additional test changes.	2022-03-25 09:36:50 -05:00
Andrew Wei	0af3e6a22d	[InstCombine] Sink instructions with multiple users in a successor block. This patch tries to sink instructions when they are only used in a successor block. This is a further enhancement patch based on Anna's commit: D109700, which allows sinking an instruction having multiple uses in a single user. In this patch, sink instructions with multiple users in a single successor block will be supported. It could fix a known issue from rust: https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610 Reviewed By: nikic, reames Differential Revision: https://reviews.llvm.org/D121585	2022-03-18 11:53:45 +08:00
Arthur Eubanks	dec9be85cc	[test][LowerMatrixIntrinsics] Use new PM RUN lines	2022-03-08 13:39:18 -08:00
Florian Hahn	b339bbdb19	[Matrix] Use ArrayType for allocas instead of VectorType. When creating an alloca to copy a matrix due to memory conflicts, those allocas used to use VectorTypes, which forced them to have huge alignments for large vectors. This patch updates LowerMatrixIntrinsics to use a corresponding array type, like Clang already does, to get more manageable alignments. Reviewed By: anemet, thegameg Differential Revision: https://reviews.llvm.org/D118239	2022-01-28 10:47:52 +00:00

106 Commits