llvm-project

Author	SHA1	Message	Date
Nikita Popov	a5f3415533	[InstCombine] Replace non-demanded undef vector with poison If an operand (esp to shufflevector or insertelement) is not demanded, canonicalize it from undef to poison.	2023-12-18 16:12:37 +01:00
Nikita Popov	d0605e21af	[InstCombine] Canonicalize splat shuffles to use poison operand If the splat shuffle is represented using an undef RHS, replace it with poison.	2023-12-18 15:57:49 +01:00
Dmitriy Smirnov	e13bed4c5f	[PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP This patch tries to canonicalise add + gep to gep + gep. Co-authored-by: Paul Walker <paul.walker@arm.com> Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D155688	2023-10-06 12:29:06 +01:00
David Green	2a859b2014	[AArch64] Change the cost of vector insert/extract to 2 The cost of vector instructions has always been high under AArch64, in order to add a high cost for inserts/extracts, shuffles and scalarization. This is a conservative approach to limit the scope of unusual SLP vectorization where the codegen ends up being quite poor, but has always been higher than the correct costs would be for any specific core. This relaxes that, reducing the vector insert/extract cost from 3 to 2. It is a generalization of D142359 to all AArch64 cpus. The ScalarizationOverhead is also overridden for integer vector at the same time, to remove the effect of lane 0 being considered free for integer vectors (something that should only be true for float when scalarizing). The lower insert/extract cost will reduce the cost of insert, extracts, shuffling and scalarization. The adjustments of ScalarizationOverhead will increase the cost on integer, especially for small vectors. The end result will be lower cost for float and long-integer types, some higher cost for some smaller vectors. This, along with the raw insert/extract cost being lower, will generally mean more vectorization from the Loop and SLP vectorizer. We may end up regretting this, as that vectorization is not always profitable. In all the benchmarking I have done this is generally an improvement in the overall performance, and I've attempted to address the places where it wasn't with other costmodel adjustments. Differential Revision: https://reviews.llvm.org/D155459	2023-07-28 21:26:50 +01:00
Nikita Popov	bc39a7a5e4	[LowerMatrixIntrinsics] Fix test expectations (NFC) Some of the test expectations were incorrectly changed in 23c21759458014fc4d7cbea45b6fbe7349a0a4fd. Regenerate the tests.	2023-07-18 11:21:11 +02:00
Nuno Lopes	23c2175945	[LowerMatrixIntrinsics] Use poison instead of undef as placeholder [NFC] These values don't propagate to the output; they are always replaced with a subsequent shuffle or insertelement. Tested equivalence with Alive2, e.g., https://alive2.llvm.org/ce/z/fj4s78.	2023-07-18 09:54:41 +01:00
Florian Hahn	c10a7772bd	[Matrix] Convert binop operand of dot product to a row vector. The dot product lowering will use the left operand as row vector. If the operand is a binary op, convert it to operate on a row vector instead of a column vector. Depends on D148428. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D148429	2023-06-07 20:45:08 +01:00
Florian Hahn	ebbcbb2af5	[Matrix] Remove redundant transpose with dot product lowering. Extend dot-product handling to skip transposes of the first operand. As this is a vector, the conversion between column and row vector via the transpose isn't needed. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D148428	2023-05-14 22:07:38 +01:00
Florian Hahn	0e8717f711	[Matrix] Add shape verification. At the moment, lower-matrix-intrinsics accepts mis-matches between shapes for operations. See shape-verification.ll for an example where @llvm.matrix.column.major.load specifies 6x1 and then the use (@llvm.matrix.multiply) specifies the operand to have 1x6. This patch adds verification for shapes to check if shapes match. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D147438	2023-05-13 09:41:27 +01:00
ManuelJBrito	8b56da5e9f	[IR] Change shufflevector undef mask to poison With this patch an undefined mask in a shufflevector will be printed as poison. This change is done to support the new shufflevector semantics for undefined mask elements. Differential Revision: https://reviews.llvm.org/D149210	2023-04-27 14:41:10 +01:00
Florian Hahn	f10153fe91	[Matrix] Handle integer types when distributing transposes across adds. The current code did not properly account for integer matrixes. Check if the operands are floating point or integer matrixes and use FAdd/Add accordingly. This is already done for other cases, like multiplies. Fixes #62281.	2023-04-21 16:35:11 +01:00
Florian Hahn	a25b962a7f	[Matrix] Split off transpose + dot product tests.	2023-04-15 14:06:47 +01:00
Florian Hahn	98e50881e9	[Matrix] Refine cost estimate for dot-product. Adjust lowerDotProduct cost estimate to include the cost benefits of: * emitting a wide load * emitting a wide multiply. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D147330	2023-04-14 11:35:01 +01:00
Florian Hahn	677b0d33e3	[Matrix] Add dot product tests with builtin loads with variable strides Extra tests for D147330.	2023-04-14 10:40:47 +01:00
Florian Hahn	e6ab86a887	[Matrix] Fix IsSupported check in lowerDotProduct. The check incorrectly checks the RHS while LHS is transformed later. Update to check LHS, which fixes a crash in the newly added test cases.	2023-04-13 19:00:30 +01:00
Florian Hahn	78148eba49	[Matrix] Fix crash during dot product lowering. Perform dot-product lowering before instruction fusion to avoid crash in newly added test. Also update lowerDotProduct to properly mark optimized matmul as fused.	2023-04-12 15:08:39 +01:00
Florian Hahn	04681243b4	[Matrix] Limit dot lowering to column major matrixes. Limit dot product lowering to column major matrixes for now. This simplifies the code and reasoning for upcoming planned improvements. Support for row-major matrixes can be added later as an extension.	2023-04-05 15:49:06 +01:00
Florian Hahn	17fc38889a	[Matrix] Add dotproduct tests with row-major default layout.	2023-04-05 15:19:11 +01:00
Florian Hahn	2f21659ee9	[Matrix] Add test variants where 2nd operand of dotprod is add/sub.	2023-04-05 15:04:05 +01:00
Florian Hahn	c0dbe85790	[Matrix] Fix shapes in dot product tests. The shape arguments for the @llvm.matrix.column.major.load were incorrect. Flip them so they are in sync with the shape of the multiplications.	2023-04-03 12:50:05 +01:00
Vir Narula	e7281c6f61	[Matrix] Add special case dot product lowering Add special case to matrix lowering for dot products. Normal matrix lowering is optimized for either row-major or column-major, which results in many `shufflevector` instructions being generated for one vector. We work around this in our special case. We can also use vector-reduce adds instead of sequential adds to sum the result of the element-wise multiplication, which takes advantage of SIMD instructions. Reviewed By: fhahn, thegameg Differential Revision: https://reviews.llvm.org/D131125	2023-03-31 12:40:20 +01:00
Florian Hahn	16a008bbde	[Matrix] Update most dot tests using vXi64 to vXi32. Update dot-product-int.ll tests to use mostly i32 instead of i64; there's no mul.2d instruction, so vector versions of v2i64 cannot be lowered efficiently.	2023-03-31 12:32:41 +01:00
Florian Hahn	22ebb49b9f	[Matrix] Extend test coverage for dot product lowering. Extra tests: * result is used by instruction * constant vector operands * multiply fed by other math instructions * extra test with larger stride	2023-03-25 21:30:20 +00:00
Florian Hahn	18353d221d	[Matrix] Split up dot product tests into integer and float variants. To avoid the individual files getting too big with further additions.	2023-03-25 21:23:01 +00:00
Jannik Silvanus	a4753f5dc0	[IR] Avoid creation of GEPs into vectors (in one place) The method DataLayout::getGEPIndexForOffset(Type &ElemTy, APInt &Offset) allows to generate GEP indices for a given byte-based offset. This allows to generate "natural" GEPs using the given type structure if the byte offset happens to match a nested element object. With opaque pointers and a general move towards byte-based GEPs [1], this function may be questionable in the future. This patch avoids creation of GEPs into vectors in routines that use DataLayout::getGEPIndexForOffset by not returning indices in that case. The reason is that A) GEPs into vectors have been discouraged for a long time [2], and B) that GEPs into vectors are currently broken if the element type is overaligned [1]. This is also demonstrated by a lit test where previously InstCombine replaced valid loads by poison. Note that the result of InstCombine on that test is still* invalid, because padding bytes are assumed. Moreover, GEPs into vectors may be outright forbidden in the future [1]. [1]: https://discourse.llvm.org/t/67497 [2]: https://llvm.org/docs/GetElementPtr.html The test case is new. It will be precommitted if this patch is accepted. Differential Revision: https://reviews.llvm.org/D142146	2023-01-23 13:25:39 +01:00
Francis Visoiu Mistrih	da09b35334	[Matrix] Optimize matrix transposes around additions First, sink the transposes to the operands to simplify redundant ones. Then, lift them to reduce the number of realized transposes. ``` (A + B)^T -> A^T + B^T -> (A + B)^T ``` See tests for more examples. Differential Revision: https://reviews.llvm.org/D133657	2023-01-11 15:21:59 -08:00
Paul Walker	eae26b6640	[IRBuilder] Use canonical i64 type for insertelement index used by vector splats. Instcombine prefers this canonical form (see getPreferredVectorIndex), as does IRBuilder when passing the index as an integer so we may as well use the preferred form from creation. NOTE: All test changes are mechanical with nothing else expected beyond a change of index type from i32 to i64. Differential Revision: https://reviews.llvm.org/D140983	2023-01-11 14:08:06 +00:00
Matt Arsenault	256d5ad3e8	LowerMatrixIntrinsics: Convert tests to opaque pointers store-align-volatile.ll needed manually updated check lines for a -NEXT check after a deleted bitcast. Also avoided breaking the example C++ comment in remarks-inlining.ll	2022-11-27 21:42:25 -05:00
Nikita Popov	304f1d59ca	[IR] Switch everything to use memory attribute This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly attributes are dropped. The readnone, readonly and writeonly attributes are restricted to parameters only. The old attributes are auto-upgraded both in bitcode and IR. The bitcode upgrade is a policy requirement that has to be retained indefinitely. The IR upgrade is mainly there so it's not necessary to update all tests using memory attributes in this patch, which is already large enough. We could drop that part after migrating tests, or retain it longer term, to make it easier to import IR from older LLVM versions. High-level Function/CallBase APIs like doesNotAccessMemory() or setDoesNotAccessMemory() are mapped transparently to the memory attribute. Code that directly manipulates attributes (e.g. via AttributeList) on the other hand needs to switch to working with the memory attribute instead. Differential Revision: https://reviews.llvm.org/D135780	2022-11-04 10:21:38 +01:00
Arthur Eubanks	c384b20b55	[opt] Remove temporary legacy pass name translations And update corresponding tests.	2022-10-07 11:09:46 -07:00
Francis Visoiu Mistrih	0fcc99ade4	[Matrix] Add tests for addition transpose optimizations Tests before transpose optimizations around additions. Differential Revision: https://reviews.llvm.org/D133656	2022-09-26 13:27:03 -07:00
Francis Visoiu Mistrih	81bdb4068d	[Matrix] Simplify matmuls with scalars If one of the operands is a transposed splat, the transpose can be removed. This is useful to simplify when transposes are distributed to operands of a matmul: * k^T -> k * (A * k)^t -> A^t * k Differential Revision: https://reviews.llvm.org/D130177	2022-09-02 15:50:25 -07:00
Vir Narula	625877b0ef	[Matrix] Add tests dot product with varied strides Add more tests with varied strides. Changes to lowering upcoming in https://reviews.llvm.org/D131125 Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D131444	2022-08-11 19:09:21 +01:00
Nuno Lopes	022bd92c78	[LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]	2022-07-03 12:32:19 +01:00
Nuno Lopes	7c4f45f87a	Revert [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] This reverts commits 47e6f98f84ac3 and 3e701bcd2a6aee2	2022-07-01 23:53:41 +01:00
Nuno Lopes	3e701bcd2a	attempt to fix aarch64 build bot	2022-07-01 23:43:48 +01:00
Nuno Lopes	47e6f98f84	[LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]	2022-07-01 23:31:31 +01:00
Florian Hahn	7c0089d735	[Matrix] Check if iterator is at beginning of BB in optimizeTranspose. If an instruction at the beginning of a block is erased, this may trigger crash due to dereferencing an invalid iterator. Check if II is at the end before dereferencing it. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D127736	2022-06-14 21:37:02 +01:00
Vir Narula	210c851327	[Matrix] Add dot product tests LLVM LIT tests for our upcoming dot product lowering change Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D126942	2022-06-03 20:02:42 +01:00
Johannes Doerfert	a81fff8afd	Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes" This reverts commit c5f789050daab25aad6770790987e2b7c0395936 and reapplies 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 with additional test changes.	2022-03-25 09:36:50 -05:00
Andrew Wei	0af3e6a22d	[InstCombine] Sink instructions with multiple users in a successor block. This patch tries to sink instructions when they are only used in a successor block. This is a further enhancement patch based on Anna's commit: D109700, which allows sinking an instruction having multiple uses in a single user. In this patch, sink instructions with multiple users in a single successor block will be supported. It could fix a known issue from rust: https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610 Reviewed By: nikic, reames Differential Revision: https://reviews.llvm.org/D121585	2022-03-18 11:53:45 +08:00
Arthur Eubanks	dec9be85cc	[test][LowerMatrixIntrinsics] Use new PM RUN lines	2022-03-08 13:39:18 -08:00
Florian Hahn	b339bbdb19	[Matrix] Use ArrayType for allocas instead of VectorType. When creating an alloca to copy a matrix due to memory conflicts, those allocas used to use VectorTypes, which forced them to have huge alignments for large vectors. This patch updates LowerMatrixIntrinsics to use a corresponding array type, like Clang already does, to get more manageable alignments. Reviewed By: anemet, thegameg Differential Revision: https://reviews.llvm.org/D118239	2022-01-28 10:47:52 +00:00
Nikita Popov	80110aafa0	[Tests] Fix incorrect noalias metadata Mostly this fixes cases where !noalias or !alias.scope were passed a scope rather than a scope list. In some cases I opted to drop the metadata entirely instead, because it is not really relevant to the test.	2021-09-18 20:51:00 +02:00
Bjorn Pettersson	d52f506192	[NewPM] Use parameterized syntax for a couple of more passes A couple of passes that are parameterized in new-PM used different pass names (in cmd line interface) while using the same pass class name. This patch updates the PassRegistry to model pass parameters more properly using PASS_WITH_PARAMS. Reason for the change is to ensure that we have a 1-1 mapping between class name and pass name (when disregarding the params). With a 1-1 mapping it is more obvious which pass name to use in options such as -debug-only, -print-after etc. The opt -passes syntax is changed for the following passes: early-cse-memssa => early-cse<memssa> post-inline-ee-instrument => ee-instrument<post-inline> loop-extract-single => loop-extract<single> lower-matrix-intrinsics-minimal => lower-matrix-intrinsics<minimal> This patch is not updating pass names in docs/Passes.rst. Not quite sure what the status is for that document (e.g. when it comes to listing pass parameters). It is only loop-extract-single that is mentioned in Passes.rst today, out of the passes mentioned above. Differential Revision: https://reviews.llvm.org/D108362	2021-08-20 14:59:21 +02:00
Florian Hahn	f999312872	Recommit "[Matrix] Overload stride arg in matrix.columnwise.load/store." This reverts the revert 28c04794df74ad3c38155a244729d1f8d57b9400. The failing MLIR test that caused the revert should be fixed in this version. Also includes a PPC test fix previously in 1f87c7c478a6.	2021-08-12 18:31:57 +01:00
Mehdi Amini	28c04794df	Revert "[Matrix] Overload stride arg in matrix.columnwise.load/store." This reverts commit a1ef81de35a4bac6d3b22e9d7186d880124d7a55. Broke the MLIR buildbot.	2021-08-12 11:57:19 +00:00
Florian Hahn	a1ef81de35	[Matrix] Overload stride arg in matrix.columnwise.load/store. This patch adjusts the intrinsics definition of llvm.matrix.column.major.load and llvm.matrix.column.major.store to allow overloading the type of the stride. The bitwidth of the stride is used to perform the offset computation. This fixes a crash when using __builtin_matrix_column_major_load or __builtin_matrix_column_major_store on 32 bit platforms. The stride argument of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit platforms. Note that we still perform offset computations with 64 bit width on 32 bit platforms for accesses that do not take a user-specified stride. This can be fixed separately. Fixes PR51304. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D107349	2021-08-12 10:45:25 +01:00
Adam Nemet	d87d3615f7	[Matrix] Fix shape for factored transpose The shape of the input is C x R. Differential Revision: https://reviews.llvm.org/D106722	2021-07-27 11:36:13 -07:00
Adam Nemet	bf7eb48454	[Matrix] RAUW should only replace an instruction in ShapeMap if supportsShapeInfo As an instruction is replaced in optimizeTransposes, RAUW will replace it in the ShapeMap (ShapeMap is ValueMap so that uses are updated). In finalizeLowering however we skip updating uses if they are in the ShapeMap since they will be lowered separately at which point we pick up the lowered operands. In the testcase what happened was that since we replaced the doubled-transpose with the shuffle, it ended up in the ShapeMap. As we lowered the columnwise-load the use in the shuffle was not updated. Then as we removed the original columnwise-load we changed that to an undef. I.e. we ended up with: ``` %shuf = shufflevector <8 x double> undef, <8 x double> poison, <6 x i32> <i32 0, i32 1, i32 2, i32 4, i32 5, i32 6> ``` Besides the fix itself, I have fortified this last bit. As we change uses to undef when removing instruction we track the undefed instruction to make sure we eventually remove those too. This would have caught the issue at compile time. Differential Revision: https://reviews.llvm.org/D106714	2021-07-27 11:36:13 -07:00

99 Commits