llvm-project

Author	SHA1	Message	Date
Sergei Grechanik	fd2b08969b	[mlir][Vector] Lowering of transfer_read/write to vector.load/store This patch introduces progressive lowering patterns for rewriting vector.transfer_read/write to vector.load/store and vector.broadcast in certain supported cases. Reviewed By: dcaballe, nicolasvasilache Differential Revision: https://reviews.llvm.org/D97822	2021-03-11 18:17:51 -08:00
Alexander Belyaev	a89035d750	Revert "[MLIR] Create memref dialect and move several dialect-specific ops from std." This commit introduced a cyclic dependency: Memref dialect depends on Standard because it used ConstantIndexOp. Std depends on the MemRef dialect in its EDSC/Intrinsics.h Working on a fix. This reverts commit 8aa6c3765b924d86f623d452777eb76b83bf2787.	2021-02-18 12:49:52 +01:00
Julian Gross	8aa6c3765b	[MLIR] Create memref dialect and move several dialect-specific ops from std. Create the memref dialect and move several dialect-specific ops without dependencies to other ops from std dialect to this dialect. Moved ops: AllocOp -> MemRef_AllocOp AllocaOp -> MemRef_AllocaOp DeallocOp -> MemRef_DeallocOp MemRefCastOp -> MemRef_CastOp GetGlobalMemRefOp -> MemRef_GetGlobalOp GlobalMemRefOp -> MemRef_GlobalOp PrefetchOp -> MemRef_PrefetchOp ReshapeOp -> MemRef_ReshapeOp StoreOp -> MemRef_StoreOp TransposeOp -> MemRef_TransposeOp ViewOp -> MemRef_ViewOp The roadmap to split the memref dialect from std is discussed here: https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667 Differential Revision: https://reviews.llvm.org/D96425	2021-02-18 11:29:39 +01:00
Thomas Raoux	397336dcab	[mlir][vector] Add missing support for contract of integer lowering. Some of the lowering of vector.contract didn't support integer case. Since reduction of integer cannot accumulate we always break up the reduction op, it should be merged by a separate canonicalization if possible. Differential Revision: https://reviews.llvm.org/D96461	2021-02-16 07:13:30 -08:00
Lei Zhang	cb1a42359b	[mlir][vector] Move splitting transfer ops into a separate entry point These patterns unroll transfer read/write ops if the vector consumers/producers are extract/insert slices ops. Transfer ops can map to hardware load/store functionalities, where the vector size matters for bandwidth considerations. So these patterns should be collected separately, instead of being generic canonicalization patterns. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D96782	2021-02-16 10:04:34 -05:00
Praveen Narayanan	a65fb1916c	Add a "kind" attribute to ContractionOp and OuterProductOp. Currently, vector.contract joins the intermediate result and the accumulator argument (of ranks K) using summation. We desire more joining operations --- such as max --- to help vector.contract express reductions. This change extends Vector_ContractionOp to take an optional attribute (called "kind", of enum type CombiningKind) specifying the joining operation to be add/mul/min/max for int/fp , and and/or/xor for int only. By default this attribute has value "add". To implement this we also need to extend vector.outerproduct, since vector.contract gets transformed to vector.outerproduct (and that to vector.fma). The extension for vector.outerproduct is also an optional kind attribute that uses the same enum type and possible values. The default is "add". In case of max/min we transform vector.outerproduct to a combination of compare and select. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D93280	2021-02-12 20:23:59 +00:00
Uday Bondhugula	fdfd647837	[MLIR] NFC Fix vector transforms build warnings Fix build warnings from VectorTransforms.cpp.	2021-02-10 10:42:56 +05:30
Lei Zhang	7630520ae3	[mlir][vector] Add pattern to shuffle bitcast ops These patterns move vector.bitcast ops to be before insert ops or after extract ops where suitable. With them, bitcast will happen on smaller vectors and there are more chances to share extract/insert ops. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D96040	2021-02-05 17:52:49 -05:00
Lei Zhang	874ce9b80f	[mlir][vector] Add patterns to cast away leading 1-dim This patch adds patterns to use vector.shape_cast to cast away leading 1-dimensions from a few vector operations. It allows exposing more canonical forms of vector.transfer_read, vector.transfer_write, vector.extract_strided_slice, and vector.insert_strided_slice. With this, we have more opportunities to cancel extract/insert ops or forward write/read ops. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D95873	2021-02-05 09:02:15 -05:00
Nicolas Vasilache	05d5125d8a	[mlir] Generalize OpFoldResult usage in ops with offsets, sizes and operands. This revision starts evolving the APIs to manipulate ops with offsets, sizes and operands towards a ValueOrAttr abstraction that is already used in folding under the name OpFoldResult. The objective, in the future, is to allow such manipulations all the way to the level of ODS to avoid all the genuflexions involved in distinguishing between values and attributes for generic constant foldings. Once this evolution is accepted, the next step will be a mechanical OpFoldResult -> ValueOrAttr. Differential Revision: https://reviews.llvm.org/D95310	2021-01-25 14:17:03 +00:00
Thomas Raoux	f9190c8681	[mlir][vector] Support unrolling for transfer ops using tensors Differential Revision: https://reviews.llvm.org/D93904	2021-01-06 13:28:04 -08:00
Chris Lattner	9eb3e564d3	[ODS] Make the getType() method on a OneResult instruction return a specific type. Implement Bug 46698, making ODS synthesize a getType() method that returns a specific C++ class for OneResult methods where we know that class. This eliminates a common source of casts in things like: myOp.getType().cast<FIRRTLType>().getPassive() because we know that myOp always returns a FIRRTLType. This also encourages op authors to type their results more tightly (which is also good for verification). I chose to implement this by splitting the OneResult trait into itself plus a OneTypedResult trait, given that many things are using `hasTrait<OneResult>` to conditionalize various logic. While this change makes many, many ops get more specific getType() results, it is generally drop-in compatible with the previous behavior because 'x.cast<T>()' is allowed when x is already known to be a T. The one exception to this is that we need declarations of the types used by ops, which is why a couple headers needed additional #includes. I updated a few things in tree to remove the now-redundant `.cast<>`'s, but there are probably many more that can be removed. Differential Revision: https://reviews.llvm.org/D93790	2020-12-26 13:52:40 -08:00
Thomas Raoux	7c7b55b985	[mlir][vector] Extend vector unroll to all element-wise ops Extend unroll to support all element-wise ops and allow unrolling for ops with vector operands with the same shape as the destination but different element type (like Cmp or Select). Differential Revision: https://reviews.llvm.org/D93121	2020-12-21 13:31:22 -08:00
Thomas Raoux	26c8f9081b	[mlir][vector] Extend Transfer read/write ops to support tensor types. Transfer_ops can now work on both buffers and tensors. Lowering of the tensor case is not supported yet. Differential Revision: https://reviews.llvm.org/D93500	2020-12-21 08:55:04 -08:00
River Riddle	1b97cdf885	[mlir][IR][NFC] Move context/location parameters of builtin Type::get methods to the start of the parameter list This better matches the rest of the infrastructure, is much simpler, and makes it easier to move these types to being declaratively specified. Differential Revision: https://reviews.llvm.org/D93432	2020-12-17 13:01:36 -08:00
Christian Sigg	1ffc1aaa09	[mlir] Use mlir::OpState::operator->() to get to methods of mlir::Operation. This is a preparation step to remove those methods from OpState. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D93098	2020-12-13 09:58:16 +01:00
Christian Sigg	0bf4a82a5a	[mlir] Use mlir::OpState::operator->() to get to methods of mlir::Operation. This is a preparation step to remove the corresponding methods from OpState. Reviewed By: silvas, rriddle Differential Revision: https://reviews.llvm.org/D92878	2020-12-09 12:11:32 +01:00
Christian Sigg	c4a0405902	Add `Operation* OpState::operator->()` to provide more convenient access to members of Operation. Given that OpState already implicitly converts to Operation*, this seems reasonable. The alternative would be to add more functions to OpState which forward to Operation. Reviewed By: rriddle, ftynse Differential Revision: https://reviews.llvm.org/D92266	2020-12-02 15:46:20 +01:00
River Riddle	65fcddff24	[mlir][BuiltinDialect] Resolve comments from D91571 * Move ops to a BuiltinOps.h * Add file comments	2020-11-19 11:12:49 -08:00
River Riddle	73ca690df8	[mlir][NFC] Remove references to Module.h and Function.h These includes have been deprecated in favor of BuiltinDialect.h, which contains the definitions of ModuleOp and FuncOp. Differential Revision: https://reviews.llvm.org/D91572	2020-11-17 00:55:47 -08:00
Aart Bik	9ddb464d37	[mlir] refactor common idiom into AffineMap method motivated by a refactoring in the new sparse code (yet to be merged), this avoids some lengthy code dup Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D91465	2020-11-13 19:18:13 -08:00
Thomas Raoux	6ad31c0f4a	[mlir][vector] Support N-D vector in InsertMap/ExtractMap op Support multi-dimension vector for InsertMap/ExtractMap op and update the transformations. Currently the relation between IDs and dimension is implicitly deduced from the types. We can then calculate an AffineMap based on it. In the future the AffineMap could be part of the operation itself. Differential Revision: https://reviews.llvm.org/D90995	2020-11-13 12:40:17 -08:00
Thomas Raoux	5d45f758f0	[mlir][vector] Improve vector distribute integration test and fix block distribution Fix semantic in the distribute integration test based on offline feedback. This exposed a bug in block distribution, we need to make sure the id is multiplied by the stride of the vector. Fix the transformation and unit test. Differential Revision: https://reviews.llvm.org/D89291	2020-10-29 14:54:53 -07:00
Kazuaki Ishizaki	41b09f4eff	[mlir] NFC: fix trivial typos fix typos in comments and documents Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D90089	2020-10-29 04:05:22 +09:00
Thomas Raoux	bd07be4f3f	[mlir][vector] Update doc strings for insert_map/extract_map and fix insert_map semantic Based on discourse discussion, fix the doc string and remove examples with wrong semantic. Also fix insert_map semantic by adding missing operand for vector we are inserting into. Differential Revision: https://reviews.llvm.org/D89563	2020-10-26 10:47:01 -07:00
Thomas Raoux	edbdea7466	[mlir][vector] Add unrolling patterns for Transfer read/write Adding unroll support for transfer read and transfer write operation. This allows picking the ideal size for the memory access for a given target. Differential Revision: https://reviews.llvm.org/D89289	2020-10-15 15:17:36 -07:00
Thomas Raoux	cf402a1987	[mlir][vector] Add unit test for vector distribute by block When distributing a vector larger than the given multiplicity, we can distribute it by block where each id gets a chunk of consecutive elements along the dimension distributed. This adds a test for this case and adds extra checks to make sure we don't distribute for cases not multiple of multiplicity. Differential Revision: https://reviews.llvm.org/D89061	2020-10-08 14:44:03 -07:00
Thomas Raoux	d1c8e179d8	[mlir][vector] Add canonicalization patterns for extractMap/insertMap Add basic canonicalization patterns for the extractMap/insertMap to allow them to be folded into Transfer ops. Also mark transferRead as memory read so that it can be removed by dead code. Differential Revision: https://reviews.llvm.org/D88622	2020-10-02 10:13:11 -07:00
Thomas Raoux	dd14e58252	[mlir][vector] First step of vector distribution transformation This is the first of several steps to support distributing large vectors. This adds instructions extract_map and insert_map that allow us to do incremental lowering. Right now the transformation only applies to simple pointwise operations with a vector size matching the multiplicity of the IDs used to distribute the vector. This can be used to distribute large vectors to loops or SPMD. Differential Revision: https://reviews.llvm.org/D88341	2020-09-30 13:14:55 -07:00
Jakub Lichman	14088a6f5d	[mlir] Added support for rank reducing subviews This commit adds support for subviews that can reduce the resulting rank by dropping static dimensions of size 1. Differential Revision: https://reviews.llvm.org/D88534	2020-09-30 11:15:18 +00:00
aartbik	060c9dd1cc	[mlir] [VectorOps] Improve SIMD compares with narrower indices When allowed, use 32-bit indices rather than 64-bit indices in the SIMD computation of masks. This runs up to 2x and 4x faster on a number of AVX2 and AVX512 microbenchmarks. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D87116	2020-09-03 21:43:38 -07:00
Nicolas Vasilache	3f906c54a2	[mlir][Vector] Add 2-D vector contract lowering to ReduceOp This new pattern mixes vector.transpose and direct lowering to vector.reduce. This allows more progressive lowering than immediately going to insert/extract and composes more nicely with other canonicalizations. This has 2 use cases: 1. for very wide vectors the generated IR may be much smaller 2. when we have a custom lowering for transpose ops we can target it directly rather than rely on LLVM Differential Revision: https://reviews.llvm.org/D85428	2020-08-07 06:17:48 -04:00
Nicolas Vasilache	1353cbc257	[mlir][Vector] NFC - Use matchAndRewrite in ContractionOp lowering patterns Replace the use of separate match and rewrite which unnecessarily duplicates logic. Differential Revision: https://reviews.llvm.org/D85421	2020-08-06 09:02:25 -04:00
Nicolas Vasilache	2d0b05969b	[mlir][Vector] Relax condition for `splitFullAndPartialTransferPrecondition` The `splitFullAndPartialTransferPrecondition` has a restrictive condition to prevent the pattern to be applied recursively if it is nested under an scf.IfOp. Relaxing the condition to the immediate parent op must not be an scf.IfOp lets the pattern be applied more generally while still preventing recursion. Differential Revision: https://reviews.llvm.org/D85209	2020-08-04 10:06:21 -04:00
Nicolas Vasilache	1a4263d394	[mlir][Vector] Add linalg.copy-based pattern for splitting vector.transfer_read into full and partial copies. This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling: ``` %1:3 = scf.if (%inBounds) { scf.yield %view : memref<A...>, index, index } else { %2 = linalg.fill(%extra_alloc, %pad) %3 = subview %view [...][...][...] linalg.copy(%3, %alloc) memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 : memref<A...>, index, index } %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]} ``` where `extra_alloc` is a top of the function alloca'ed buffer of one vector. This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer. The extra work only occurs on the boundary tiles.	2020-08-04 08:46:08 -04:00
Nicolas Vasilache	d313e9c12e	[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies. This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling: ``` %1:3 = scf.if (%inBounds) { scf.yield %view : memref<A...>, index, index } else { %2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...> %3 = vector.type_cast %extra_alloc : memref<...> to memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 = memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 : memref<A...>, index, index } %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]} ``` where `extra_alloc` is a top of the function alloca'ed buffer of one vector. This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer. The extra work only occurs on the boundary tiles. Differential Revision: https://reviews.llvm.org/D84631	2020-08-03 12:58:18 -04:00
Mehdi Amini	7ba82a7320	Revert "[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies." This reverts commit 35b65be041127db9fe23d3128a004c888893cbae. Build is broken with -DBUILD_SHARED_LIBS=ON with some undefined references like: VectorTransforms.cpp:(.text._ZN4llvm12function_refIFvllEE11callback_fnIZL24createScopedInBoundsCondN4mlir25VectorTransferOpInterfaceEE3$_8EEvlll+0xa5): undefined reference to `mlir::edsc::op::operator+(mlir::Value, mlir::Value)'	2020-08-03 16:16:47 +00:00
Nicolas Vasilache	35b65be041	[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies. This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling: ``` %1:3 = scf.if (%inBounds) { scf.yield %view : memref<A...>, index, index } else { %2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...> %3 = vector.type_cast %extra_alloc : memref<...> to memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 = memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 : memref<A...>, index, index } %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]} ``` where `extra_alloc` is a top of the function alloca'ed buffer of one vector. This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer. The extra work only occurs on the boundary tiles. Differential Revision: https://reviews.llvm.org/D84631	2020-08-03 04:53:43 -04:00
Benjamin Kramer	eb41f9edde	[mlir][Vector] Simplify code a bit. NFCI.	2020-08-01 14:49:19 +02:00
Nicolas Vasilache	47cbd9f922	[mlir][Vector] NFC - Improve VectorInterfaces This revision improves and makes better use of OpInterfaces for the Vector dialect. Differential Revision: https://reviews.llvm.org/D84053	2020-07-20 08:24:22 -04:00
Pierre Oechsel	ec62e37c86	[mlir] [vector] Add an optional filter to vector contract lowering patterns. Summary: Vector contract patterns were only parameterized by a `vectorTransformsOptions`. As a result, even if an mlir file contained several occurrences of `vector.contract`, all of them would be lowered in the same way. More granularity might be required. This Diff adds a `constraint` argument to each of these patterns, which allows the user to specify more precisely which `vector.contract` each lowering should apply to. Differential Revision: https://reviews.llvm.org/D83960	2020-07-17 12:03:13 -04:00
aartbik	365434a584	[mlir] [VectorOps] Merge OUTER/AXPY vector.contract lowering into single case We temporarily had separate OUTER lowering (for matmat flavors) and AXPY lowering (for matvec flavors). With the new generalized "vector.outerproduct" semantics, these cases can be merged into a single lowering method. This refactoring will simplify future decisions on cost models and lowering heuristics. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D83585	2020-07-10 13:11:54 -07:00
aartbik	9bf6354301	[mlir] [VectorOps] Allow AXPY to be expressed as special case of OUTERPRODUCT This specialization allows sharing more code where an AXPY follows naturally in cases where an OUTERPRODUCT on a scalar would be generated. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D83453	2020-07-10 12:23:24 -07:00
Benjamin Kramer	cca4ac523e	[mlir][VectorOps] Lower vector.outerproduct of int vectors vector.fma and mulf don't work on integers. Use a muli/addi pair or plain muli instead. Differential Revision: https://reviews.llvm.org/D83292	2020-07-07 14:40:07 +02:00
River Riddle	9db53a1827	[mlir][NFC] Remove usernames and google bug numbers from TODO comments. These were largely leftover from when MLIR was a google project, and don't really follow LLVM guidelines.	2020-07-07 01:40:52 -07:00
Nicolas Vasilache	05c65dc0fe	[mlir][Vector] Add a VectorUnrollInterface and expose UnrollVectorPattern. The UnrollVectorPattern can be used in a programmable fashion by: ``` OwningRewritePatternList patterns; patterns.insert<UnrollVectorPattern<AddFOp>>(ArrayRef<int64_t>{2, 2}, ctx); patterns.insert<UnrollVectorPattern<vector::ContractionOp>>( ArrayRef<int64_t>{2, 2, 2}, ctx); ... applyPatternsAndFoldGreedily(getFunction(), patterns); ``` Differential revision: https://reviews.llvm.org/D83064	2020-07-06 08:09:06 -04:00
aartbik	ee01c7a740	[mlir] [VectorOps] Add choice between dot and axpy lowering of vector.contract Default vector.contract lowering essentially yields a series of sdot/ddot operations. However, for some layouts a series of saxpy/daxpy operations, chained through fma are more efficient. This CL introduces a choice between the two lowering paths. A default heuristic is to follow. Some preliminary avx2 performance numbers for matrix-times-vector. Here, dot performs best for 64x64 A x b and saxpy for 64x64 A^T x b. ``` ------------------------------------------------------------ A x b A^T x b ------------------------------------------------------------ GFLOPS sdot (reassoc) saxpy sdot (reassoc) saxpy ------------------------------------------------------------ 1x1 0.6 0.9 0.6 0.9 2x2 2.5 3.2 2.4 3.5 4x4 6.4 8.4 4.9 11.8 8x8 11.7 6.1 5.0 29.6 16x16 20.7 10.8 7.3 43.3 32x32 29.3 7.9 6.4 51.8 64x64 38.9 79.3 128x128 32.4 40.7 ------------------------------------------------------------ ``` Reviewed By: nicolasvasilache, ftynse Differential Revision: https://reviews.llvm.org/D83012	2020-07-02 13:21:17 -07:00
aartbik	63b3933d0c	[mlir] [VectorOps] Replace zero fma with mult for vector.contract More efficient implementation of the multiply-reduce pair, no need to add in a zero vector. Microbenchmarking on AVX2 yields the following difference in vector.contract speedup (over strict-order scalar reduction). SPEEDUP SIMD-fma SIMD-mul 4x4 1.45 2.00 8x8 1.40 1.90 32x32 5.32 5.80 Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D82833	2020-06-30 09:04:20 -07:00
aartbik	55d09dfc7b	[mlir] [VectorOps] Improve vector.create_mask lowering Use vector compares for the 1-D case. This approach scales much better than generating insertion operations, and exposes SIMD directly to backend. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D82402	2020-06-23 14:33:41 -07:00
Thomas Raoux	e4bc08f012	[mlir] Allow vector.contract to have mixed types operands Allow lhs and rhs to have different type than accumulator/destination. Some hardware like GPUs support natively operations like uint8xuint8xuint32. Differential Revision: https://reviews.llvm.org/D82069	2020-06-19 17:08:57 -07:00


77 Commits