32 Commits

Author SHA1 Message Date
Nicolas Vasilache
05c65dc0fe [mlir][Vector] Add a VectorUnrollInterface and expose UnrollVectorPattern.
The UnrollVectorPattern is can be used in a programmable fashion by:
```
OwningRewritePatternList patterns;
    patterns.insert<UnrollVectorPattern<AddFOp>>(ArrayRef<int64_t>{2, 2}, ctx);
    patterns.insert<UnrollVectorPattern<vector::ContractionOp>>(
        ArrayRef<int64_t>{2, 2, 2}, ctx);
    ...
    applyPatternsAndFoldGreedily(getFunction(), patterns);
```

Differential revision: https://reviews.llvm.org/D83064
2020-07-06 08:09:06 -04:00
aartbik
ee01c7a740 [mlir] [VectorOps] Add choice between dot and axpy lowering of vector.contract
Default vector.contract lowering essentially yields a series of sdot/ddot
operations. However, for some layouts a series of saxpy/daxpy operations,
chained through fma are more efficient. This CL introduces a choice between
the two lowering paths. A default heuristic is to follow.

Some preliminary avx2 performance numbers for matrix-times-vector.
Here, dot performs best for 64x64 A x b and saxpy for 64x64 A^T x b.

```
------------------------------------------------------------
            A x b                          A^T x b
------------------------------------------------------------
GFLOPS    sdot (reassoc)    saxpy    sdot (reassoc)    saxpy
------------------------------------------------------------
1x1        0.6               0.9       0.6             0.9
2x2        2.5               3.2       2.4             3.5
4x4        6.4               8.4       4.9             11.8
8x8       11.7               6.1       5.0             29.6
16x16     20.7              10.8       7.3             43.3
32x32     29.3               7.9       6.4             51.8
64x64     38.9                                         79.3
128x128   32.4                                         40.7
------------------------------------------------------------
```

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D83012
2020-07-02 13:21:17 -07:00
aartbik
63b3933d0c [mlir] [VectorOps] Replace zero fma with mult for vector.contract
More efficient implementation of the multiply-reduce pair,
no need to add in a zero vector. Microbenchmarking on AVX2
yields the following difference in vector.contract speedup
(over strict-order scalar reduction).

SPEEDUP     SIMD-fma SIMD-mul
4x4	    1.45 	 2.00
8x8	    1.40 	 1.90
32x32    	5.32 	 5.80

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D82833
2020-06-30 09:04:20 -07:00
aartbik
55d09dfc7b [mlir] [VectorOps] Improve vector.create_mask lowering
Use vector compares for the 1-D case. This approach scales much better
than generating insertion operations, and exposes SIMD directly to backend.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D82402
2020-06-23 14:33:41 -07:00
Thomas Raoux
e4bc08f012 [mlir] Allow vector.contract to have mixed types operands
Allow lhs and rhs to have different type than accumulator/destination. Some
hardware like GPUs support natively operations like uint8xuint8xuint32.

Differential Revision: https://reviews.llvm.org/D82069
2020-06-19 17:08:57 -07:00
aartbik
0d82ab7885 [mlir] [VectorOps] Improve vector.constant_mask lowering
Use direct vector constants for the 1-D case. This approach
scales much better than generating elaborate insertion operations
that are eventually folded into a constant. We could of course
generalize the 1-D case to higher ranks, but this simplification
already helps in scaling some microbenchmarks that would formerly
crash on the intermediate IR length.

Reviewed By: reidtatge

Differential Revision: https://reviews.llvm.org/D82144
2020-06-19 10:40:08 -07:00
aartbik
1e45b55dcc [mlir] [VectorOps] Handle 'vector.shape_cast' lowering for all cases
Summary:
Even though this operation is intended for 1d/2d conversions currently,
leaving a semantic hole in the lowering prohibits proper testing of this
operation. This CL adds a straightforward reference implementation for the
missing cases.

Reviewers: nicolasvasilache, mehdi_amini, ftynse, reidtatge

Reviewed By: reidtatge

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, msifontes

Tags: #mlir

Differential Revision: https://reviews.llvm.org/D81503
2020-06-09 16:08:45 -07:00
aartbik
c19fae507e [mlir] [VectorOps] Add missing comments to CreateMaskOp lowering
Summary: Add missing comment to CreateMask. Fixed typo in ConstantMask comment.

Reviewers: nicolasvasilache, rriddle, reidtatge, ftynse

Reviewed By: ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul

Tags: #mlir

Differential Revision: https://reviews.llvm.org/D81125
2020-06-04 12:50:47 -07:00
aartbik
6391da98f4 [mlir] [VectorOps] Use 'vector.flat_transpose' for 2-D 'vector.tranpose'
Summary:
Progressive lowering of vector.transpose into an operation that
is closer to an intrinsic, and thus the hardware ISA. Currently
under the common vector transform testing flag, as we prepare
deploying this transformation in the LLVM lowering pipeline.

Reviewers: nicolasvasilache, reidtatge, andydavis1, ftynse

Reviewed By: nicolasvasilache, ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits

Tags: #llvm, #mlir

Differential Revision: https://reviews.llvm.org/D80772
2020-06-03 14:55:50 -07:00
Nicolas Vasilache
ba10daa820 [mlir][Vector] Add more vector.contract -> outerproduct lowerings and fix vector.contract type inference.
This revision expands the types of vector contractions that can be lowered to vector.outerproduct.
All 8 permutation cases are support.
The idiomatic manipulation of AffineMap written declaratively makes this straightforward.

In the process a bug with the vector.contract verifier was uncovered.
The vector shape verification part of the contract op is rewritten to use AffineMap composition.
One bug in the vector `ops.mlir` test is fixed and a new case not yet captured is added
to the vector`invalid.mlir` test.

Differential Revision: https://reviews.llvm.org/D80393
2020-05-26 15:40:55 -04:00
Nicolas Vasilache
9578a54f50 [mlir][Vector] Add vector contraction to outerproduct lowering
This revision adds the additional lowering and exposes the patterns at a finer granularity for better programmatic reuse. The unit test makes use of the finer grained pattern for simpler checks.

As the ContractionOpLowering is exposed programmatically, cleanup opportunities appear and static class methods are turned into free functions with static visibility.

Differential Revision: https://reviews.llvm.org/D80375
2020-05-26 09:31:26 -04:00
Nicolas Vasilache
7c3c5b11b1 [mlir][Vector] Add option to fully unroll for VectorTransfer to SCF lowering
Summary:
Previously, the only support partial lowering from vector transfers to SCF was
going through loops. This requires a dedicated allocation and extra memory
roundtrips because LLVM aggregates cannot be indexed dynamically (for more
details see the [deep-dive](https://mlir.llvm.org/docs/Dialects/Vector/#deeperdive)).

This revision allows specifying full unrolling which removes this additional roundtrip.
This should be used carefully though because full unrolling will spill, negating the
benefits of removing the interim alloc in the first place.

Proper heuristics are left for a later time.

Differential Revision: https://reviews.llvm.org/D80100
2020-05-20 11:02:13 -04:00
Nicolas Vasilache
1870e787af [mlir][Vector] Add an optional "masked" boolean array attribute to vector transfer operations
Summary:
Vector transfer ops semantic is extended to allow specifying a per-dimension `masked`
attribute. When the attribute is false on a particular dimension, lowering to LLVM emits
unmasked load and store operations.

Differential Revision: https://reviews.llvm.org/D80098
2020-05-18 11:52:08 -04:00
Nicolas Vasilache
1d6eb09d22 [mlir] NFC - VectorTransforms use OpBuilder where relevant
Summary: This will allow using unrolling outside of only rewrite patterns.

Differential Revision: https://reviews.llvm.org/D80083
2020-05-17 10:17:12 -04:00
aartbik
b1c688dbae [mlir] [VectorOps] Implement vector.create_mask lowering to LLVM IR
Summary:
First, compact implementation of lowering to LLVM IR. A bit more
challenging than the constant mask due to the dynamic indices, of course.
I like to hear if there are more efficient ways of doing this in LLVM,
but this for now at least gives us a functional reference implementation.

Reviewers: nicolasvasilache, ftynse, bkramer, reidtatge, andydavis1, mehdi_amini

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79954
2020-05-15 11:02:30 -07:00
Alex Zinenko
4ead2cf76c [mlir] Rename conversions involving ex-Loop dialect to mention SCF
The following Conversions are affected: LoopToStandard -> SCFToStandard,
LoopsToGPU -> SCFToGPU, VectorToLoops -> VectorToSCF. Full file paths are
affected. Additionally, drop the 'Convert' prefix from filenames living under
lib/Conversion where applicable.

API names and CLI options for pass testing are also renamed when applicable. In
particular, LoopsToGPU contains several passes that apply to different kinds of
loops (`for` or `parallel`), for which the original names are preserved.

Differential Revision: https://reviews.llvm.org/D79940
2020-05-15 10:45:11 +02:00
aartbik
fb2c4d50f1 [mlir] [VectorOps] Implement vector.constant_mask lowering to LLVM IR
Summary:
Makes this operation runnable on CPU by generating MLIR instructions
that are eventually folded into an LLVM IR constant for the mask.

Reviewers: nicolasvasilache, ftynse, reidtatge, bkramer, andydavis1

Reviewed By: nicolasvasilache, ftynse, andydavis1

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79815
2020-05-12 19:44:23 -07:00
aartbik
40f56c8cf1 [mlir] [VectorOps] Replace zero-scalar + splat into direct zero vector constant
Summary:
The scalar zero + splat yields more intermediate code than the direct
dense zero constant, and ultimately is lowered to exactly the same
LLVM IR operations, so no point wasting the intermediate code.

Reviewers: nicolasvasilache, andydavis1, reidtatge

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79758
2020-05-11 20:20:37 -07:00
Reid Tatge
334a4159ec [mlir][Vector] NFC - Rename vector.strided_slice into vector.extract_strided_slice
Differential Revision: https://reviews.llvm.org/D79734
2020-05-11 14:21:10 -07:00
aartbik
186709c6e0 [mlir] [VectorOps] Progressive lowering of vector.broadcast
Summary:
Rather than having a full, recursive, lowering of vector.broadcast
to LLVM IR, it is much more elegant to have a progressive lowering
of each vector.broadcast into a lower dimensional vector.broadcast,
until only elementary vector operations remain. This results
in more elegant, step-wise code, that is easier to understand.
Also makes some optimizations in the generated code.

Reviewers: nicolasvasilache, mehdi_amini, andydavis1, grosul1

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, frgossen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78071
2020-04-16 21:02:27 -07:00
Jeremy Bruestle
9f3ab92ec8 [MLIR] Improve support for 0-dimensional Affine Maps.
Summary:
Modified AffineMap::get to remove support for the overload which allowed
an ArrayRef of AffineExpr but no context (and gathered the context from a
presumed first entry, resulting in bugs when there were 0 results).

Instead, we support only a ArrayRef and a context, and a version which
takes a single AffineExpr.

Additionally, removed some now needless case logic which previously
special cased which call to AffineMap::get to use.

Reviewers: flaub, bondhugula, rriddle!, nicolasvasilache, ftynse, ulysseB, mravishankar, antiagainst, aartbik

Subscribers: mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, bader, grosul1, frgossen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78226
2020-04-15 14:15:02 -07:00
River Riddle
92f1562f3d [mlir][NFC] Remove the STLExtras.h header file now that it has been merged into LLVM.
Now that no more utilities exist within, this file can be deleted.

Differential Revision: https://reviews.llvm.org/D78079
2020-04-14 15:14:41 -07:00
River Riddle
d3588d0814 [mlir][NFC] Replace mlir/Support/Functional.h with llvm equivalents.
Summary: Functional.h contains many different methods that have a direct, and more efficient, equivalent in LLVM. This revision replaces all usages with the LLVM equivalent, and removes the header. This is part of larger cleanup, pr45513, merging MLIR support facilities into LLVM.

Differential Revision: https://reviews.llvm.org/D78053
2020-04-13 14:22:12 -07:00
Nicolas Vasilache
2d32ee0d7a [mlir][Vector] Update lowering of vector ops to llvm intrinsics to use row-major.
Summary:
LLVM matrix intrinsics recently introduced an option to support row-major mode.
This matches the MLIR vector model, this revision switches to row-major.

A corner case related to degenerate sizes was also fixed upstream.
This revision removes the guard against this corner case.

A bug was uncovered on the output vector construction which this revision also fixes.

Lastly, this has been tested on a small size and benchmarked independently: no visible performance regression is observed.

In the future, when matrix intrinsics support per op attribute, we can more aggressively translate to that and avoid inserting MLIR-level transposes.

This has been tested independently to work on small matrices.

Differential Revision: https://reviews.llvm.org/D77761
2020-04-09 16:37:28 -04:00
Andy Davis
7006daa548 [MLIR][Vector] Update ShapeCastOp folder to use producer-consumer value forwarding.
Summary:
Update ShapeCastOp folder to use producer-consumer value forwarding.
Support is added for tracking sub-vectors through trivial shape cast operations,
where the sub-vector shape is preserved across shape cast operations and only
leading ones are added or removed.
Support is preserved for cancelling shape cast operations.
One unit test is added and two are updated.

Reviewers: aartbik, nicolasvasilache

Reviewed By: aartbik, nicolasvasilache

Subscribers: frgossen, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77253
2020-04-08 08:55:37 -07:00
Andy Davis
31a346cc35 [MLIR][Vector] Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp.
Summary:
Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp.
Vector-to-vector transformations for unrolling and lowering to hardware vectors
can generate chains of structured vector operations (InsertSlicesOp,
ExtractSlicesOp and ShapeCastOp) between the producer of a hardware vector
value and its consumer. Because InsertSlicesOp, ExtractSlicesOp and ShapeCastOp
are structured, we can track the location (tuple index and vector offsets) of
the consumer vector value through the chain of structured operations to the
producer, enabling a much more powerful producer-consumer fowarding of values
through structured ops and tuple, which in turn enables a more powerful
TupleGetOp folding transformation.

Reviewers: nicolasvasilache, aartbik

Reviewed By: aartbik

Subscribers: grosul1, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76889
2020-03-31 08:39:17 -07:00
Kazuaki Ishizaki
e5a8512655 [mlir] NFC: fix trivial typo in source files
Summary: fix trivial typos in the source files

Reviewers: mravishankar, antiagainst, nicolasvasilache, herhut, rriddle, aartbik

Reviewed By: antiagainst, rriddle

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, bader, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76876
2020-03-28 10:12:49 +09:00
aartbik
8d46bfa808 [mlir] [VectorOps] A "reference" lowering of vector.transpose to LLVM IR
Summary: Makes the vector.tranpose runnable on CPU.

Reviewers: nicolasvasilache, andydavis1, rriddle

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76644
2020-03-23 19:01:38 -07:00
Rob Suderman
e708471395 [mlir][NFC] Cleanup AffineOps directory structure
Summary:
Change AffineOps Dialect structure to better group both IR and Tranforms. This included extracting transforms directly related to AffineOps. Also move AffineOps to Affine.

Differential Revision: https://reviews.llvm.org/D76161
2020-03-20 14:23:43 -07:00
River Riddle
3145427dd7 [mlir][NFC] Replace all usages of PatternMatchResult with LogicalResult
This also replaces usages of matchSuccess/matchFailure with success/failure respectively.

Differential Revision: https://reviews.llvm.org/D76313
2020-03-17 20:21:32 -07:00
Nicolas Vasilache
2fae7878d5 [mlir][Vector] Mostly-NFC - Restructure options for lowering to LLVM Matrix Intrinsics
Summary:
This revision restructures the calling of vector transforms to make it more flexible to ask for lowering through LLVM matrix intrinsics.
This also makes sure we bail out in degenerate cases (i.e. 1) in which LLVM complains about not being able to scalarize.

Differential Revision: https://reviews.llvm.org/D76266
2020-03-17 22:58:02 -04:00
Rob Suderman
4d60f47b08 [mlir][NFC] Renamed VectorOps to Vector
Summary: Renamed VectorOps to Vector to avoid the redundant Ops suffix.

Differential Revision: https://reviews.llvm.org/D76317
2020-03-17 15:28:08 -07:00