117 Commits

Author SHA1 Message Date
Kazu Hirata
22b0f7ba6e [Transforms] Include llvm/ADT/SmallSet.h (NFC)
This patch adds #include "llvm/ADT/SmallSet.h" to a couple of files
that are relying on transitive includes of SmallSet.h.  It in turn
unblocks the removal of unnecessary includes of llvm/ADT/SmallSet.h in
several other files.
2023-11-11 12:25:39 -08:00
Simon Pilgrim
3ca4fe80d4 [Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
2023-11-06 16:50:18 +00:00
Björn Pettersson
21c251aaca
[LowerMatrixIntrinsics] Drop support for typed pointers (#65605) 2023-09-08 18:06:09 +02:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Bjorn Pettersson
91157a0b26 [LegacyPM] Drop unused includes in passes no longer supporting legacy PM 2023-08-13 16:46:57 +02:00
Nuno Lopes
23c2175945 [LowerMatrixIntrinsics] Use poison instead of undef as placeholder [NFC]
These values don't propagate to the output; they are always replaced with a subsequent shuffle
or insertelement.
Tested equivalence with Alive2, e.g., https://alive2.llvm.org/ce/z/fj4s78.
2023-07-18 09:54:41 +01:00
Youngsuk Kim
f69b9b7cce [llvm] Remove uses of Type::getPointerTo() (NFC)
Partial progress towards removing in-tree uses of `getPointerTo()`,
by employing the following options:

* Drop the call entirely if the sole purpose of it is to support
  a no-op bitcast (remove the no-op bitcast as well).

* Replace with `PointerType::get()`/`PointerType::getUnqual()`.

Also, remove no-op function `EmitBitCastOfLValueToProperType()`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D154392
2023-07-08 13:05:58 -04:00
Kazu Hirata
2764322912 [LegacyPM] Remove LowerMatrixIntrinsicsLegacyPass and LowerMatrixIntrinsicsMinimalLegacyPass
Differential Revision: https://reviews.llvm.org/D153615
2023-06-23 01:32:38 -07:00
Florian Hahn
c10a7772bd
[Matrix] Convert binop operand of dot product to a row vector.
The dot product lowering will use the left operand as row vector.
If the operand is a binary op, convert it to operate on a row vector
instead of a column vector.

Depends on D148428.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D148429
2023-06-07 20:45:08 +01:00
Florian Hahn
ebbcbb2af5
[Matrix] Remove redundant transpose with dot product lowering.
Extend dot-product handling to skip transposes of the first operand. As
this is a vector, the conversion between column and row vector via the
transpose isn't needed.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D148428
2023-05-14 22:07:38 +01:00
Florian Hahn
0e8717f711
[Matrix] Add shape verification.
At the moment, lower-matrix-intrinsics accepts mis-matches between
shapes for operations. See shape-verification.ll for an example where
@llvm.matrix.column.major.load specifies 6x1 and then the use
(@llvm.matrix.multiply) specifies the operand to have 1x6.

This patch adds verification for shapes to check if shapes match.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D147438
2023-05-13 09:41:27 +01:00
Florian Hahn
f10153fe91
[Matrix] Handle integer types when distributing transposes across adds.
The current code did not properly account for integer matrixes. Check
if the operands are floating point or integer matrixes and use FAdd/Add
accordingly.

This is already done for other cases, like multiplies.

Fixes #62281.
2023-04-21 16:35:11 +01:00
Florian Hahn
98e50881e9
[Matrix] Refine cost estimate for dot-product.
Adjust lowerDotProduct cost estimate to include the cost benefits of:
 * emitting a wide load
 * emitting a wide multiply.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D147330
2023-04-14 11:35:01 +01:00
Florian Hahn
e6ab86a887
[Matrix] Fix IsSupported check in lowerDotProduct.
The check incorrectly checks the RHS while LHS is transformed later.
Update to check LHS, which fixes a crash in the newly added test cases.
2023-04-13 19:00:30 +01:00
Florian Hahn
78148eba49
[Matrix] Fix crash during dot product lowering.
Perform dot-product lowering before instruction fusion to avoid crash in
newly added test. Also update lowerDotProduct to properly mark optimized
matmul as fused.
2023-04-12 15:08:39 +01:00
Florian Hahn
04681243b4
[Matrix] Limit dot lowering to column major matrixes.
Limit to dot product lowering to column major matrixes for now. This
simplifies the code and reasoning for upcoming planned improvements.
Support for row-major matrixes can be added later as extension.
2023-04-05 15:49:06 +01:00
Kazu Hirata
52dd9deb15 [Scalar] Use SmallPtrSet::contains (NFC) 2023-03-31 23:50:17 -07:00
Vir Narula
e7281c6f61
[Matrix] Add special case dot product lowering
Add special case to matrix lowering for dot products. Normal matrix lowering if optimized for either row-major or column-major, which results in many `shufflevector` instructions being generated for one vector. We work around this in our special case. We can also use vector-reduce adds instead of sequential adds to sum the result of the element-wise multiplication, which takes advantage of SIMD instructions.

Reviewed By: fhahn, thegameg

Differential Revision: https://reviews.llvm.org/D131125
2023-03-31 12:40:20 +01:00
Liren Peng
529ee9750b [NFC] Use single quotes for single char output during printPipline
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D144365
2023-02-22 02:35:13 +00:00
Francis Visoiu Mistrih
da09b35334
[Matrix] Optimize matrix transposes around additions
First, sink the transposes to the operands to simplify redudant
ones. Then, lift them to reduce the number of realized transposes.

```
(A + B)^T -> A^T + B^T -> (A + B)^T
```

See tests for more examples.

Differential Revision: https://reviews.llvm.org/D133657
2023-01-11 15:21:59 -08:00
Guillaume Chatelet
8fd5558b29 [NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:49:38 +00:00
serge-sans-paille
16544cbe64 [iwyu] Move <cmath> out of llvm/Support/MathExtras.h
Interestingly, MathExtras.h doesn't use <cmath> declaration, so move it out of
that header and include it when needed.

No functional change intended, but there's no longer a transitive include
fromMathExtras.h to cmath.
2022-09-28 20:49:01 +02:00
Francis Visoiu Mistrih
c5b10f348e [Matrix] Use print instead of dump for matrix-print-after-transpose-opt
We should be able to use this option even if LLVM_ENABLE_DUMP is not on.

(should fix the bots too)
2022-09-02 16:12:21 -07:00
Francis Visoiu Mistrih
81bdb4068d [Matrix] Simplify matmuls with scalars
If one of the operands is a transposed splat, the transpose can be
removed.

This is useful to simplify when transposes are distributed to operands
of a matmul:

* k^T -> k
* (A * k)^t -> A^t * k

Differential Revision: https://reviews.llvm.org/D130177
2022-09-02 15:50:25 -07:00
Kazu Hirata
50724716cd [Transforms] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-14 12:51:58 -07:00
Francis Visoiu Mistrih
bfd3883e83 [Matrix] Refactor transpose distribution. NFC
Use a function to distribute transposes. Preparation for future patches.
2022-07-28 17:30:00 -07:00
Francis Visoiu Mistrih
448a094d3e [Matrix] Add assert to catch extracted vectors with poison elements
Assert when the extracted vector is wider than the row/column.

Differential Revision: https://reviews.llvm.org/D130173
2022-07-26 11:07:02 -07:00
Francis Visoiu Mistrih
2c6e8b4636 [Matrix] Refactor tiled loops in a struct. NFC
The three loops have the same structure: index, header, latch.
2022-07-26 11:02:22 -07:00
Nuno Lopes
022bd92c78 [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] 2022-07-03 12:32:19 +01:00
Nuno Lopes
7c4f45f87a Revert [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]
This reverts commits 47e6f98f84ac3 and 3e701bcd2a6aee2
2022-07-01 23:53:41 +01:00
Nuno Lopes
47e6f98f84 [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] 2022-07-01 23:31:31 +01:00
Florian Hahn
7c0089d735
[Matrix] Check if iterator is at beginning of BB in optimizeTranspose.
If an instruction at the beginning of a block is erased,  this may
trigger crash due to dereferencing an invalid iterator.

Check if II is at the end before dereferencing it.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D127736
2022-06-14 21:37:02 +01:00
Kazu Hirata
8daf23d364 [Scalar] Use llvm::make_early_inc_range (NFC) 2022-06-05 23:53:18 -07:00
serge-sans-paille
59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
serge-sans-paille
a494ae43be Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after:  1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
2022-03-01 21:00:07 +01:00
Nikita Popov
cdc0573f75 [MatrixBuilder] Remove unnecessary IRBuilder template (NFC)
IRBuilderBase exists specifically to avoid the need for this.
2022-02-07 16:42:38 +01:00
Florian Hahn
b339bbdb19
[Matrix] Use ArrayType for allocas instead of VectorType.
When creating an alloca to copy a matrix due to memory conflicts, those
allocas used to use VectorTypes, which forced them to have huge
alignments for large vectors.

This patch updates LowerMatrixIntrinsics to use a corresponding array
type, like Clang already does, to get more manageable alignments.

Reviewed By: anemet, thegameg

Differential Revision: https://reviews.llvm.org/D118239
2022-01-28 10:47:52 +00:00
Craig Topper
38b30eb2b2 [LowerMatrixIntrinsics] Call getRegisterClassForType before getNumberOfRegisters.
getNumberOfRegisters takes a ClassID as it's argument. It shouldn't be passed a bool. Assuming the bool meant vector or not, we should call getRegisterClassForType first.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D116903
2022-01-10 15:32:13 -08:00
Kazu Hirata
b932bdf59f [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-07 17:45:09 -08:00
Simon Pilgrim
5e7912d80f [LowerMatrixIntrinsics] writeFnName - don't dereference a dyn_cast<>. NFC.
dyn_cast<> can return null - use cast<> instead to assert the cast is valid before dereferencing the casted pointer.

Fixes static-analyzer null dereference warning.
2022-01-06 17:09:32 +00:00
Kazu Hirata
e5947760c2 Revert "[llvm] Remove redundant member initialization (NFC)"
This reverts commit fd4808887ee47f3ec8a030e9211169ef4fb094c3.

This patch causes gcc to issue a lot of warnings like:

  warning: base class ‘class llvm::MCParsedAsmOperand’ should be
  explicitly initialized in the copy constructor [-Wextra]
2022-01-03 11:28:47 -08:00
Kazu Hirata
fd4808887e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-01 16:18:18 -08:00
Markus Lavin
1ac209ed76 [NPM] Added -print-pipeline-passes print params for a few passes.
Added '-print-pipeline-passes' printing of parameters for those passes
declared with *_WITH_PARAMS macro in PassRegistry.def.

Note that it only prints the parameters declared inside *_WITH_PARAMS as
in a few cases there appear to be additional parameters not parsable.

The following passes are now covered (i.e. all of those with *_WITH_PARAMS in
PassRegistry.def).

LoopExtractorPass - loop-extract
HWAddressSanitizerPass - hwsan
EarlyCSEPass - early-cse
EntryExitInstrumenterPass - ee-instrument
LowerMatrixIntrinsicsPass - lower-matrix-intrinsics
LoopUnrollPass - loop-unroll
AddressSanitizerPass - asan
MemorySanitizerPass - msan
SimplifyCFGPass - simplifycfg
LoopVectorizePass - loop-vectorize
MergedLoadStoreMotionPass - mldst-motion
GVN - gvn
StackLifetimePrinterPass - print<stack-lifetime>
SimpleLoopUnswitchPass - simple-loop-unswitch

Differential Revision: https://reviews.llvm.org/D109310
2021-09-15 08:34:04 +02:00
Kazu Hirata
8e86c0e4f4 [Scalar] Use make_early_inc_range (NFC) 2021-09-12 08:17:18 -07:00
Florian Hahn
f999312872
Recommit "[Matrix] Overload stride arg in matrix.columnwise.load/store."
This reverts the revert 28c04794df74ad3c38155a244729d1f8d57b9400.

The failing MLIR test that caused the revert should be fixed  in this
version.

Also includes a PPC test fix previously in 1f87c7c478a6.
2021-08-12 18:31:57 +01:00
Mehdi Amini
28c04794df Revert "[Matrix] Overload stride arg in matrix.columnwise.load/store."
This reverts commit a1ef81de35a4bac6d3b22e9d7186d880124d7a55.

Broke the MLIR buildbot.
2021-08-12 11:57:19 +00:00
Florian Hahn
a1ef81de35
[Matrix] Overload stride arg in matrix.columnwise.load/store.
This patch adjusts the intrinsics definition of
llvm.matrix.column.major.load and llvm.matrix.column.major.store to
allow overloading the type of the stride. The bitwidth of the stride is
used to perform the offset computation.

This fixes a crash when using __builtin_matrix_column_major_load or
__builtin_matrix_column_major_store on 32 bit platforms. The stride argument
of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit
platforms.

Note that we still perform offset computations with 64 bit width on 32
bit platforms for accesses that do not take a user-specified stride.
This can be fixed separately.

Fixes PR51304.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D107349
2021-08-12 10:45:25 +01:00
Adam Nemet
d87d3615f7 [Matrix] Fix shape for factored transpose
The shape of the input is C x R.

Differential Revision: https://reviews.llvm.org/D106722
2021-07-27 11:36:13 -07:00
Adam Nemet
bf7eb48454 [Matrix] RAUW should only replace an instruction in ShapeMap if supportsShapeInfo
As an instruction is replaced in optimizeTransposes RAUW will replace it in
the ShapeMap (ShapeMap is ValueMap so that uses are updated).  In
finalizeLowering however we skip updating uses if they are in the ShapeMap
since they will be lowered separately at which point we pick up the lowered
operands.

In the testcase what happened was that since we replaced the doubled-transpose
with the shuffle, it ended up in the ShapeMap.  As we lowered the
columnwise-load the use in the shuffle was not updated.  Then as we removed
the original columnwise-load we changed that to an undef.  I.e. we ended up
with:

```
%shuf = shufflevector <8 x double> undef, <8 x double> poison, <6 x i32>
                                   ^^^^^
                                  <i32 0, i32 1, i32 2, i32 4, i32 5, i32 6>
```

Besides the fix itself, I have fortified this last bit.  As we change uses to
undef when removing instruction we track the undefed instruction to make sure
we eventually remove those too.  This would have caught the issue at compile
time.

Differential Revision: https://reviews.llvm.org/D106714
2021-07-27 11:36:13 -07:00
Fangrui Song
3b181568db [Matrix] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off build after D106457. NFC 2021-07-22 11:33:02 -07:00