As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
ValueMap automatically updates entries with the new value if they have
been RAUW. This can lead to instructions that are expected to not have
shape info to be added to the map (e.g. shufflevector as in the added
test case).
This leads to incorrect results. Originally it was used for transpose
optimizations, but they now all use updateShapeAndReplaceAllUsesWith,
which takes care of updating the shape info as needed.
This fixes a crash in the newly added test cases.
PR: https://github.com/llvm/llvm-project/pull/118282
lowerDotProduct called above may already lower a matrix multiply and
mark it as procssed by adding it to FusedInsts. Don't try to process it
again in LowerMatrixMultiplyFused by checking if FusedInsts.
Without this change, we trigger an assertion when trying to erase the
same original matrix multiply twice.
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
Don't call raw_string_ostream::flush(), which is essentially a no-op.
As specified in the docs, raw_string_ostream is always unbuffered.
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
At the moment, loads introduced by multiply fusion may be placed after
an objects lifetime has been terminated by lifetime.end. This introduces
reads to dead objects.
To avoid this, first collect all lifetime.end calls in the function.
During fusion, we deal with any lifetime.end calls that may alias any of
the loads.
Such lifetime.end calls are either moved when possible (both the
lifetime.end and the store are in the same block) or deleted.
PR: https://github.com/llvm/llvm-project/pull/84914
Row and column arguments for matrix_transpose indicate the shape of the
operand. When hoisting the transpose to the result of the add, the add
operates on the original operand's shape, and so does the hoisted
transpose.
This patch also adds an assert that the shape for the original add and
the transpose match, as well as the shape of the new add matches the
cached shape for it.
The assert could potentially be moved to
updateShapeAndReplaceAllUsesWith.
Generalize the logic used to convert column-vector ops to row-vectors to
support converting chains of operations.
A potential next step is to further generalize this to convert
column-vector ops to row-vector ops in general, not just for operands of
dot products. Dot-product handling would then be driven by the general
conversion, rather than the other way around.
PR: https://github.com/llvm/llvm-project/pull/72647
This patch adds #include "llvm/ADT/SmallSet.h" to a couple of files
that are relying on transitive includes of SmallSet.h. It in turn
unblocks the removal of unnecessary includes of llvm/ADT/SmallSet.h in
several other files.
These values don't propagate to the output; they are always replaced with a subsequent shuffle
or insertelement.
Tested equivalence with Alive2, e.g., https://alive2.llvm.org/ce/z/fj4s78.
Partial progress towards removing in-tree uses of `getPointerTo()`,
by employing the following options:
* Drop the call entirely if the sole purpose of it is to support
a no-op bitcast (remove the no-op bitcast as well).
* Replace with `PointerType::get()`/`PointerType::getUnqual()`.
Also, remove no-op function `EmitBitCastOfLValueToProperType()`.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D154392
The dot product lowering will use the left operand as row vector.
If the operand is a binary op, convert it to operate on a row vector
instead of a column vector.
Depends on D148428.
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D148429
Extend dot-product handling to skip transposes of the first operand. As
this is a vector, the conversion between column and row vector via the
transpose isn't needed.
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D148428
At the moment, lower-matrix-intrinsics accepts mis-matches between
shapes for operations. See shape-verification.ll for an example where
@llvm.matrix.column.major.load specifies 6x1 and then the use
(@llvm.matrix.multiply) specifies the operand to have 1x6.
This patch adds verification for shapes to check if shapes match.
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D147438
The current code did not properly account for integer matrixes. Check
if the operands are floating point or integer matrixes and use FAdd/Add
accordingly.
This is already done for other cases, like multiplies.
Fixes#62281.
Perform dot-product lowering before instruction fusion to avoid crash in
newly added test. Also update lowerDotProduct to properly mark optimized
matmul as fused.
Limit to dot product lowering to column major matrixes for now. This
simplifies the code and reasoning for upcoming planned improvements.
Support for row-major matrixes can be added later as extension.
Add special case to matrix lowering for dot products. Normal matrix lowering if optimized for either row-major or column-major, which results in many `shufflevector` instructions being generated for one vector. We work around this in our special case. We can also use vector-reduce adds instead of sequential adds to sum the result of the element-wise multiplication, which takes advantage of SIMD instructions.
Reviewed By: fhahn, thegameg
Differential Revision: https://reviews.llvm.org/D131125
First, sink the transposes to the operands to simplify redudant
ones. Then, lift them to reduce the number of realized transposes.
```
(A + B)^T -> A^T + B^T -> (A + B)^T
```
See tests for more examples.
Differential Revision: https://reviews.llvm.org/D133657
Interestingly, MathExtras.h doesn't use <cmath> declaration, so move it out of
that header and include it when needed.
No functional change intended, but there's no longer a transitive include
fromMathExtras.h to cmath.
If one of the operands is a transposed splat, the transpose can be
removed.
This is useful to simplify when transposes are distributed to operands
of a matmul:
* k^T -> k
* (A * k)^t -> A^t * k
Differential Revision: https://reviews.llvm.org/D130177