If a dimension is not tiled, it is always valid to fuse the pack op,
even if it has padding semantics, because it always generates a full
slice along the dimension.
If a dimension is tiled and it does not need extra padding, the fusion
is valid.
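
The two rules above can be sketched as a per-dimension predicate. This is a minimal illustration, not the actual MLIR code: `isPackFusionValidForDim` and its parameters are hypothetical, and it assumes that "does not need extra padding" means both the dimension size and the tile size are multiples of the pack op's inner tile size.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper, not the actual MLIR implementation.
// `tileSize` is std::nullopt when the dimension is not tiled.
// `innerTileSize` is the pack op's inner tile size for this dimension.
bool isPackFusionValidForDim(std::optional<int64_t> tileSize, int64_t dimSize,
                             int64_t innerTileSize) {
  // Untiled: the full slice along this dimension is produced, so fusion is
  // valid even when the pack op pads.
  if (!tileSize)
    return true;
  // Tiled: ASSUMPTION: no extra padding is needed when both sizes are
  // multiples of the inner tile size.
  return dimSize % innerTileSize == 0 && *tileSize % innerTileSize == 0;
}
```

For example, an untiled dimension of size 30 with inner tile size 16 is accepted, while the same dimension tiled by 16 is rejected under this assumption.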
The revision also formats corresponding tests for consistency.
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
This patch makes the `__failed` lambda a member function on `fstream`.
This fixes two LLDB expression evaluation test failures that got
introduced with https://github.com/llvm/llvm-project/pull/147389:
```
16:22:51 ********************
16:22:51 Unresolved Tests (2):
16:22:51 lldb-api :: commands/expression/import-std-module/list-dbg-info-content/TestDbgInfoContentListFromStdModule.py
16:22:51 lldb-api :: commands/expression/import-std-module/list/TestListFromStdModule.py
```
The expression evaluator is asserting in the Clang parser:
```
Assertion failed: (capture_size() == Class->capture_size() && "Wrong number of captures"), function LambdaExpr, file ExprCXX.cpp, line 1277.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
```
Ideally we'd figure out why LLDB is falling over on this lambda. But to
unblock CI for now, make this a member function.
In the long run we should figure out the LLDB bug here so libc++ doesn't
need to care about whether it uses lambdas like this or not.
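
The shape of the workaround can be sketched as follows. This is not libc++'s actual `fstream` code: `FileLike`, `failed`, and the state it tracks are hypothetical and only show how a local lambda that captures `this` can be turned into a private member function with the same behavior.

```cpp
#include <string>

// Hypothetical class; not libc++'s actual `fstream` implementation.
class FileLike {
public:
  bool open(const std::string &path) {
    // Before: auto failed = [this]() { /* reset state */ return false; };
    // After: the same logic lives in a private member function.
    if (path.empty())
      return failed();
    open_ = true;
    return true;
  }
  bool isOpen() const { return open_; }
  bool hasFailed() const { return failed_; }

private:
  // Records the failure and returns false so callers can `return failed();`.
  bool failed() {
    failed_ = true;
    open_ = false;
    return false;
  }
  bool open_ = false;
  bool failed_ = false;
};
```

Because no lambda remains in the function body, a consumer that reconstructs the AST never has to rebuild a closure type and its captures.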
ArrayRef(std::nullopt_t) has been deprecated. This patch replaces
std::nullopt with {}.
A subsequent patch will address those places where we need to replace
std::nullopt with mlir::TypeRange{} or mlir::ValueRange{}.
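
To illustrate why `{}` works as a drop-in replacement, here is a minimal sketch with a stand-in type. `View` is hypothetical and is not LLVM's `ArrayRef`; it only assumes a default constructor that yields an empty view, so that an empty braced list value-initializes the parameter.

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical stand-in for an ArrayRef-like, non-owning view.
template <typename T> class View {
public:
  View() = default; // empty view
  View(std::initializer_list<T> init)
      : data_(init.begin()), size_(init.size()) {}
  const T *begin() const { return data_; }
  std::size_t size() const { return size_; }
  bool empty() const { return size_ == 0; }

private:
  const T *data_ = nullptr;
  std::size_t size_ = 0;
};

// Hypothetical callee that takes a view by value.
std::size_t countOperands(View<int> operands) { return operands.size(); }

// `{}` selects the default constructor, so the callee sees an empty view.
std::size_t callWithEmpty() { return countOperands({}); }
```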
This PR addresses the following issues.
1. Add the missing attributes when creating a new GPU funcOp in
`MoveFuncBodyToWarpExecuteOnLane0` pattern.
2. Bug fix in LoadNd distribution to make sure LoadOp is the last op in
warpOp region before it is distributed (needed for preserving the memory
op ordering during distribution).
3. Add utility for removing OpOperand or OpResult layout attributes.
Fixes #149179
The issue is that `Builder.CreateGEP` does not return a GEP Instruction
or GEP ConstantExpr when the pointer operand is a global variable and all
indices are constant zeroes.
This PR ensures that a GEP instruction is created if `Builder.CreateGEP`
did not return a GEP.
Fixes #149180
This PR removes an assertion that triggered on valid IR. It has been
replaced with an if statement that returns early if the conditions are
not correct.
This PR also adds GEPs to scalar loads and stores from/to global
variables.
This op doesn't have any rank or indices restrictions on src/dst
memrefs, but was using `SameVariadicOperandSize` which was causing
issues. Also fix some other issues while we are at it.
Add support for reading LLVM IR from stdin in the llvm-ir2vec tool.
This allows the tool to be used in pipelines where LLVM IR is generated or transformed on the fly, just like the other LLVM tools. This will be useful in upcoming PRs.
(Tracking issue - #141817)
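
A minimal sketch of the stdin convention, assuming the common command-line practice of treating `-` as "read from standard input". `readAll` and `readInput` are hypothetical helpers written against the C++ standard library only; they do not show the tool's actual option handling.

```cpp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Reads everything remaining in `in` into a string.
std::string readAll(std::istream &in) {
  std::ostringstream buffer;
  buffer << in.rdbuf();
  return buffer.str();
}

// ASSUMPTION: "-" selects stdin; any other value is treated as a file path.
// Returns an empty string if the file cannot be opened.
std::string readInput(const std::string &path) {
  if (path == "-")
    return readAll(std::cin);
  std::ifstream file(path, std::ios::binary);
  return readAll(file);
}
```

With this shape, a producer can pipe its output straight into the consumer instead of writing a temporary file.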
Add helper methods to IR2Vec's Vocabulary class for numeric ID mapping and vocabulary size calculation. These APIs will be useful in triplet generation for `llvm-ir2vec` tool (See #149214).
(Tracking issue - #141817)
Deeply nested structs can be noisy, so Apple's LLDB fork sets the
default to `4`:
9c93adbb28/lldb/source/Target/TargetProperties.td (L134-L136)
Thought it would be useful to upstream this. Though happy to pick a
different default or keep it as-is.
071765749a70b22fb62f2efc07a3f242ff5b4c52 improved constexpr-unknown
diagnostics, but potential constant expression checking broke in the
process: we produce diagnostics in more cases. Suppress the diagnostics
as appropriate.
This fix affects -Winvalid-constexpr and the enable_if attribute. (The
-Winvalid-constexpr diagnostic isn't really important right now, but it
will become important if we allow constexpr-unknown with pre-C++23
standards.)
Fixes #149041. Fixes #149188.
Some platforms print `{anonymous}` instead of the other two forms
accepted by the test regex. This PR just removes the attempt to guess
how the anonymous namespace will be printed.
@Kewen12 is there a way to trigger the particular CIs that failed in
https://github.com/llvm/llvm-project/pull/146228 on this PR?
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
287b24e1899eb6ce62eb9daef5a24faae5e66c1e moved the
`GetGlobalAddressInformation` call earlier, but this broke a chromium
test, so restrict this workaround to AIX.
Add embedding generation functionality to the llvm-ir2vec tool, complementing the existing triplet generation mode.
This change completes the IR2Vec tool by adding the embedding generation functionality, which was previously mentioned as a TODO item. The tool now supports both triplet generation for vocabulary training and embedding generation using a trained vocabulary.
Add a new LLVM tool `llvm-ir2vec`. This tool is primarily intended to generate triplets for training the vocabulary (#141834) and to potentially generate the embeddings in a standalone manner.
This PR introduces the tool with triplet generation functionality. In the upcoming PRs I'll add scripts under `utils/mlgo` to complete the vocabulary tooling. #147844 adds embedding generation logic to the tool.
(Tracking issue - #141817)
The following code is now accepted:
```
module m
end
program m
use m
end
```
The PROGRAM name doesn't really have an effect on the compilation
result, so it shouldn't result in symbol name conflicts.
This change makes the main program symbol name all uppercase in the
cooked character stream. This makes it distinct from all other symbol
names, which are all lowercase in the cooked character stream.
Modified the tests that checked for a lowercase main program name.
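
The effect of the case change can be sketched with a toy symbol table. `SymbolTable`, `toUpper`, and `programAndModuleCoexist` are hypothetical and unrelated to flang's real data structures; the sketch only assumes that every other cooked name is lowercase, so an uppercased program name can never collide with one.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Uppercases an ASCII name.
std::string toUpper(std::string name) {
  std::transform(name.begin(), name.end(), name.begin(), [](unsigned char c) {
    return static_cast<char>(std::toupper(c));
  });
  return name;
}

// Hypothetical symbol table keyed by cooked name.
class SymbolTable {
public:
  // Returns false if the name is already declared.
  bool declare(const std::string &cookedName) {
    return names_.insert(cookedName).second;
  }

private:
  std::set<std::string> names_;
};

// A module and a main program that share a source name no longer clash,
// because only the program name is uppercased.
bool programAndModuleCoexist(const std::string &name) {
  SymbolTable table;
  bool moduleOk = table.declare(name);
  bool programOk = table.declare(toUpper(name));
  return moduleOk && programOk;
}
```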
This change modifies CI scripts to add a pseudo-project for CIR and
detect when CIR-specific files are modified. It also enables building
clang with CIR enabled whenever both the clang and mlir projects are
being built.
Building and testing CIR is only enabled on Linux at this time, as CIR
doesn't properly support Windows or macOS yet.
This patch adds support for scalable vectorization of linalg.mmt4d. The
key design change is the introduction of a new vectorizer state variable:
* `assumeDynamicDimsMatchVecSizes`
...along with the corresponding Transform dialect attribute:
* `assume_dynamic_dims_match_vec_sizes`.
This flag instructs the vectorizer to assume that dynamic memref/tensor
dimensions match the corresponding vector sizes (fixed or scalable). With this
assumption, masking becomes unnecessary, which simplifies the lowering pipeline
significantly.
While this assumption is not universally valid, it typically holds for
`linalg.mmt4d`. Inputs and outputs are explicitly packed using `linalg.pack`,
and this packing includes padding, ensuring that dimension sizes align with
vector sizes (*).
* Related discussion: https://github.com/llvm/llvm-project/issues/143920
An upcoming patch will include an end-to-end test that leverages scalable
vectorization of linalg.mmt4d to demonstrate the newly enabled functionality.
This would not be feasible without the changes introduced here, as it would
otherwise require additional logic to handle complex - but ultimately redundant
- masks.
(*) This holds provided that the tile sizes used for packing match the vector
sizes used during vectorization. It is the user’s responsibility to enforce
this.
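
The masking decision described above can be sketched as a small predicate. This is a simplified illustration, not the vectorizer's code: `needsMask` is hypothetical, a dynamic dimension is modelled as `std::nullopt`, and the vector size is treated as a plain integer even though a scalable size is only known at run time.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical predicate; `dimSize` is std::nullopt for a dynamic dimension.
bool needsMask(std::optional<int64_t> dimSize, int64_t vecSize,
               bool assumeDynamicDimsMatchVecSizes) {
  // Static dimension: a mask is only needed when the sizes differ.
  if (dimSize)
    return *dimSize != vecSize;
  // Dynamic dimension: without the assumption the size is unknown, so the
  // access must be masked; with it, the mask is redundant.
  return !assumeDynamicDimsMatchVecSizes;
}
```

Under these assumptions, `needsMask(std::nullopt, 8, true)` is false, which corresponds to the simplified lowering the flag enables.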