The PR makes the following refine changes to the XeGPU dialect.
1. Separated the old `TensorDescAttr` into two independent attributes: `BlockTensorDescAttr` and `ScatterTensorDescAttr`
2. Renamed the `MemoryScopeAttr` to `MemorySpaceAttr` and updated the enumeration value for shared memory following OpenCL standard.
3. Introduced `transpose` UnitAttr to `StoreScatterOp`and `LoadGatherOp`
4. Added memory space check for `CreateNdDesc` and `CreateDesc` op, as well as valid and invalid test cases for them.
Don't call raw_string_ostream::flush(), which is essentially a no-op.
As specified in the docs, raw_string_ostream is always unbuffered.
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
This PR has following changes/fixes to XeGPU definition:
- Fix type print format for atomic_rmw
- removed 2D support for MaskType
- Update LoadNd definition
- Add 1D TensorDesc support
- Replaced vnni_axis attribute with packed attribute
- Update DPAS op definition, limiting A to 2D vector, and B to either 2D/3D vector.
This patch fixes:
mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp:58:2: error: extra ';'
outside of a function is incompatible with C++98
[-Werror,-Wc++98-compat-extra-semi]
This PR adds XeGPU 2D block operators. It contains:
1. TensorDescType and TensorDescAttr definitions
2. MemoryScopeAttr and CacheHintAttr definitions which are used by
TensorDescAttr.
3. CreateNdDescOp, PrefetchNdOp, LoadNdOp, and StoreNdOp definitions,
and their corresponding testcases for illustration.
It cherry-picks daebe5c4f27ba140ac8d13abf41e3fe4db72b91a with asan fix.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Hi @joker-eph, This PR adds XeGPU 2D block operators. It contains:
1. `TensorDescType` and `TensorDescAttr` definitions
2. `MemoryScopeAttr` and `CacheHintAttr` definitions which are used by
`TensorDescAttr`.
3. `CreateNdDescOp`, `PrefetchNdOp`, `LoadNdOp`, and `StoreNdOp`
definitions, and their corresponding testcases for illustration.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
This PR follows our previous [RFC
](https://discourse.llvm.org/t/rfc-add-xegpu-dialect-for-intel-gpus/75723)
to add XeGPU dialect definition for Intel GPUs. It contains dialect,
type, attributes and operators definitions, as well as testcases for
semantic checks. The lowering and optimization passes will be issued
with separated passes.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>