llvm-project

Author	SHA1	Message	Date
Jacques Pienaar	09dfc5713d	[mlir] Enable decoupling two kinds of greedy behavior. (#104649) The greedy rewriter is used in many different flows and it has a lot of conveniences (work list management, debugging actions, tracing, etc.). But it combines two kinds of greedy behavior: 1) how ops are matched, 2) folding wherever it can. These are independent forms of greediness, and coupling them leads to inefficiency. E.g., cases where one needs to create different phases in lowering and is required to apply patterns in a specific order split across different passes. Using the driver, one ends up needlessly retrying folding/having multiple rounds of folding attempts, where one final run would have sufficed. Of course folks can locally avoid this behavior by just building their own, but this is also a commonly requested feature that folks keep on working around locally in suboptimal ways. For downstream users, there should be no behavioral change. Updating from the deprecated function should just be a find and replace (e.g., `find ./ -type f -exec sed -i 's|applyPatternsAndFoldGreedily|applyPatternsGreedily|g' {} \;` variety) as the API arguments haven't changed between the two.	2024-12-20 08:15:48 -08:00
Kunwar Grover	8e66303916	[mlir][Vector] Remove trivial uses of vector.extractelement/vector.insertelement (1/N) (#116053) This patch removes trivial usages of vector.extractelement/vector.insertelement. These operations can be fully represented by vector.extract/vector.insert. See https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops/71116 for more information. Further patches will remove more usages of these ops.	2024-11-13 15:45:59 +00:00
Jakub Kuderski	763bc9249c	[mlir][amdgpu] Align Chipset with TargetParser (#107720) Update the Chipset struct to follow the `IsaVersion` definition from llvm's `TargetParser`. This is a follow up to https://github.com/llvm/llvm-project/pull/106169#discussion_r1733955012. * Add the stepping version. Note: This may break downstream code that compares against the minor version directly. * Use comparisons with full Chipset version where possible. Note that we can't use the code in `TargetParser` directly because the chipset utility is outside of `mlir/Target` that re-exports llvm's target library.	2024-09-09 11:12:26 -04:00
Giuseppe Rossini	1387ba48a3	[MLIR][AMDGPU] Introduce fp16 packed arithmetic (#105688) This PR is introducing rocdl.cvt.pkrtz in the ROCDL dialect and it is using that instruction when lowering `arith::TruncFOp`.	2024-08-26 12:48:57 -05:00
Rob Suderman	f35318e828	[mlir][amdgpu] Add support for multi-dim arith.truncf/extf fp8 lowering (#98074) The existing `fp8` lowering from `arith` to `amdgpu` bails out on the multidimensional case. We can handle this by `vector.shape_cast` collapsing to the 1-D case on extraction and re-casting back to the desired output shape.	2024-07-09 14:59:58 -07:00
Ramkumar Ramachandra	db791b278a	mlir/LogicalResult: move into llvm (#97309) This patch is part of a project to move the Presburger library into LLVM.	2024-07-02 10:42:33 +01:00
Christian Sigg	a5757c5b65	Switch member calls to `isa/dyn_cast/cast/...` to free function calls. (#89356) This change cleans up call sites. Next step is to mark the member functions deprecated. See https://mlir.llvm.org/deprecation and https://discourse.llvm.org/t/preferred-casting-style-going-forward.	2024-04-19 15:58:27 +02:00
Victor Perez	8827ff92b9	[MLIR][Arith] Add rounding mode attribute to `truncf` (#86152) Add rounding mode attribute to `arith`. This attribute can be used in different FP `arith` operations to control rounding mode. Rounding modes correspond to IEEE 754-specified rounding modes. Use in `arith.truncf` folding. As this is not supported in dialects other than LLVM, conversion should fail for now in case this attribute is present. --------- Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-04-01 11:57:14 +02:00
Hugo Trachino	65066c0277	[mlir] Use `create` instead of `createOrFold` for ConstantOp as folding has no effect (NFC) (#80129) This aims to clean-up confusing uses of builder.createOrFold<ConstantOp> since folding of constants fails.	2024-01-31 23:40:37 -08:00
Krzysztof Drewniak	750e90e440	[mlir][ArithToAMDGPU] Add option for saturating truncation to fp8 (#74153) Many machine-learning applications (and most software written at AMD) expect the operation that truncates floats to 8-bit floats to be saturating. That is, they expect `truncf 256.0 : f32 to f8E4M3FNUZ` to yield `240.0`, not `NaN`, and similarly for negative numbers. However, the underlying hardware instruction that can be used for this truncation implements overflow-to-NaN semantics. To enable handling this use case, we add the saturate-fp8-truncf option to ArithToAMDGPU (off by default), which causes the requisite clamping code to be emitted. Said clamping code ensures that Inf and NaN are passed through exactly (and thus truncate to NaN). Per review feedback, this commit refactors createScalarOrSplatConstant() to the Arith dialect utilities and uses it in this code. It also fixes naming of existing patterns and switches from vector.extractelement/insertelement to vector.extract/insert.	2024-01-23 16:52:21 -06:00
Krzysztof Drewniak	2ebd633f14	[mlir][AMDGPU] Add packed 8-bit float conversion ops and lowering Define operations that wrap the gfx940's new operations for converting between f32 and registers containing packed sets of four 8-bit floats. Define rocdl operations for the intrinsics and an AMDGPU dialect wrapper around them (to account for the fact that MLIR distinguishes the two float formats at the type level but that the LLVM IR does not). Define an ArithToAMDGPU pass, meant to run before conversion to LLVM, that replaces relevant calls to arith.extf and arith.truncf with the packed operations in the AMDGPU dialect. Note that the conversion currently only handles scalars and vectors of rank <= 1, as we do not have a usecase for multi-dimensional vector support right now. Reviewed By: jsjodin Differential Revision: https://reviews.llvm.org/D152457	2023-09-28 14:44:16 +00:00

11 Commits
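
The clamping behavior described in commit 750e90e440 can be sketched in plain Python. This is only an illustration under stated assumptions, not the code the ArithToAMDGPU pass emits: the function name `saturate_for_fp8` and the constant name `F8E4M3FNUZ_MAX` are hypothetical, and the bound of 240.0 is taken from the commit message's `truncf 256.0 : f32 to f8E4M3FNUZ` example.

```python
import math

# Assumption: 240.0 is the saturation bound for f8E4M3FNUZ, as stated in the
# commit message (256.0 is expected to yield 240.0).
F8E4M3FNUZ_MAX = 240.0


def saturate_for_fp8(x: float, max_finite: float = F8E4M3FNUZ_MAX) -> float:
    """Clamp a value before truncation so finite overflow saturates.

    Hypothetical helper: Inf and NaN are returned unchanged, so the
    subsequent overflow-to-NaN truncation still turns them into NaN.
    """
    if math.isnan(x) or math.isinf(x):
        return x
    # Finite values are clamped into [-max_finite, max_finite].
    return max(-max_finite, min(x, max_finite))
```

Clamping only the finite inputs preserves the pass-through of Inf and NaN that the commit message calls out, while values such as 256.0 land on the largest representable magnitude instead of overflowing.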