llvm-project

Author	SHA1	Message	Date
Jianhui Li	401ba6df84	[MLIR][XeGPU] Add Layout Propagation support for multi-reduction/reduction op with scalar result (#189133) This PR adds layout propagation support for multi-reduction/reduction ops with a scalar result: 1) Enhance setupMultiReductionResultLayout() and LayoutInfoPropagation::visitVectorMultiReductionOp() to support a scalar result. 2) Add propagation support for the vector.reduction op at the lane level, since the op is only introduced at the lane level.	2026-04-01 13:01:34 -07:00
Andrey Pavlenko	44c6a0acb7	[MLIR][XeGPU] Fix dpas f16 output layout (#184419) Layout propagation fails if dpas has an f16 accumulator. This fix resolves the issue by removing the packingSize argument, which does not seem valid here.	2026-03-20 20:28:26 +00:00
Jianhui Li	f5e2238a3e	[MLIR][XeGPU] Enhance multi-reduction layout propagation rules (#186308) This PR enhances multi-reduction layout propagation: 1. Improve inst_data and lane_data to support fractional subgroup size. 2. Improve subgroup_layout/data setup to use the (nested) slice layout from the consumer op. It also removes the restriction in load_matrix/store_matrix layout propagation to allow nd (n>2) layouts.	2026-03-20 08:12:32 -07:00
Jianhui Li	c6395bb287	[MLIR][XeGPU] Enhance Layout Propagation for broadcasting both leading dimensions and inner unit dimensions (#185583) This PR enhances the layout propagation rules for broadcast operations. The source layout is derived from the result layout based on the broadcast pattern: 1. Broadcast on leading dimensions: the source layout is the slice layout of the result layout. 2. Broadcast on inner unit dimensions: the source layout matches the result layout, with sg_data and lane_data set to 1. 3. Broadcast on both leading dimensions and inner unit dimensions: the source layout is derived by combining the two rules above.	2026-03-12 20:23:59 -07:00
Jianhui Li	fe11a43c60	[MLIR][XeGPU] Enhancing insert_strided_slice layout setup and infer rules (#184742) This PR enhances the insert_strided_slice layout rules to handle slice layouts and to adjust the layout to fit the src shape. It also adds dropDims as a layout utility function.	2026-03-06 17:08:23 -08:00
Jianhui Li	34259b76bf	[MLIR][XeGPU] Refactoring Transpose OP Layout Propagation (#184702) This PR refactors transpose op layout propagation: 1. Add inferTransposeSourceLayout() to the layout utilities, and update layout propagation and conflict handling to use it. 2. Add a layout utility: TransposeDims(). 3. Refactor IsTransposeOf() and fix minor bugs. 4. Fix a minor issue in dropSgLayoutAndData().	2026-03-05 15:03:49 -08:00
Nishant Patel	8774da8f2f	[MLIR][XeGPU] Preserve anchor layouts in recoverTemporaryLayout (#182186)	2026-03-01 15:43:01 -08:00
Jianhui Li	77600cbd97	[MLIR][XeGPU] XeGPU Layout adds support for fractional-subgroup-size vector (#183434) This PR enhances the layout assignment for XeGPU load/store operations to handle vectors smaller than the subgroup size. For example, a vector[4] can use lane_data=[1], lane_layout=[4], and inst_data=[4]. Fractional-subgroup-size vector support is required for the cross-subgroup reduction case: the number of participating subgroups in a reduction can be small, so each subgroup may need to reduce a small vector, often a fraction of the subgroup size. Most layout-based subgroup distribution patterns support fractional-subgroup-size vectors without change, except a few: reduction, insert/extract, constant. We don't expect ND operations (like load_nd/store_nd/dpas) to accept fractional-subgroup-size vectors.	2026-02-26 19:49:33 -08:00
Charitha Saumya	84594d7539	[mlir][xegpu] Add vector layout conflict handling in XeGPU layout propagation pass. (#182402) This PR adds support for layout conflict handling for vector operands. A conflict for a vector operand occurs when a value consumed at a given operand is not in the layout the consumer expects (for example, the source of a `vector.multi_reduction` op requires a specific layout inferred from its current result layout). To resolve this conflict, we insert an `xegpu.convert_layout` right after the producer (essentially duplicating the producer with the expected layout) and use the new value in the consumer.	2026-02-25 12:38:33 -08:00
Artem Kroviakov	4226250a42	[MLIR][XeGPU] Fix matrix ops layout propagation (#182268)	2026-02-22 13:40:48 +01:00
Nishant Patel	14f20ce795	[MLIR][XeGPU] Remove layout attribute from scf ops after wg to sg (#180771)	2026-02-12 07:26:18 -08:00
Artem Kroviakov	760f70711a	[MLIR][XeGPU] Use the `setupDpasLayout` utility for dpas layout propagation (#180937)	2026-02-12 13:18:47 +01:00
Nishant Patel	570055bf97	[MLIR][XeGPU] Propagate layout from anchor ops before Wg To Sg & Blocking Pass (#179490) This PR calls recoverTemporaryLayout before the XeGPUWgtoSgDistribute and XeGPUBlocking passes to recover all temporary operand layouts that the transformation patterns might require for checks and verification.	2026-02-06 15:56:09 -08:00
Jianhui Li	8102ebf6a3	[MLIR][XeGPU] Fixing PR179016 minor issues (#180295) Fix two issues introduced by PR179016: 1. An unused variable when building with "-DLLVM_ENABLE_ASSERTIONS=OFF". 2. Restore the modification to recoverTemporaryLayouts() made by PR176737, which was unintentionally lost during the merging process.	2026-02-06 14:51:40 -08:00
Jianhui Li	61b8a57839	[MLIR][XeGPU] Refactor layout propagation utilities (#179016) This PR refactors layout propagation into two distinct components: result/anchor layout setup and source layout inference from the result. For operations that require a specific result layout due to semantic or hardware constraints, the propagation logic explicitly sets up the result or anchor layout. Otherwise, it infers the source layout from the backward-propagated consumer layout. The result or anchor layout may differ from the backward-propagated consumer layout; any such discrepancies are resolved via the existing layout-conflict mechanism. This PR introduces the following utility functions. Source layout inference: inferBroadcastSourceLayout(), inferMultiReductionSourceLayout(), inferBitCastSourceLayout(), inferShapeCastSourceLayout(), inferInsertStridedSliceSourceLayout(). Result / anchor layout setup: setupMultiReductionResultLayout(), setupBitCastResultLayout(), setupInsertStridedSliceResultLayout(), setupLoadMatrixAnchorLayout(), setupStoreMatrixAnchorLayout(), setupLoadGatherAnchorLayout(), setupStoreScatterAnchorLayout(). Some of the subgroup-distribution-related code changes are split out into a separate PR: https://github.com/llvm/llvm-project/pull/179018/changes.	2026-02-05 19:26:25 -08:00

15 Commits
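
The three broadcast rules described in c6395bb287 can be illustrated with a small Python model. This is only a sketch, not the MLIR implementation: `Layout`, `SliceLayout`, and `infer_broadcast_source_layout` are hypothetical names invented here, the slice layout is modeled simply as "parent layout plus the dropped leading dimensions", and it is assumed that sg_data and lane_data are set to 1 only on the broadcast unit dimensions.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union


@dataclass(frozen=True)
class Layout:
    # Hypothetical stand-in for an XeGPU layout attribute.
    sg_layout: Tuple[int, ...]
    sg_data: Tuple[int, ...]
    lane_layout: Tuple[int, ...]
    lane_data: Tuple[int, ...]


@dataclass(frozen=True)
class SliceLayout:
    # Hypothetical model of a slice layout: a parent layout with `dims` dropped.
    parent: Layout
    dims: Tuple[int, ...]


def _ones_at(values: Tuple[int, ...], dims) -> Tuple[int, ...]:
    """Return `values` with the entries at positions `dims` replaced by 1."""
    return tuple(1 if i in dims else v for i, v in enumerate(values))


def infer_broadcast_source_layout(
    result_layout: Layout,
    src_shape: Tuple[int, ...],
    result_shape: Tuple[int, ...],
) -> Union[Layout, SliceLayout]:
    rank_diff = len(result_shape) - len(src_shape)
    # Inner unit dims: source dims of size 1 that are expanded in the result.
    # Positions are expressed in result-dimension coordinates.
    unit_dims = {
        i + rank_diff
        for i, s in enumerate(src_shape)
        if s == 1 and result_shape[i + rank_diff] != 1
    }
    layout = result_layout
    if unit_dims:
        # Rule 2: keep the result layout, but a unit dim holds a single element
        # (assumption: only the broadcast unit dims are affected).
        layout = replace(
            layout,
            sg_data=_ones_at(layout.sg_data, unit_dims),
            lane_data=_ones_at(layout.lane_data, unit_dims),
        )
    if rank_diff > 0:
        # Rule 1 (and rule 3 when combined with rule 2): slice away the
        # leading dimensions that the source does not have.
        return SliceLayout(parent=layout, dims=tuple(range(rank_diff)))
    return layout
```

For instance, under these assumptions, broadcasting a source of shape (32, 1) to a result of shape (32, 64) keeps sg_layout and lane_layout unchanged and sets sg_data to 1 on dimension 1, while broadcasting shape (64,) to (32, 64) yields a slice of the result layout over dimension 0.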