llvm-project

Author	SHA1	Message	Date
Petr Kurapov	bc29fc937c	[MLIR] Create GPU utils library & move distribution utils (#119264) Continue the move of `warp_execute_on_lane_0` op to the gpu dialect (#116994). This patch creates a utils library in GPU and moves generic helper functions there.	2024-12-13 10:26:57 +01:00
Andrea Faulds	a800ffac41	[mlir][gpu] Disjoint patterns for lowering clustered subgroup reduce (#109158) Making the existing populateGpuLowerSubgroupReduceToShufflePatterns() function also cover the new "clustered" subgroup reductions is proving to be inconvenient, because certain backends may have more specific lowerings that only cover the non-clustered type, and this creates pass ordering constraints. This commit removes coverage of clustered reductions from this function in favour of a new separate function, which makes controlling the lowering much more straightforward.	2024-09-18 15:55:53 -04:00
Andrea Faulds	fd26f8444a	[mlir][gpu] Rename two misspelled pattern population functions (#109015)	2024-09-17 15:26:14 -04:00
Andrea Faulds	3d01f0a33b	[mlir][gpu] Add 'cluster_stride' attribute to gpu.subgroup_reduce (#107142) Follow-up to 7aa22f013e24d20291aad745368ff907baa9dfa4, adding an additional attribute needed in some applications.	2024-09-05 09:03:22 -04:00
Andrea Faulds	7aa22f013e	[mlir][gpu] Add 'cluster_size' attribute to gpu.subgroup_reduce (#104851) This enables performing several reductions in parallel, each smaller than the size of the subgroup. One potential application is flash attention with subgroup-wide matrix multiplication and reduction combined in one kernel. The multiplication operation requires a 2D matrix to be distributed over the lanes of the subgroup, which then constrains the shape the following reduction can have if we want to keep data in registers.	2024-08-20 13:37:03 -04:00
Ramkumar Ramachandra	db791b278a	mlir/LogicalResult: move into llvm (#97309) This patch is part of a project to move the Presburger library into LLVM.	2024-07-02 10:42:33 +01:00
Jakub Kuderski	c0345b4648	[mlir][gpu] Add subgroup_reduce to shuffle lowering (#76530) This supports both the scalar and the vector multi-reduction cases.	2024-01-02 16:14:22 -05:00
Jakub Kuderski	2af186f9bd	[mlir][gpu] Add patterns to break down subgroup reduce (#76271) The new patterns break down subgroup reduce ops with vector values into a sequence of subgroup reductions that fit the native shuffle size. The maximum/native shuffle size is parametrized. The overall goal is to be able to perform multi-element reductions with a sequence of `gpu.shuffle` ops.	2023-12-28 14:39:46 -05:00

8 Commits