shylie/llvm-project

Commit Graph

Author	SHA1	Message	Date
erman-gurses	3f37df5b71	[reland][mlir][amdgpu] Shared memory access optimization pass (#79164) - Reland: https://github.com/llvm/llvm-project/pull/75627 - Reproduced then fixed the build issue	2024-01-25 07:44:45 -08:00
Mehdi Amini	e611a4cf80	Revert "[mlir][amdgpu] Shared memory access optimization pass" (#78822) Reverts llvm/llvm-project#75627; it broke the bot: https://lab.llvm.org/buildbot/#/builders/61/builds/53218	2024-01-19 16:41:43 -08:00
erman-gurses	b7360fbe8c	[mlir][amdgpu] Shared memory access optimization pass (#75627) It implements transformation to optimize accesses to shared memory. Reference: https://reviews.llvm.org/D127457 _This change adds a transformation and pass to the NvGPU dialect that attempts to optimize reads/writes from a memref representing GPU shared memory in order to avoid bank conflicts. Given a value representing a shared memory memref, it traverses all reads/writes within the parent op and, subject to suitable conditions, rewrites all last dimension index values such that element locations in the final (col) dimension are given by newColIdx = col % vecSize + perm[row](col / vecSize, row) where perm is a permutation function indexed by row and vecSize is the vector access size in elements (currently assumes 128bit vectorized accesses, but this can be made a parameter). This specific transformation can help optimize typical distributed & vectorized accesses common to loading matrix multiplication operands to/from shared memory._	2024-01-19 15:44:45 -08:00
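
The index rewrite quoted in b7360fbe8c, newColIdx = col % vecSize + perm[row](col / vecSize, row), can be illustrated with a small sketch. The commit message does not say what perm is, so everything below is an assumption made for illustration: perm is taken to be an XOR of the vector index with the row (a common swizzle), its result is read as an element offset (permuted vector index times vecSize), and the names swizzled_col and num_vecs are hypothetical. This is not the code from the pass.

```python
def swizzled_col(row: int, col: int, vec_size: int, num_vecs: int) -> int:
    """Sketch of newColIdx = col % vecSize + perm[row](col / vecSize, row).

    ASSUMPTION: perm is an XOR of the vector index with the row; the
    commit message does not specify the permutation. num_vecs (vectors
    per row) must be a power of two for the XOR to stay in range.
    """
    lane = col % vec_size        # element offset inside its vector
    vec_idx = col // vec_size    # which vector of the row the element is in
    # Assumed permutation: XOR the vector index with the (wrapped) row.
    permuted_vec = vec_idx ^ (row % num_vecs)
    # perm's result is read as an element offset, hence the multiply.
    return lane + permuted_vec * vec_size


# Example: rows of 32 elements accessed as 8 vectors of 4 elements.
for row in range(4):
    print(row, [swizzled_col(row, col, 4, 8) for col in range(0, 32, 4)])
```

Under these assumptions row 0 is left unchanged, and every other row sees its vectors in a different order, so the same vector slot in consecutive rows maps to different columns. Elements inside a vector keep their relative order, which keeps vectorized accesses contiguous.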

3 Commits