llvm-project

Krzysztof Drewniak e7dd7b81ac

[AMDGPU] tensor_{load_to/store_from}_lds => ..._d2 simplification (#171540)

This commit adds the rewrite

```
llvm.amdgcn.tensor.{load.to/store.from}.lds(
  <4 x i32> %d0, <8 x i32> %d1, <4 x i32> zeroinitializer,
  <4 x i32> zeroinitializer, i32 [cachepolicy])
=>
llvm.amdgcn.tensor.{load.to/store.from}.lds.d2(
  <4 x i32> %d0, <8 x i32> %d1, i32 [cachepolicy])
```

This is justified because the short encoding, which uses the NULL
SGPR for registers 2 and 3, makes the hardware act as if those
registers were 0, including in gather mode.

It is always safe not to run this transformation.
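
As a rough illustration only, the rewrite can be modelled on a toy call representation. This is not the InstCombine implementation: the `Call` class, the `simplify_to_d2` helper, and the encoding of `zeroinitializer` as a tuple of four zeros are all assumptions made for the sketch. Only the intrinsic names and operand order come from the rewrite shown above.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical stand-in for a <4 x i32> zeroinitializer operand.
ZERO_V4 = (0, 0, 0, 0)


@dataclass(frozen=True)
class Call:
    """Toy model of an intrinsic call: a name plus its operands."""
    name: str
    args: Tuple[object, ...]


# Full-form intrinsic -> short (.d2) form, per the rewrite above.
FULL_TO_D2 = {
    "llvm.amdgcn.tensor.load.to.lds": "llvm.amdgcn.tensor.load.to.lds.d2",
    "llvm.amdgcn.tensor.store.from.lds": "llvm.amdgcn.tensor.store.from.lds.d2",
}


def simplify_to_d2(call: Call) -> Call:
    """Return the .d2 form when operands 2 and 3 are both all-zero.

    Any other call is returned unchanged, which mirrors the point that
    skipping the transformation is always safe.
    """
    short_name = FULL_TO_D2.get(call.name)
    if short_name is None or len(call.args) != 5:
        return call
    d0, d1, d2, d3, cachepolicy = call.args
    if d2 == ZERO_V4 and d3 == ZERO_V4:
        return Call(short_name, (d0, d1, cachepolicy))
    return call


if __name__ == "__main__":
    full = Call("llvm.amdgcn.tensor.load.to.lds",
                ("%d0", "%d1", ZERO_V4, ZERO_V4, 0))
    print(simplify_to_d2(full))
```

A call whose third or fourth operand is anything other than all zeros is left in its five-operand form.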

(Note: tests were LLM'd and then tweaked.)

2025-12-15 08:11:03 -08:00

addrspacecast.ll

…

amdgcn-demanded-vector-elts-inseltpoison.ll

…

amdgcn-demanded-vector-elts.ll

…

amdgcn-intrinsics-gfx8.ll

…

amdgcn-intrinsics.ll

[AMDGPU] Add the support for 45-bit buffer resource (#159702)

2025-09-24 11:12:02 -04:00

amdgcn-simplify-image-buffer-stores.ll

…

bitcast-fold-lane-ops.ll

…

fma_legacy.ll

…

fmed3-fpext-fold.ll

…

fmed3.ll

[InstSimplify] Optimize maximumnum and minimumnum (#139581)

2025-10-07 14:23:32 +01:00

fmul_legacy.ll

…

image-d16.ll

…

issue68120.ll

…

lane-index-simplify-demanded-bits.ll

…

lit.local.cfg

…

llvm.amdgcn.readfirstlane.ll

…

llvm.amdgcn.readlane.ll

…

llvm.amdgcn.wavefrontsize.ll

…

mbcnt.ll

…

memcpy-from-constant.ll

[InstCombine] Strip leading zero indices from GEP (#155415)

2025-09-01 09:58:11 +02:00

mfma-scale.ll

…

permlane64.ll

…

phi-with-incoming-from-load.ll

…

ptr-replace-alloca.ll

InstCombine: Check GEP operand is available (#160438)

2025-09-25 17:20:20 +09:00

rcp-contract-rsq.ll

…

select-from-load.ll

…

simplify-amdgcn.cvt.off.f32.i4.ll

…

simplify-demanded-vector-elts-lane-intrinsics.ll

…

tan.ll

…

tensor-load-store-lds.ll

[AMDGPU] tensor_{load_to/store_from}_lds => ..._d2 simplification (#171540)

2025-12-15 08:11:03 -08:00

trivially-uniform.ll

…

wmma-f8f6f4.ll

[AMDGPU] gfx1250 v_wmma_scale[16]_f32_16x16x128_f8f6f4 codegen (#152036)

2025-08-04 19:16:34 -07:00