7 Commits

Author SHA1 Message Date
Alex Voicu
6e0b0038cd
[clang][OpenCL][CodeGen][AMDGPU] Do not use private as the default AS for when generic is available (#112442)
Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use
`private` as the default address space. This is wrong for cases where
the `generic` address space is available, and is corrected via this
patch. In general, this AS map abuse is a bad hack and we should re-work
it altogether, but at least after this patch we will stop being
incorrect for e.g. OpenCL 2.0.
2024-10-22 12:05:48 +01:00
Matt Arsenault
5822cc271b
clang/AMDGPU: Emit atomicrmw for global/flat fadd v2bf16 builtins (#96875) 2024-08-20 23:20:03 +04:00
Matt Arsenault
ce132a58b8
clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (#96873) 2024-08-20 23:01:15 +04:00
Matt Arsenault
b5e63cc533
clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_atomic_fadd_{f32|f64} (#96872)
Need to emit syncscope and new metadata to get the native instruction,
most of the time.
2024-08-15 22:59:24 +04:00
Matt Arsenault
76894c5e6e
clang/AMDGPU: Emit atomicrmw from ds_fadd builtins (#95395)
We should have done this for the f32/f64 case a long time ago. Now that
codegen handles atomicrmw selection for the v2f16/v2bf16 case, start emitting
it instead.

This also does upgrade the behavior to respect a volatile qualified pointer,
which was previously ignored (for the cases that don't have an explicit
volatile argument).
2024-06-18 20:51:14 +02:00
Fangrui Song
7c1d9b15ee [test] %clang_cc1: remove redundant actions 2024-05-04 23:08:11 -07:00
Mariusz Sikora
3e6589f21c
[AMDGPU][GFX12] Add 16 bit atomic fadd instructions (#75917)
- image_atomic_pk_add_f16
- image_atomic_pk_add_bf16
- ds_pk_add_bf16
- ds_pk_add_f16
- ds_pk_add_rtn_bf16
- ds_pk_add_rtn_f16
- flat_atomic_pk_add_f16
- flat_atomic_pk_add_bf16
- global_atomic_pk_add_f16
- global_atomic_pk_add_bf16
- buffer_atomic_pk_add_f16
- buffer_atomic_pk_add_bf16
2024-01-18 14:01:09 +01:00