## Summary
Allow implicit compatibility between `_Float16` vector types and `half`
vector types in OpenCL mode. This lets AMDGPU builtins work correctly
across OpenCL, HIP, and C++ without requiring separate builtin
definitions.
## Problem Statement
When using AMDGPU image builtins that return half-precision vectors in
OpenCL, users encounter type incompatibility errors:
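**Builtin Definition:**
`TARGET_BUILTIN(__builtin_amdgcn_image_load_1d_v4f16_i32, "V4xiiQtii", "nc", "image-insts")`
In the prototype string, the return type `V4x` is a vector of four
`_Float16` elements, so the builtin yields a `_Float16` vector rather
than an OpenCL `half` vector.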
**Test Case:**
```
typedef half half4 __attribute__((ext_vector_type(4)));
half4 test_builtin_image_load_1d_2(half4 v4f16, int i32, __amdgpu_texture_t tex) {
  return __builtin_amdgcn_image_load_1d_v4f16_i32(100, i32, tex, 120, i32);
}
```
**Error:**
```
error: returning '__attribute__((__vector_size__(4 * sizeof(_Float16)))) _Float16'
(vector of 4 '_Float16' values) from a function with incompatible result type
'half4' (vector of 4 'half' values)
```
## Solution
In OpenCL mode, treat `_Float16` vector types and `half` vector types as
implicitly compatible. AMDGPU builtins that return `_Float16` vectors can
then be used directly where OpenCL code expects `half` vectors.
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
// RUN: %clang_cc1 -triple amdgcn-- -target-cpu gfx1100 %s -emit-llvm -o - | FileCheck %s

#pragma OPENCL EXTENSION cl_khr_fp16 : enable

typedef int int4 __attribute__((ext_vector_type(4)));
typedef float float4 __attribute__((ext_vector_type(4)));
typedef _Float16 float16_4 __attribute__((ext_vector_type(4)));
typedef half half4 __attribute__((ext_vector_type(4)));

// CHECK-LABEL: define dso_local noundef <4 x half> @test_assign_half4_to_float16_4(
// CHECK-SAME: <4 x half> noundef returned [[F16_4:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: ret <4 x half> [[F16_4]]
//
half4 test_assign_half4_to_float16_4(float16_4 f16_4) {
  return f16_4;
}

// CHECK-LABEL: define dso_local noundef <4 x half> @test_assign_float16_4_to_half4(
// CHECK-SAME: <4 x half> noundef returned [[H4:%.*]]) local_unnamed_addr #[[ATTR0]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: ret <4 x half> [[H4]]
//
float16_4 test_assign_float16_4_to_half4(half4 h4) {
  return h4;
}

// CHECK-LABEL: define dso_local noundef <4 x half> @test_float16_4_to_half4(
// CHECK-SAME: <4 x half> noundef returned [[F16_4:%.*]]) local_unnamed_addr #[[ATTR0]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: ret <4 x half> [[F16_4]]
//
half4 test_float16_4_to_half4(float16_4 f16_4) {
  return (half4)f16_4;
}