4 Commits

Author SHA1 Message Date
Jun Wang
c4e517f59c
[AMDGPU] Adding the amdgpu_num_work_groups function attribute (#79035)
A new function attribute named amdgpu_num_work_groups is added. This
attribute, which consists of three integers, allows programmers to let
the compiler know the number of workgroups to be launched in each of the
three dimensions and do optimizations based on that information.

---------

Co-authored-by: Jun Wang <jun.wang7@amd.com>
2024-03-12 10:30:39 -07:00
Michael Liao
7557afa000 [AMDGPU] Allow using integral non-type template parameters
Summary:
- Allow using integral non-type template parameters in the following
  attributes

  __attribute__((amdgpu_flat_work_group_size(<min>, <max>)))
  __attribute__((amdgpu_waves_per_eu(<min>[, <max>])))

Reviewers: kzhuravl, yaxunl

Subscribers: jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D58623

llvm-svn: 354909
2019-02-26 18:49:36 +00:00
Yaxun Liu
aa24601f98 [CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributes
There are HIP applications e.g. Tensorflow 1.3 using amdgpu kernel attributes, however
currently they are only allowed on OpenCL kernel functions.

This patch will allow amdgpu kernel attributes to be applied to CUDA/HIP __global__
functions.

Differential Revision: https://reviews.llvm.org/D47958

llvm-svn: 334561
2018-06-12 23:58:59 +00:00
Konstantin Zhuravlyov
5b48d725a0 [AMDGPU] Expose flat work group size, register and wave control attributes
__attribute__((amdgpu_flat_work_group_size(<min>, <max>))) - request minimum and maximum flat work group size
__attribute__((amdgpu_waves_per_eu(<min>[, <max>]))) - request minimum and/or maximum waves per execution unit

Differential Revision: https://reviews.llvm.org/D24513

llvm-svn: 282371
2016-09-26 01:02:57 +00:00