10 Commits

Author SHA1 Message Date
Yaxun (Sam) Liu
053e61d54e Relands "[HIP] Change default --gpu-max-threads-per-block value to 1024"
This reverts commit e384e94fbe7c1d5c89fcdde33ffda04e9802c2ce.
2021-02-12 10:53:59 -05:00
Fangrui Song
fd739804e0 [test] Add {{.*}} to make ELF tests immune to dso_local/dso_preemptable/(none) differences
For a default visibility external linkage definition, dso_local is set for ELF
-fno-pic/-fpie and COFF and Mach-O. Since default clang -cc1 for ELF is similar
to -fpic ("PIC Level" is not set), this nuance causes unneeded binary format differences.

To make emitted IR similar, ELF -cc1 -fpic will default to -fno-semantic-interposition,
which sets dso_local for default visibility external linkage definitions.

To make this flip smooth and enable future (dso_local as definition default),
this patch replaces (function) `define ` with `define{{.*}} `,
(variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} `.
2020-12-31 00:27:11 -08:00
Yaxun (Sam) Liu
e384e94fbe Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024"
This reverts commit 187658b8a6112446d9e7797d495bc7542ac83905 due to
AMDGPU backend issues.
2020-10-15 17:25:55 -04:00
Yaxun (Sam) Liu
187658b8a6 Recommit "[HIP] Change default --gpu-max-threads-per-block value to 1024"
Recommit 04abbb3a78186aa92809866b43217c32cba90b71
2020-09-28 22:43:17 -04:00
Yaxun (Sam) Liu
62dbb7e54c Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024"
Temporarily revert commit 04abbb3a78186aa92809866b43217c32cba90b71
due to regressions in some HIP apps due backend issues revealed by
this change.

Will re-commit it when backend issues are fixed.
2020-09-02 16:12:28 -04:00
Yaxun (Sam) Liu
04abbb3a78 [HIP] Change default --gpu-max-threads-per-block value to 1024
Differential Revision: https://reviews.llvm.org/D76795
2020-06-03 11:09:22 -04:00
Yaxun Liu
1bea97c971 [AMDGPU] Set default flat work group size to (1,256) for HIP
Differential Revision: https://reviews.llvm.org/D67048

llvm-svn: 370808
2019-09-03 18:50:24 +00:00
Yaxun Liu
4306f2086f [CUDA] Set LLVM calling convention for CUDA kernel
Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.

It only affects amdgcn target.

Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D45223

llvm-svn: 330447
2018-04-20 17:01:03 +00:00
Artem Belevich
55ebd6cc26 Revert "Set calling convention for CUDA kernel"
This reverts r328795 which introduced an issue with referencing __global__
function templates. More details in the original review D44747.

llvm-svn: 329099
2018-04-03 18:29:31 +00:00
Yaxun Liu
b2f2bb26e4 Set calling convention for CUDA kernel
This patch sets target specific calling convention for CUDA kernels in IR.

Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D44747

llvm-svn: 328795
2018-03-29 15:02:08 +00:00