llvm-project

Author	SHA1	Message	Date
Yaxun (Sam) Liu	053e61d54e	Relands "[HIP] Change default --gpu-max-threads-per-block value to 1024" This reverts commit e384e94fbe7c1d5c89fcdde33ffda04e9802c2ce.	2021-02-12 10:53:59 -05:00
Fangrui Song	fd739804e0	[test] Add {{.*}} to make ELF tests immune to dso_local/dso_preemptable/(none) differences For a default visibility external linkage definition, dso_local is set for ELF -fno-pic/-fpie and COFF and Mach-O. Since default clang -cc1 for ELF is similar to -fpic ("PIC Level" is not set), this nuance causes unneeded binary format differences. To make emitted IR similar, ELF -cc1 -fpic will default to -fno-semantic-interposition, which sets dso_local for default visibility external linkage definitions. To make this flip smooth and enable future (dso_local as definition default), this patch replaces (function) `define ` with `define{{.*}} `, (variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} `.	2020-12-31 00:27:11 -08:00
Yaxun (Sam) Liu	e384e94fbe	Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024" This reverts commit 187658b8a6112446d9e7797d495bc7542ac83905 due to AMDGPU backend issues.	2020-10-15 17:25:55 -04:00
Yaxun (Sam) Liu	187658b8a6	Recommit "[HIP] Change default --gpu-max-threads-per-block value to 1024" Recommit 04abbb3a78186aa92809866b43217c32cba90b71	2020-09-28 22:43:17 -04:00
Yaxun (Sam) Liu	62dbb7e54c	Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024" Temporarily revert commit 04abbb3a78186aa92809866b43217c32cba90b71 due to regressions in some HIP apps due to backend issues revealed by this change. Will re-commit it when backend issues are fixed.	2020-09-02 16:12:28 -04:00
Yaxun (Sam) Liu	04abbb3a78	[HIP] Change default --gpu-max-threads-per-block value to 1024 Differential Revision: https://reviews.llvm.org/D76795	2020-06-03 11:09:22 -04:00
Yaxun Liu	1bea97c971	[AMDGPU] Set default flat work group size to (1,256) for HIP Differential Revision: https://reviews.llvm.org/D67048 llvm-svn: 370808	2019-09-03 18:50:24 +00:00
Yaxun Liu	4306f2086f	[CUDA] Set LLVM calling convention for CUDA kernel Some targets need special LLVM calling convention for CUDA kernel. This patch does that through a TargetCodeGenInfo hook. It only affects amdgcn target. Patch by Greg Rodgers. Revised and lit tests added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D45223 llvm-svn: 330447	2018-04-20 17:01:03 +00:00
Artem Belevich	55ebd6cc26	Revert "Set calling convention for CUDA kernel" This reverts r328795 which introduced an issue with referencing __global__ function templates. More details in the original review D44747. llvm-svn: 329099	2018-04-03 18:29:31 +00:00
Yaxun Liu	b2f2bb26e4	Set calling convention for CUDA kernel This patch sets target specific calling convention for CUDA kernels in IR. Patch by Greg Rodgers. Revised and lit test added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D44747 llvm-svn: 328795	2018-03-29 15:02:08 +00:00

10 Commits
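
Note on fd739804e0: the test rewrite described in that message relies on FileCheck regex blocks, where text inside double braces is a regular expression and everything else is literal. The sketch below is a rough Python approximation of that matching rule, not FileCheck itself. It assumes the wildcard is `{{.*}}`, and the helper names and IR lines are hypothetical, chosen only to show why one check line can accept a definition both with and without `dso_local`.

```python
import re


def filecheck_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simplified FileCheck-style pattern into a Python regex.

    Assumption: text inside {{...}} is a regex, everything else is literal.
    Real FileCheck has more rules (whitespace canonicalization, variables).
    """
    # re.split with one capture group yields literal, regex, literal, ...
    parts = re.split(r"\{\{(.*?)\}\}", pattern)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pieces.append(re.escape(part))      # literal text
        else:
            pieces.append("(?:" + part + ")")   # regex block
    return re.compile("".join(pieces))


def check_line_matches(pattern: str, line: str) -> bool:
    """Return True if the pattern matches anywhere in the line."""
    return filecheck_to_regex(pattern).search(line) is not None


# Hypothetical IR lines: the same definition with and without dso_local.
ir_without = "define void @kernel() {"
ir_with = "define dso_local void @kernel() {"

old_check = "define void @kernel("
new_check = "define{{.*}} void @kernel("

print(check_line_matches(old_check, ir_without))  # old check, plain definition
print(check_line_matches(old_check, ir_with))     # old check, dso_local definition
print(check_line_matches(new_check, ir_without))  # new check, plain definition
print(check_line_matches(new_check, ir_with))     # new check, dso_local definition
```

Under these assumptions the old check line accepts only the plain definition, while the rewritten one accepts both, which is the property the commit message calls being immune to dso_local differences.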