llvm-project

Author	SHA1	Message	Date
Jay Foad	044b8f4bc8	[AMDGPU] Restore documentation of .amdhsa_shared_vgpr_count This was accidentally lost in D127402.	2022-06-10 17:06:08 +01:00
Jay Foad	b0a3849439	[AMDGPU] Update dlc usage for GFX11 In GFX10 dlc controlled L1 cache bypass. In GFX11 it has been repurposed to control MALL NOALLOC, and glc controls L1 as well as L0 cache bypass. Update the documentation and SIMemoryLegalizer accordingly. Set dlc for nontemporal and volatile accesses. Differential Revision: https://reviews.llvm.org/D127405	2022-06-10 08:10:34 +01:00
Tony	802e3f4f57	[AMDGPU] Add GFX11 documentation to AMDGPUUsage Update most of the document to include GFX11. Memory model changes will come later. Differential Revision: https://reviews.llvm.org/D127402	2022-06-10 08:10:34 +01:00
Dmitry Preobrazhensky	62c46093f1	[AMDGPU][DOC][NFC] Add GFX90C and GFX940 assembler syntax description	2022-05-31 14:29:06 +03:00
Brian Tracy	87a55137e2	Fix "the the" typo in documentation and user facing strings There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users. Reviewed By: #libc, Mordante, martong, ldionne Differential Revision: https://reviews.llvm.org/D124708	2022-05-05 17:52:08 +02:00
Joe Nash	ec6d1a0278	Fix sphinx build error in AMDGPUUsage.rst Corrects error from 813e521e55b11165138b071f446eda94b14570dc	2022-04-29 13:32:06 -04:00
Joe Nash	813e521e55	[AMDGPU] Add gfx11 subtarget ELF definition This is the first patch of a series to upstream support for the new subtarget. Contributors: Jay Foad <jay.foad@amd.com> Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> Patch 1/N for upstreaming AMDGPU gfx11 architectures. Reviewed By: foad, kzhuravl, #amdgpu Differential Revision: https://reviews.llvm.org/D124536	2022-04-29 12:27:17 -04:00
Changpeng Fang	8edaf25986	AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally Summary: Introduce a new function attribute, amdgpu-no-multigrid-sync-arg, which is default. We use implicitarg_ptr + offset to check whether the multigrid synchronization pointer is used. If yes, we remove this attribute and also remove amdgpu-no-implicitarg-ptr. We generate metadata for the hidden_multigrid_sync_arg only when the amdgpu-no-multigrid-sync-arg attribute is removed from the function. Reviewers: arsenm, sameerds, b-sumner and foad Differential Revision: https://reviews.llvm.org/D123548	2022-04-12 12:36:30 -07:00
Scott Linder	09f33a430b	[AMDGPU][OpenCL] Remove "printf and hostcall" diagnostic The diagnostic is unreliable, and triggers even for dead uses of hostcall that may exist when linking the device-libs at lower optimization levels. Eliminate the diagnostic, and directly document the limitation for OpenCL before code object V5. Make some NFC changes to clarify the related code in the MetadataStreamer. Add a clang test to tie OCL sources containing printf to the backend IR tests for this situation. Reviewed By: sameerds, arsenm, yaxunl Differential Revision: https://reviews.llvm.org/D121951	2022-04-05 19:10:23 +00:00
Dmitry Preobrazhensky	111cb395c9	[AMDGPU][DOC][NFC] Added GFX1013 assembler syntax description	2022-04-01 14:47:38 +03:00
Dmitry Preobrazhensky	5975f1c5f9	[AMDGPU][DOC][NFC] Added GFX1030 assembler syntax description	2022-03-25 18:14:04 +03:00
Dmitry Preobrazhensky	53491e4519	[AMDGPU][DOC][NFC] Added links to MI200 documentation Differential Revision: https://reviews.llvm.org/D121811	2022-03-18 13:17:42 +03:00
Stanislav Mekhanoshin	3a37d08b35	[AMDGPU] Correct gfx940 memory model documentation. Differential Revision: https://reviews.llvm.org/D121397	2022-03-16 11:59:40 -07:00
Stanislav Mekhanoshin	47bac63d3f	[AMDGPU] gfx940 memory model Differential Revision: https://reviews.llvm.org/D121242	2022-03-14 15:01:46 -07:00
Jacob Lambert	5160447f58	[AMDGPU] Add gfx10 assembler directive to specify shared VGPR count Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D105507	2022-03-07 14:27:41 -08:00
Aakanksha	840695814a	[AMDGPU] Add gfx1036 target Differential Revision: https://reviews.llvm.org/D120846	2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin	2e2e64df4a	[AMDGPU] Add gfx940 target This is target definition only. Differential Revision: https://reviews.llvm.org/D120688	2022-03-02 13:54:48 -08:00
Changpeng Fang	ca62b1db9f	[AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg Summary: Emit metadata for hidden_heap_v1 kernarg Reviewers: sameerds, b-sumner Fixes: SWDEV-307188 Differential Revision: https://reviews.llvm.org/D119027	2022-02-25 10:45:35 -08:00
Jacob Lambert	7470244475	[AMDGPU] Add agpr_count to metadata and AsmParser gfx90a allows the number of ACC registers (AGPRs) to be set independently to the VGPR registers. For both HSA and PAL metadata, we now include an "agpr_count" key to report the number of AGPRs set for supported devices (gfx90a, gfx908, as determined by hasMAIInsts()). This is collected from SIProgramInfo.NumAccVGPR for both HSA and PAL. The AsmParser also now recognizes ".kernel.agpr_count" for supported devices. Differential Revision: https://reviews.llvm.org/D116140	2022-02-16 15:17:23 -08:00
Lancelot Six	046017291f	[AMDGPU][NFC] AMDGPUUsage.rst: fix wording.	2022-02-07 20:06:17 -05:00
Changpeng Fang	022c8d4a3f	AMDGPU [NFC]: Fix a few typos in docs AMDGPUUsage.rst Summary: Fix a few typos in docs AMDGPUUsage.rst Differential Revision: https://reviews.llvm.org/D118272	2022-02-02 14:22:52 -08:00
Changpeng Fang	1194b9cdda	AMDGPU {NFC}: Add code object v5 support and generate metadata for implicit kernel args Summary: Add code object v5 support (default is still v4) Generate metadata for implicit kernel args for the new ABI Set the metadata version to be 1.2 Reviewers: t-tye, b-sumner, arsenm, and bcahoon Fixes: SWDEV-307188, SWDEV-307189 Differential Revision: https://reviews.llvm.org/D118272	2022-01-31 18:07:47 -08:00
Matt Arsenault	e6564f39c7	AMDGPU: Emit user sgpr count directives in text asm We were emitting these in the object file but not printing them.	2022-01-26 13:51:12 -05:00
Changpeng Fang	4cfea311cb	[AMDGPU][NFC] Update to AMDGPUUsage for default Code Object Version Summary: Update the documentation for default code object version (from v3 to v4). Reviewers: kzhuravl Differential Revision: https://reviews.llvm.org/D117845	2022-01-24 14:33:12 -08:00
Tony Tye	0ac939f3e2	[AMDGPU][NFC] Update to DWARF extension for heterogeneous debugging - Update documentation on the DWARF extension for heterogeneous debugging to better reference the DWARF Version 5 standard. - Numerous other corrections. Reviewed By: kzhuravl Differential Revision: https://reviews.llvm.org/D116275	2021-12-28 17:13:45 +00:00
Tony Tye	c6be2ad73a	[AMDGPU][NFC] Add documentation for location description DWARF extension Add documentation for the DWARF extension to allow location descriptions on the DWARF expression stack. This is part of the "DWARF Extensions For Heterogeneous Debugging" used by the AMD GPU target. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D115587	2021-12-14 00:58:17 +00:00
Jay Foad	5d602120c3	[AMDGPU] Update docs for nontemporal store Update the documented GFX10 code sequence for nontemporal stores after D114351. Differential Revision: https://reviews.llvm.org/D114707	2021-11-30 09:43:42 +00:00
Jay Foad	65d9dc7f1f	[AMDGPU] Fix list indentation in docs	2021-11-29 15:06:01 +00:00
Jay Foad	7319d11586	[AMDGPU] Fix "must generated" typo in docs	2021-11-29 15:01:18 +00:00
Carl Ritson	6d28dffb6b	[AMDGPU] Update GFX10 memory model to account for MALL Document memory attached last level (MALL) cache added in GFX10.3. Reviewed By: t-tye Differential Revision: https://reviews.llvm.org/D114076	2021-11-18 09:29:30 +09:00
Matt Arsenault	8d4b74ac3f	AMDGPU: Don't consider whether amdgpu-flat-work-group-size was set It should be semantically identical if it was set to the same value as the default. Also improve the documentation.	2021-10-22 16:23:50 -04:00
Scott Linder	0022426917	[AMDGPU] Update Call Convention docs for GFX90A Document the CSR AGPRs for GFX90A. Remove the TODO for gfx908, as the answer is that we don't mark any AGPRs as callee-saved except for GFX90A, i.e. the docs as-is are correct for gfx908. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D109009	2021-09-01 20:02:41 +00:00
Kazu Hirata	5294a0f7c3	[llvm] Fix typos in documentation (NFC)	2021-08-28 06:37:03 -07:00
Matt Arsenault	088cc63640	AMDGPU: Invert AMDGPUAttributor Switch to using BitIntegerState for each of the inputs, and invert their meanings. This now diverges more from the old AMDGPUAnnotateKernelFeatures, but this isn't used yet anyway.	2021-08-26 21:32:13 -04:00
RamNalamothu	9b9e7f6f4e	[docs, AMDGPU] Fix typo in dwarf register number mapping Reviewed By: xgupta Differential Revision: https://reviews.llvm.org/D108557	2021-08-26 23:55:29 +05:30
Reshabh Sharma	5173854f19	[AMDGPU] Handle functions in llvm's global ctors and dtors list This patch introduces a new code object metadata field, ".kind" which is used to add support for init and fini kernels. HSAStreamer will use function attributes, "device-init" and "device-fini" to distinguish between init and fini kernels from the regular kernels and will emit metadata with ".kind" set to "init" and "fini" respectively. To reduce the number of init and fini kernels, the ctors and dtors present in the llvm's global.ctors and global.dtors lists are called from a single init and fini kernel respectively. Reviewed by: yaxunl Differential Revision: https://reviews.llvm.org/D105682	2021-08-06 15:53:33 +05:30
Reshabh Sharma	dce35ef104	Revert "[AMDGPU] Handle functions in llvm's global ctors and dtors list" This reverts commit d42e70b3d315645e37f3b1455d39e68678e69525.	2021-08-04 23:33:31 +05:30
Reshabh Sharma	d42e70b3d3	[AMDGPU] Handle functions in llvm's global ctors and dtors list This patch introduces a new code object metadata field, ".kind" which is used to add support for init and fini kernels. HSAStreamer will use function attributes, "device-init" and "device-fini" to distinguish between init and fini kernels from the regular kernels and will emit metadata with ".kind" set to "init" and "fini" respectively. To reduce the number of init and fini kernels, the ctors and dtors present in the llvm's global.ctors and global.dtors lists are called from a single init and fini kernel respectively. Reviewed by: yaxunl Differential Revision: https://reviews.llvm.org/D105682	2021-08-04 19:53:33 +05:30
Tony Tye	51e62e56f7	[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x45 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D106249	2021-07-19 20:17:35 +00:00
Tony Tye	53fed88159	[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x44 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D106034	2021-07-15 06:46:27 +00:00
Hafiz Abid Qadeer	b205f2bb89	[AMDGPU] Handle s_branch to another section. Currently, if target of s_branch instruction is in another section, it will fail with the error of undefined label. Although in this case, the label is not undefined but present in another section. This patch tries to handle this issue. So while handling fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before. If it is defined, a new relocation type R_AMDGPU_REL16 is returned. This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get a crash for this scenario. The crash is fixed now but we still get an undefined label error. Jumps to other section can arise with hot/cold splitting. A patch to handle the relocation in lld will follow shortly. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D105760	2021-07-13 12:17:47 +01:00
Krzysztof Drewniak	8ba53152d7	Add newline to fix documentation build Reviewed By: xgupta Differential Revision: https://reviews.llvm.org/D105825	2021-07-12 19:00:58 +00:00
Krzysztof Drewniak	bef5ed1eea	[AMDGPU][Docs] Update Code Object V3 example to includes args section The documentation for the AMDGPU assembler's examples don't show the .args section, which, if omitted, will cause arguments to silently not be passed into the kernel. This commit fixes this issue. Reviewed By: #amdgpu, scott.linder Differential Revision: https://reviews.llvm.org/D105222	2021-07-09 17:42:29 +00:00
Tony Tye	8d69635ed9	[NFC][AMDGPU] Add link to AMD GPU gfx906 instruction set architecture Reviewed By: kzhuravl Differential Revision: https://reviews.llvm.org/D105377	2021-07-06 20:21:26 +00:00
Sebastian Neubauer	db646de3ee	[AMDGPU] Set optional PAL metadata Set informational fields in the .shader_functions table. Also correct the documentation, .scratch_memory_size and .lds_size are integers. Differential Revision: https://reviews.llvm.org/D105116	2021-07-06 11:58:00 +02:00
Tony Tye	7f19aa73c2	[AMDGPU] Update gfx90a memory model support Update AMDGPU gfx90a memory model to make coarse grain memory allocations consistent when fine grained system scope atomic acquire and release is performed. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D105137	2021-06-30 04:05:22 +00:00
Tony Tye	a1526af464	[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x43 Reviewed By: kzhuravl, rampitec Differential Revision: https://reviews.llvm.org/D104872	2021-06-24 22:51:47 +00:00
Aakanksha Patil	3453f3dd46	[AMDGPU] Add gfx1035 target Differential Revision: https://reviews.llvm.org/D104804	2021-06-24 14:32:41 -04:00
Brendon Cahoon	294efbbd3e	Reland "[AMDGPU] Add gfx1013 target" This reverts commit 211e584fa2a4c032e4d573e7cdbffd622aad0a8f. Fixed a use-after-free error that caused the sanitizers to fail.	2021-06-08 21:15:35 -04:00
Brendon Cahoon	211e584fa2	Revert "[AMDGPU] Add gfx1013 target" This reverts commit ea10a86984ea73fcec3b12d22404a15f2f59b219. A sanitizer buildbot reports an error.	2021-06-08 16:29:41 -04:00

232 Commits