108 Commits

Author SHA1 Message Date
Changpeng Fang
7ca3444fba AMDGPU: Use module flag to get code object version at IR level folow-up
Summary:
  This is part of the leftover work for https://reviews.llvm.org/D143138.
In this work, we pass code object version as an argument to initialize target ID
and use it for targetID dump.

Reviewers: arsenm

Differential Revision
  https://reviews.llvm.org/D143293
2023-02-10 11:16:38 -08:00
Archibald Elliott
8e3d7cf5de [NFC][TargetParser] Remove llvm/Support/TargetParser.h 2023-02-07 11:08:21 +00:00
Changpeng Fang
54cf69c9d5 AMDGPU: Use module flag to get code object version at IR level
Summary:
  This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command line.
In case the module flag is missing, we use the current default code object version supported in the compiler.

For tools whose inputs are not IR, we may need other approach (directive, for example) to check the code
object version, That will be in a separate patch later.

For LIT tests update, we directly add module flag if there is only a single code object version associated with all checks in one file.
In cause of multiple code object version in one file, we use the "sed" method to "clone" the checks to achieve the goal.

Reviewer: arsenm

Differential Revision:
  https://reviews.llvm.org/D14313
2023-02-02 18:57:26 -08:00
Fangrui Song
67819a72c6 [CodeGen] llvm::Optional => std::optional 2022-12-13 09:06:36 +00:00
Guillaume Chatelet
070bc71830 [Alignment][NFC] Use Align in MCSymbol::declareCommon 2022-12-12 14:29:04 +00:00
Guillaume Chatelet
6c09ea3fdd [Alignment][NFC] Use Align in MCStreamer::emitValueToAlignment
Differential Revision: https://reviews.llvm.org/D138674
2022-11-24 16:09:44 +00:00
Abinav Puthan Purayil
3d9f011a9c [AMDGPU] Make the uses_dynamic_stack field in the kernel descriptor and the metadata map specific to code object v5 and later
Unfortunately, we have a broken handling of this in the runtime of rocm
5.3. The runtime is expected to handle this correctly when v5 becomes
the default.

Differential Revision: https://reviews.llvm.org/D134714
2022-10-11 23:28:43 +05:30
Abinav Puthan Purayil
d96361d714 [AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map
This change introduces the dynamic stack boolean field to code-object-v3
and above under the code properties of the kernel descriptor and under
the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to
the is_dynamic_callstack field of amd_kernel_code_t.

Differential Revision: https://reviews.llvm.org/D128344
2022-07-18 10:07:13 +05:30
Jay Foad
d7762a3b36 [AMDGPU] Increase instruction cache line size to 128 bytes for GFX11
Differential Revision: https://reviews.llvm.org/D128189
2022-06-20 14:25:10 +01:00
Fangrui Song
adf4142f76 [MC] De-capitalize SwitchSection. NFC
Add SwitchSection to return switchSection. The API will be removed soon.
2022-06-10 22:50:55 -07:00
Joe Nash
ea3c9a87d3 [AMDGPU] gfx11 add bits to COMPUTE_PGM_RSRC3
Contributors:
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 21/N for upstreaming of AMDGPU gfx11 architecture

Depends on D127143

Reviewed By: rampitec, #amdgpu, kzhuravl

Differential Revision: https://reviews.llvm.org/D127241
2022-06-10 13:07:14 -04:00
Fangrui Song
15d82c62dc [MC] De-capitalize MCStreamer functions
Follow-up to c031378ce01b8485ba0ef486654bc9393c4ac024 .
The class is mostly consistent now.
2022-06-07 00:31:02 -07:00
Joe Nash
813e521e55 [AMDGPU] Add gfx11 subtarget ELF definition
This is the first patch of a series to upstream support for the new
subtarget.

Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 1/N for upstreaming AMDGPU gfx11 architectures.

Reviewed By: foad, kzhuravl, #amdgpu

Differential Revision: https://reviews.llvm.org/D124536
2022-04-29 12:27:17 -04:00
Jacob Lambert
5160447f58 [AMDGPU] Add gfx10 assembler directive to specify shared VGPR count
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D105507
2022-03-07 14:27:41 -08:00
Aakanksha
840695814a [AMDGPU] Add gfx1036 target
Differential Revision: https://reviews.llvm.org/D120846
2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin
2e2e64df4a [AMDGPU] Add gfx940 target
This is target definition only.

Differential Revision: https://reviews.llvm.org/D120688
2022-03-02 13:54:48 -08:00
Sebastian Neubauer
6527b2a4d5 [AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235
2022-02-18 15:05:21 +01:00
serge-sans-paille
ef736a1c39 Cleanup LLVMMC headers
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:

llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h

Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after:  1049293745

Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
2022-02-09 11:09:17 +01:00
Changpeng Fang
1194b9cdda AMDGPU {NFC}: Add code object v5 support and generate metadata for implicit kernel args
Summary:
  Add code object v5 support (deafult is still v4)
  Generate metadata for implicit kernel args for the new ABI
  Set the metadata version to be 1.2

Reviewers:
  t-tye, b-sumner, arsenm, and bcahoon

Fixes:
  SWDEV-307188, SWDEV-307189

Differential Revision:
  https://reviews.llvm.org/D118272
2022-01-31 18:07:47 -08:00
Matt Arsenault
e6564f39c7 AMDGPU: Emit user sgpr count directives in text asm
We were emitting these in the object file but not printing them.
2022-01-26 13:51:12 -05:00
Aakanksha Patil
3453f3dd46 [AMDGPU] Add gfx1035 target
Differential Revision: https://reviews.llvm.org/D104804
2021-06-24 14:32:41 -04:00
Brendon Cahoon
294efbbd3e Reland "[AMDGPU] Add gfx1013 target"
This reverts commit 211e584fa2a4c032e4d573e7cdbffd622aad0a8f.

Fixed a use-after-free error that caused the sanitizers to fail.
2021-06-08 21:15:35 -04:00
Brendon Cahoon
211e584fa2 Revert "[AMDGPU] Add gfx1013 target"
This reverts commit ea10a86984ea73fcec3b12d22404a15f2f59b219.

A sanitizer buildbot reports an error.
2021-06-08 16:29:41 -04:00
Brendon Cahoon
ea10a86984 [AMDGPU] Add gfx1013 target
Differential Revision: https://reviews.llvm.org/D103663
2021-06-08 12:49:49 -04:00
Stanislav Mekhanoshin
6fb02596a2 [AMDGPU] Add support for architected flat scratch
Add support for the readonly flat Scratch register initialized
by the SPI.

Differential Revision: https://reviews.llvm.org/D102432
2021-05-14 10:53:48 -07:00
Aakanksha Patil
464e4dc50f [AMDGPU] Add gfx1034 target
Differential Revision: https://reviews.llvm.org/D102306
2021-05-13 14:25:18 -04:00
Konstantin Zhuravlyov
f4ace63737 AMDGPU: Add target id and code object v4 support
- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id)
  - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object)
    - Add kernarg_size to kernel descriptor
    - Change trap handler ABI to no longer move queue pointer into s[0:1]
  - Cleanup ELF definitions
    - Add V2, V3, V4 suffixes to make a clear distinction for code object version
    - Consolidate note names

Differential Revision: https://reviews.llvm.org/D95638
2021-03-24 11:54:05 -04:00
Jay Foad
288ea820cf [AMDGPU] Refactor AMDGPUTargetStreamer::EmitCodeEnd
Refactor and add comments to explain where the magic numbers come from
in terms of the instruction cache line size. NFC.

Differential Revision: https://reviews.llvm.org/D98266
2021-03-09 19:02:18 +00:00
Stanislav Mekhanoshin
a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Simon Pilgrim
f82cff31d3 [AMDGPU] HSAMD::fromString - replace std::string arg with StringRef. NFCI.
Removes an unnecessary chain of StringRef -> std::string -> StringRef conversions
2021-01-26 16:09:39 +00:00
dfukalov
6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Tony
d5ea8f7010 [AMDGPU] Clarify scratch initialization
- Clarify documentation on initializing scratch.
- Rename compute_pgm_rsrc2 field for enabling scratch from
  ENABLE_SGPR_PRIVATE_SEGMENT_WAVEFRONT_OFFSET to
  ENABLE_PRIVATE_SEGMENT to match hardware definition.

Differential Revision: https://reviews.llvm.org/D93271
2020-12-15 20:14:20 +00:00
Tim Renouf
89d41f3a2b [AMDGPU] Add gfx1033 target
Differential Revision: https://reviews.llvm.org/D90447

Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
2020-11-03 16:27:48 +00:00
Tim Renouf
ee3e642627 [AMDGPU] Add gfx90c target
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.

Differential Revision: https://reviews.llvm.org/D90419

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-11-03 16:27:43 +00:00
Stanislav Mekhanoshin
d1beb95d12 [AMDGPU] gfx1032 target
Differential Revision: https://reviews.llvm.org/D89487
2020-10-15 12:41:18 -07:00
Tim Renouf
666ef0db20 [AMDGPU] Add gfx602, gfx705, gfx805 targets
At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:

* The "Oland" and "Hainan" variants of gfx601 are now split out into
  gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
  that to avoid using the shaderZExport workaround on gfx602.

* One variant of gfx703 is now split out into gfx705. LLPC and other
  front-ends could use that to avoid using the
  shaderSpiCsRegAllocFragmentation workaround on gfx705.

* The "TongaPro" variant of gfx802 is now split out into gfx805.
  TongaPro has a faster 64-bit shift than its former friends in gfx802,
  and a subtarget feature could be set up for that to take advantage of
  it. This commit does not make that change; it just adds the target.

V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
    so fix the GPUKind order.

Differential Revision: https://reviews.llvm.org/D88916

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-10-10 17:22:22 +01:00
Scott Linder
e4a9e4ef55 [AMDGPU] Emit correct kernel descriptor on big-endian hosts
Previously we wrote multi-byte values out as-is from host memory. Use
the `emitIntN` helpers in `MCStreamer` to produce a valid descriptor
irrespective of the host endianness.

Reviewed By: arsenm, rochauha

Differential Revision: https://reviews.llvm.org/D88858
2020-10-06 17:29:38 +00:00
Steven Perron
eed6476a87 Reset PAL metadata when AMDGPU traget stream finishes
If the same stream object is used for multiple compiles, the PAL metadata from eariler compilations will leak into later one.  See https://github.com/GPUOpen-Drivers/llpc/issues/882 for how this is happening in LLPC.

No tests were added because multiple compiles will have to happen using the same pass manager, and I do not see a setup for that on the LLVM side.  Let me know if there is a good way to test this.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D85667
2020-08-17 10:56:11 -04:00
Stanislav Mekhanoshin
ea7d0e2996 [AMDGPU] gfx1031 target
Differential Revision: https://reviews.llvm.org/D85337
2020-08-05 12:36:26 -07:00
Guillaume Chatelet
52911428ef [Alignment][NFC] Migrate AMDGPU backend to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82743
2020-06-29 11:56:06 +00:00
Stanislav Mekhanoshin
9ee272f13d [AMDGPU] Add gfx1030 target
Differential Revision: https://reviews.llvm.org/D81886
2020-06-15 16:18:05 -07:00
Sebastian Neubauer
1de4e56933 [AMDGPU] Don't mark the .note section as ALLOC
Marking a section as ALLOC tells the ELF loader to load the section into memory.
As we do not want to load the notes into VRAM, the flag should not be there.

On AMDHSA, .note is still marked as ALLOC, apparently this is currently
needed for OpenCL (see https://reviews.llvm.org/D74995).

Differential Revision: https://reviews.llvm.org/D76278
2020-05-05 14:21:45 +02:00
Fangrui Song
692e0c9648 [MC] Add MCStreamer::emitInt{8,16,32,64}
Similar to AsmPrinter::emitInt{8,16,32,64}.
2020-02-29 09:40:21 -08:00
Mark Searles
d3e170c438 Revert "[AMDGPU] Don’t marke the .note section as ALLOC"
This reverts commit 977cd661cf019039dec7ffdd15bf0ac500828c87.

It breaks OpenCL testing. OpenCL Runtime is using PT_LOAD information
to calculate memory for global variables. This commit should be relanded once
the OpenCL runtime stops relying on PT_LOAD information for calculating global
variable memory size.

Differential Revision: https://reviews.llvm.org/D74995
2020-02-21 16:08:30 -08:00
Sebastian Neubauer
977cd661cf [AMDGPU] Don’t marke the .note section as ALLOC
Marking a section as ALLOC tells the ELF loader to load the section into memory.
As we do not want to load the notes into VRAM, the flag should not be there.

Differential Revision: https://reviews.llvm.org/D74600
2020-02-20 15:14:48 +01:00
Fangrui Song
774971030d [MCStreamer] De-capitalize EmitValue EmitIntValue{,InHex} 2020-02-14 23:08:40 -08:00
Fangrui Song
6d2d589b06 [MC] De-capitalize another set of MCStreamer::Emit* functions
Emit{ValueTo,Code}Alignment Emit{DTP,TP,GP}* EmitSymbolValue etc
2020-02-14 19:26:52 -08:00
Fangrui Song
a55daa1461 [MC] De-capitalize some MCStreamer::Emit* functions 2020-02-14 19:11:53 -08:00
Benjamin Kramer
adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Simon Pilgrim
ea2c159f96 [AMDGPU] Fix "use of uninitialized variable" static analyzer warning. NFCI.
Add "unreachable" default case to AMDGPUTargetStreamer::getArchNameFromElfMach
2020-01-06 16:36:56 +00:00