Summary:
This is part of the leftover work for https://reviews.llvm.org/D143138.
In this work, we pass code object version as an argument to initialize target ID
and use it for targetID dump.
Reviewers: arsenm
Differential Revision
https://reviews.llvm.org/D143293
Summary:
This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command line.
In case the module flag is missing, we use the current default code object version supported in the compiler.
For tools whose inputs are not IR, we may need other approach (directive, for example) to check the code
object version, That will be in a separate patch later.
For LIT tests update, we directly add module flag if there is only a single code object version associated with all checks in one file.
In cause of multiple code object version in one file, we use the "sed" method to "clone" the checks to achieve the goal.
Reviewer: arsenm
Differential Revision:
https://reviews.llvm.org/D14313
Unfortunately, we have a broken handling of this in the runtime of rocm
5.3. The runtime is expected to handle this correctly when v5 becomes
the default.
Differential Revision: https://reviews.llvm.org/D134714
This change introduces the dynamic stack boolean field to code-object-v3
and above under the code properties of the kernel descriptor and under
the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to
the is_dynamic_callstack field of amd_kernel_code_t.
Differential Revision: https://reviews.llvm.org/D128344
This is the first patch of a series to upstream support for the new
subtarget.
Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>
Patch 1/N for upstreaming AMDGPU gfx11 architectures.
Reviewed By: foad, kzhuravl, #amdgpu
Differential Revision: https://reviews.llvm.org/D124536
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
Summary:
Add code object v5 support (deafult is still v4)
Generate metadata for implicit kernel args for the new ABI
Set the metadata version to be 1.2
Reviewers:
t-tye, b-sumner, arsenm, and bcahoon
Fixes:
SWDEV-307188, SWDEV-307189
Differential Revision:
https://reviews.llvm.org/D118272
Refactor and add comments to explain where the magic numbers come from
in terms of the instruction cache line size. NFC.
Differential Revision: https://reviews.llvm.org/D98266
- Clarify documentation on initializing scratch.
- Rename compute_pgm_rsrc2 field for enabling scratch from
ENABLE_SGPR_PRIVATE_SEGMENT_WAVEFRONT_OFFSET to
ENABLE_PRIVATE_SEGMENT to match hardware definition.
Differential Revision: https://reviews.llvm.org/D93271
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.
Differential Revision: https://reviews.llvm.org/D90419
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:
* The "Oland" and "Hainan" variants of gfx601 are now split out into
gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
that to avoid using the shaderZExport workaround on gfx602.
* One variant of gfx703 is now split out into gfx705. LLPC and other
front-ends could use that to avoid using the
shaderSpiCsRegAllocFragmentation workaround on gfx705.
* The "TongaPro" variant of gfx802 is now split out into gfx805.
TongaPro has a faster 64-bit shift than its former friends in gfx802,
and a subtarget feature could be set up for that to take advantage of
it. This commit does not make that change; it just adds the target.
V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
so fix the GPUKind order.
Differential Revision: https://reviews.llvm.org/D88916
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
Previously we wrote multi-byte values out as-is from host memory. Use
the `emitIntN` helpers in `MCStreamer` to produce a valid descriptor
irrespective of the host endianness.
Reviewed By: arsenm, rochauha
Differential Revision: https://reviews.llvm.org/D88858
If the same stream object is used for multiple compiles, the PAL metadata from eariler compilations will leak into later one. See https://github.com/GPUOpen-Drivers/llpc/issues/882 for how this is happening in LLPC.
No tests were added because multiple compiles will have to happen using the same pass manager, and I do not see a setup for that on the LLVM side. Let me know if there is a good way to test this.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D85667
Marking a section as ALLOC tells the ELF loader to load the section into memory.
As we do not want to load the notes into VRAM, the flag should not be there.
On AMDHSA, .note is still marked as ALLOC, apparently this is currently
needed for OpenCL (see https://reviews.llvm.org/D74995).
Differential Revision: https://reviews.llvm.org/D76278
This reverts commit 977cd661cf019039dec7ffdd15bf0ac500828c87.
It breaks OpenCL testing. OpenCL Runtime is using PT_LOAD information
to calculate memory for global variables. This commit should be relanded once
the OpenCL runtime stops relying on PT_LOAD information for calculating global
variable memory size.
Differential Revision: https://reviews.llvm.org/D74995
Marking a section as ALLOC tells the ELF loader to load the section into memory.
As we do not want to load the notes into VRAM, the flag should not be there.
Differential Revision: https://reviews.llvm.org/D74600
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.