The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling also matches
the asm directive I added in bc82cfb.
Code Object V2 has been deprecated for more than a year now. We can
safely remove it from LLVM.
- [clang] Remove support for the `-mcode-object-version=2` option.
- [lld] Remove/refactor tests that were still using COV2
- [llvm] Update AMDGPUUsage.rst
- Code Object V2 docs are left for informational purposes because those
code objects may still be supported by the runtime/loaders for a while.
- [AMDGPU] Remove COV2 emission capabilities.
- [AMDGPU] Remove `MetadataStreamerYamlV2` which was only used by COV2
- [AMDGPU] Update all tests that were still using COV2 - They are either
deleted or ported directly to code object v4 (as v3 is also planned to
be removed soon).
Summary:
This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command line.
In case the module flag is missing, we use the current default code object version supported in the compiler.
For tools whose inputs are not IR, we may need other approach (directive, for example) to check the code
object version, That will be in a separate patch later.
For LIT tests update, we directly add module flag if there is only a single code object version associated with all checks in one file.
In cause of multiple code object version in one file, we use the "sed" method to "clone" the checks to achieve the goal.
Reviewer: arsenm
Differential Revision:
https://reviews.llvm.org/D14313
This reverts commit ca907bfb57d8ad3ec3bcc2cff2abab7b1b933af6.
According to michel.daenzer,
> This completely broke the Mesa radeonsi driver on Navi 14. Xorg +
> xterm come up with major corruption & psychedelic colours.
When memory operations are outstanding on function calls, either the
caller or the callee can insert a waitcnt to ensure that all reads are
finished.
Calls need some time to be executed, so if the callee inserts the
waitcnt, filling the instruction buffer and waiting for memory will be
interleaved, hiding some latency. This comes at the cost of having a
waitcnt inside functions that may not be needed as no memory operations
are outstanding.
For function calls, this is already implemented. The same principal
applies to returns: If the caller inserts a waitcnt after the call, the
callee does not have to wait and the return and memory operation can be
run in parallel.
This commit implements waiting in the caller after returning from a
function call.
Differential Revision: https://reviews.llvm.org/D87674
We use both -long-option and --long-option in tests. Switch to --long-option for consistency.
In the "llvm-readelf" mode, -long-option is discouraged as it conflicts with grouped short options and it is not accepted by GNU readelf.
While updating the tests, change llvm-readobj -s to llvm-readobj -S to reduce confusion ("s" is --section-headers in llvm-readobj but --symbols in llvm-readelf).
llvm-svn: 359649
Partially implement callee-side for arguments and return values.
byval doesn't work properly, and most likely sret or other on-stack
return values most as well.
llvm-svn: 303308
This is inserted directly in the text section. The relocation
for the function ends up resolving to the beginning of the
amd_kernel_code_t header rather than the actual function
entry point.
Also skip some of the comments for initialization
that only makes sense for kernels.
llvm-svn: 300736
This was reverted in r268740 because of problems with corresponding Clang change.
Clang change was updated and resubmitted in r274220.
Check calling convention in AMDGPUMachineFunction::isKernel
This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.
Also, in the future unused non-kernels may be optimized.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19917
llvm-svn: 274341
Summary:
Check calling convention in AMDGPUMachineFunction::isKernel
This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.
Also, in the future unused non-kernels may be optimized.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19917
llvm-svn: 268719