This enables the AMDGPU specific implementation of `printf` when
compiling for AMDGCN flavoured SPIR-V, the consequence being that the
expansion into ROCDL calls & friends gets expanded before "lowering" to
SPIR-V and gets carried through. The only relatively "novel" aspect is
that the `callAppendStringN` is simplified to take the type of the
passed in arguments, as opposed to querying them from the module. This
is a neutral change since the arguments were passed directly to the
call, without any attempt to cast them, hence the assumption that the
actual types match the formal ones was already baked in.
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
This is an alternative to currently existing hostcall implementation and uses printf buffer similar to OpenCL,
The data stored in the buffer (i.e the data frame) for each printf call are as follows,
1. Control DWord - contains info regarding stream, format string constness and size of data frame
2. Hash of the format string (if constant) else the format string itself
3. Printf arguments (each aligned to 8 byte boundary)
The format string Hash is generated using LLVM's MD5 Message-Digest Algorithm implementation and only low 64 bits are used.
The implementation still uses amdhsa metadata and hash is stored as part of format string itself to ensure
minimal changes in runtime.
Differential Revision: https://reviews.llvm.org/D150427
The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.
If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.
The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.
Reviewed By: jdoerfert, arsenm, kpyzhov
Differential Revision: https://reviews.llvm.org/D119216
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
The current way to detect hostcalls by looking for "ockl_hostcall_internal()" function in the module seems to be not reliable enough. The LTO may rename the "ockl_hostcall_internal()" function when an application is compiled with "-fgpu-rdc", and MetadataStreamer pass to fail to detect hostcalls, therefore it does not set the "hidden_hostcall_buffer" kernel argument.
This change adds a new module flag: hostcall that can be used to detect whether GPU functions use host calls for printf.
Differential revision: https://reviews.llvm.org/D110337
Summary:
This change implements the expansion in two parts:
- Add a utility function emitAMDGPUPrintfCall() in LLVM.
- Invoke the above function from Clang CodeGen, when processing a HIP
program for the AMDGPU target.
The printf expansion has undefined behaviour if the format string is
not a compile-time constant. As a sufficient condition, the HIP
ToolChain now emits -Werror=format-nonliteral.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D71365