This commit bulk updates all '.h', '.cl', '.inc', and '.cpp' files to
add any missing license headers.
The remaining files are generally CMake, SOURCES, scripts, markdown,
etc.
There are still some '.ll' files which may benefit from a license
header. I can't find an example of an LLVM IR file with a license header
in the rest of LLVM, but unlike most other (sub)projects, libclc has
examples of LLVM IR as source files, compiled and built into the
library.
Several users of (mostly math/) gentype.inc rely on types other than the
'gentype'. This is commonly intN as several maths builtins expose this
as a return or paramter type. We were previously explicitly defining
this type for every gentype.
Other implementations rely on integer types of the same size and element
width as the gentype, such as short/ushort for half, long/ulong for
double, etc.
Users might also rely on as_type or convert_type builtins to/from these
types.
The previous method we used to define intN was unscalable if we wanted
to expose more types and helpers.
This commit introduces a simpler system whereby several macros are
defined at the beginning of gentype.inc. These rely on concatenating
with the vector size. To facilitate this system, scalar gentypes now
define an empty vector size. It was previously undefined, which was
dangerous. An added benefit is that it matches how the integer
gentype.inc vector size has been working.
These macros will be especially helpful for the definitions of
logb/ilogb in an upcoming patch.
This commit is a broad update across libclc to use the CLC conversion
builtins in CLC functions, even those with a '__clc' prefix in the
generic folder. This better prepares them for an official move to the
CLC library in time.
The CLC conversion builtins have an additional benefit in that they
support scalars, unlike the __builtin_convertvector builtin which we
were using previously. This allows us to simplify some shared
definitions.
There is one change to the IR, in the scalar upsample(char, uchar)
builtin. It now sign-extends the first argument to i16, where before it
zero-extended it. This appears to be correct, and matches the vector
behaviour.
This commit moves the rotate builtin to the CLC library.
It also optimizes rotate(x, n) to generate the @llvm.fshl(x, x, n)
intrinsic, for both scalar and vector types. The previous implementation
was too cautious in its handling of the shift amount; the OpenCL rules
state that the shift amount is always treated as an unsigned value
modulo the bitwidth.
This commit moves the mad_sat builtin to the CLC library.
It also optimizes it for vector types by avoiding scalarization. To help
do this it transforms the previous control-flow code into vector select
code. This has also been done for the scalar versions for simplicity.
This commit moves over the OpenCL clz, hadd, mad24, mad_hi, mul24,
mul_hi, popcount, rhadd, and upsample builtins to the CLC library.
This commit also optimizes the vector forms of the mul_hi and upsample
builtins to consistently remain in vector types, instead of recursively
splitting vectors down to the scalar form.
The OpenCL mad_hi builtin wasn't previously publicly available from the
CLC libraries, as it was hash-defined to mul_hi in the header files.
That issue has been fixed, and mad_hi is now exposed.
The custom AMD implementation/workaround for popcount has been removed
as it was only required for clang < 7.
There are still two integer functions which haven't been moved over. The
OpenCL mad_sat builtin uses many of the other integer builtins, and
would benefit from optimization for vector types. That can take place in
a follow-up commit. The rotate builtin could similarly use some more
dedicated focus, potentially using clang builtins.
Using the `__builtin_elementwise_(add|sub)_sat` functions allows us to
directly optimize to the desired intrinsic, and avoid scalarization for
vector types.