These functions are "shared" between integer and floating-point types,
hence the directory name. They are used in several CLC internal
functions such as __clc_ldexp.
Note that clspv and spirv targets don't want to define these functions,
so pre-processor macros replace calls to __clc_min with regular min, for
example. This means they can use as much of the generic CLC source files
as possible, but where CLC functions would usually call out to an
external __clc_min symbol, they call out to an external min symbol. Then
they opt out of defining __clc_min itself in their CLC builtins library.
Preprocessor definitions for these targets have also been changed
somewhat: what used to be CLC_SPIRV (the 32-bit target) is now
CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or
CLC_SPIRV64. Same goes for CLC_CLSPV.
There are no differences (measured with llvm-diff) in any of the final
builtins libraries for nvptx, amdgpu, or clspv. Neither are there
differences in the SPIR-V targets' LLVM IR before it's actually lowered
to SPIR-V.
Increase fp16 support to allow clspv to continue to be OpenCL compliant
following the update of the OpenCL-CTS adding more testing on math
functions and conversions with half.
Math functions are implemented by upscaling to fp32 and using the fp32
implementation. It garantees the accuracy required for half-precision
float-point by the CTS.
Fix FP_ILOGBNAN definition to match the opencl-c-base.h one and
guarantee that FP_ILOGBNAN and FP_ILOGB0 are different. Doing that
implies fixing ilogb() implementation to return the right value.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed By: jvesely
Differential Revision: https://reviews.llvm.org/D83473
Fixes a wimpy-mode CTS failure for asin(float).
Passes non-wimpy for both float/double on RX580.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Passes CTS on carrizo (when forced to use sw fma) and turks.
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 334226
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Use 0.0f explicitly intead of relying on GPU to flush it.
Fixes CTS on carrizo and turks
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 332324
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs.
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 331435
double version passes on carrizo. float version fails on denormals.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 331434
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 330207
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 330206
It's OK to either flush to 0 or return denormal result if the device
does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
Fixes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 330205
Passes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed and Tested (on RX 580) by: Aaron Watry <awatry@gmail.com>
llvm-svn: 330197
v2: Fix whitespace errors
Use only subnormal path.
Passes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 329647
Uses only denormal path for fp32.
Passes CTS on carrizo and turks.
v2: whitespace fix
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 329433
Mostly ported from amd_builtins, uses only denormal path for fp32.
Passes CTS on carrizo and turks
Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327818
v2: Use select instead of bitselect to consolidate scalar and vector
versions
Passes CTS on Carrizo
Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326820
This will make adding cl_khr_fp16 support easier
Reviewed-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 326816
Passes CTS on carrizo
v2: Use full precision implementation
Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322899
Passes CTS on carrizo
v2: Use full precision implementation
Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322897
Passes CTS on carrizo
v2: Use full precision implementation
Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 322896