llvm-project

Author	SHA1	Message	Date
Fraser Cormack	d12a8da1de	[libclc] Move min/max/clamp into the CLC builtins library (#114386) These functions are "shared" between integer and floating-point types, hence the directory name. They are used in several CLC internal functions such as __clc_ldexp. Note that clspv and spirv targets don't want to define these functions, so pre-processor macros replace calls to __clc_min with regular min, for example. This means they can use as much of the generic CLC source files as possible, but where CLC functions would usually call out to an external __clc_min symbol, they call out to an external min symbol. Then they opt out of defining __clc_min itself in their CLC builtins library. Preprocessor definitions for these targets have also been changed somewhat: what used to be CLC_SPIRV (the 32-bit target) is now CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or CLC_SPIRV64. Same goes for CLC_CLSPV. There are no differences (measured with llvm-diff) in any of the final builtins libraries for nvptx, amdgpu, or clspv. Neither are there differences in the SPIR-V targets' LLVM IR before it's actually lowered to SPIR-V.	2024-10-31 16:45:37 +00:00
Fraser Cormack	86974e15f5	[libclc] Restore header order, which formatting broke	2024-10-31 10:33:47 +00:00
Fraser Cormack	fba9f05ff7	[libclc] Format clc_ldexp.cl and clc_hypot.cl. NFC	2024-10-31 10:18:29 +00:00
Romaric Jodin	7e6a73959a	libclc: increase fp16 support (#98149) Increase fp16 support to allow clspv to continue to be OpenCL compliant following the update of the OpenCL-CTS adding more testing on math functions and conversions with half. Math functions are implemented by upscaling to fp32 and using the fp32 implementation. It guarantees the accuracy required for half-precision floating-point by the CTS.	2024-07-18 12:00:41 +01:00
Youngsuk Kim	e60b83a645	[libclc] Clarify condition expression (NFC) Closes #91188	2024-05-14 08:51:56 -05:00
luolent	a98a6e95be	Add clarifying parentheses around non-trivial conditions in ternary expressions. (#90391) Fixes [#85868](https://github.com/llvm/llvm-project/issues/85868) Parentheses are added as requested on ternary operators with non-trivial conditions. I used this [precedence table](https://en.cppreference.com/w/cpp/language/operator_precedence) for reference, to make sure we get the expected behavior on each change.	2024-05-04 18:38:45 +01:00
Romaric Jodin	9160f49e08	libclc: generic: add half implementation for erf/erfc (#66901) libclc does not have a half implementation for erf/erfc. Add one based on the float implementation by extending the input and truncating the output.	2024-01-09 16:47:53 +00:00
Matt Arsenault	4ddba3a706	libclc: Add parentheses to silence warning Fixes #59209	2022-12-29 18:19:55 -05:00
Daniel Stone	291bfff5db	libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA Reviewer: jenatali jvesely Differential Revision: https://reviews.llvm.org/D85910	2020-09-16 01:37:22 -04:00
Boris Brezillon	3a7051d9c2	libclc: Fix FP_ILOGBNAN definition Fix FP_ILOGBNAN definition to match the opencl-c-base.h one and guarantee that FP_ILOGBNAN and FP_ILOGB0 are different. Doing that implies fixing ilogb() implementation to return the right value. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed By: jvesely Differential Revision: https://reviews.llvm.org/D83473	2020-08-17 13:45:43 -07:00
Jan Vesely	efeafa1bda	libclc: Use acos implementation from amd_builtins Fixes acos CTS (1 thread, scalar) on AMD Turks. Reviewer: tstellar Differential Revision: https://reviews.llvm.org/D74011	2020-02-20 23:36:14 -05:00
Jan Vesely	4b23a2e8e9	libclc: Move rsqrt implementation to a .cl file Reviewer: awatry Differential Revision: https://reviews.llvm.org/D74013	2020-02-09 14:42:09 -05:00
Aaron Watry	64a8e1b83e	libclc/asin: Switch to amd builtins version of asin Fixes a wimpy-mode CTS failure for asin(float). Passes non-wimpy for both float/double on RX580. Signed-off-by: Aaron Watry <awatry@gmail.com> Tested-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2020-02-04 14:29:20 -05:00
Jan Vesely	5b136ca125	Move unary_instrinsic.inc to private headers. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356021	2019-03-13 07:06:19 +00:00
Jan Vesely	ee555aa992	trunc: Remove llvm intrinsic from the header. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356018	2019-03-13 07:06:10 +00:00
Jan Vesely	1c395b74bf	round: Remove llvm intrinsic from the header Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356017	2019-03-13 07:06:08 +00:00
Jan Vesely	b3d64e4a83	rint: Remove llvm intrinsic from the header. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356016	2019-03-13 07:06:06 +00:00
Jan Vesely	fd199f0139	floor: Remove llvm intrinsic from the header. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356015	2019-03-13 07:06:03 +00:00
Jan Vesely	fda15e56a6	fabs: Remove llvm intrinsic from the header. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356014	2019-03-13 07:06:00 +00:00
Jan Vesely	54eb4d3a6d	ceil: Remove llvm intrinsic from the header. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356013	2019-03-13 07:05:58 +00:00
Jan Vesely	82c6c846af	sqrt: Split function generation to a shared inc file. This will be reused by other unary functions. Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 356012	2019-03-13 07:05:56 +00:00
Jan Vesely	6e85e6309d	math/fma: Add fp32 software implementation Passes CTS on carrizo (when forced to use sw fma) and turks. Reviewer: Tom Stellard <tstellar@redhat.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 334226	2018-06-07 20:27:43 +00:00
Jan Vesely	70a270da5f	Add initial support for half precision builtins v2: fix fmax implementation use consistent checks for __CLC_FP_SIZE add missing TODOs fix whitespace in definitions.h v3: undef ZERO in modf.inc Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 332677	2018-05-17 22:55:30 +00:00
Jan Vesely	58fdb3b09a	rootn: Use denormal path only. It's OK to either flush to 0 or return denormal result if the device does not support denormals. See sec 7.2 and 7.5.3 of OCL specs. Use 0.0f explicitly instead of relying on GPU to flush it. Fixes CTS on carrizo and turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 332324	2018-05-15 04:22:43 +00:00
Jan Vesely	21e77037c0	remquo: Flush denormals if not supported It's OK to either flush to 0 or return denormal result if the device does not support denormals. See sec 7.2 and 7.5.3 of OCL specs. Fixes CTS on carrizo and turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry <awatry@gmail.com> llvm-svn: 331435	2018-05-03 05:44:28 +00:00
Jan Vesely	8db45e4cf1	remquo: Port from amd builtins double version passes on carrizo. float version fails on denormals. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry <awatry@gmail.com> llvm-svn: 331434	2018-05-03 05:44:26 +00:00
Jan Vesely	6146eda75d	math: Add helper function to flush denormals if not supported. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry <awatry@gmail.com> llvm-svn: 331433	2018-05-03 05:44:22 +00:00
Jan Vesely	e7d567ee0d	log10: Use sw implementation from amd builtins Add missing table. Fixes log10d CTS on carrizo. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 330649	2018-04-23 21:10:42 +00:00
Jan Vesely	96591b6202	powr: Use denormal path only. It's OK to either flush to 0 or return denormal result if the device does not support denormals. See sec 7.2 and 7.5.3 of OCL specs. Fixes CTS on carrizo and turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry <awatry@gmail.com> llvm-svn: 330207	2018-04-17 19:35:32 +00:00
Jan Vesely	4388d2883c	pown: Use denormal path only. It's OK to either flush to 0 or return denormal result if the device does not support denormals. See sec 7.2 and 7.5.3 of OCL specs. Fixes CTS on carrizo and turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry <awatry@gmail.com> llvm-svn: 330206	2018-04-17 19:35:30 +00:00
Jan Vesely	0d92f3047f	pow: Use denormal path only. It's OK to either flush to 0 or return denormal result if the device does not support denormals. See sec 7.2 and 7.5.3 of OCL specs. Fixes CTS on carrizo and turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry <awatry@gmail.com> llvm-svn: 330205	2018-04-17 19:35:28 +00:00
Jan Vesely	15c388cd79	exp10: Port from amd builtins Passes CTS on carrizo and turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed and Tested (on RX 580) by: Aaron Watry <awatry@gmail.com> llvm-svn: 330197	2018-04-17 18:08:08 +00:00
Jan Vesely	4be0339023	hypot: Port from amd builtins v2: Fix whitespace errors Use only subnormal path. Passes CTS on carrizo and turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry <awatry@gmail.com> llvm-svn: 329647	2018-04-10 00:11:58 +00:00
Jan Vesely	93af966747	fmod: Port from amd_builtins Uses only denormal path for fp32. Passes CTS on carrizo and turks. v2: whitespace fix Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewer: Aaron Watry <awatry@gmail.com> llvm-svn: 329433	2018-04-06 17:43:08 +00:00
Jan Vesely	5b10494fa8	remainder: Port from amd builtins Mostly ported from amd_builtins, uses only denormal path for fp32. Passes CTS on carrizo and turks Reviewer: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 327818	2018-03-19 01:01:10 +00:00
Jan Vesely	b672f7a251	nan: Implement Passes CTS on carrizo and turks Reviewer: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 327324	2018-03-12 19:46:52 +00:00
Jan Vesely	c15a48dd9c	lgamma_r: Move code from .inc to .cl file Reviewed-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 326821	2018-03-06 17:48:47 +00:00
Jan Vesely	2dcb382efc	frexp: Reuse types provided by gentype.inc v2: Use select instead of bitselect to consolidate scalar and vector versions Passes CTS on Carrizo Reviewed-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 326820	2018-03-06 17:48:45 +00:00
Jan Vesely	44f21978a2	minmag: Condition variable needs to be the same bitwidth as operands No changes wrt CTS Reviewed-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 326818	2018-03-06 17:48:40 +00:00
Jan Vesely	4e72300929	maxmag: Condition variable needs to be the same bitwidth as operands No changes wrt CTS Reviewed-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 326817	2018-03-06 17:48:38 +00:00
Jan Vesely	dbaf6d0f7c	Move cl_khr_fp64 extension enablement to gentype include lists. This will make adding cl_khr_fp16 support easier. Reviewed-by: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 326816	2018-03-06 17:48:35 +00:00
Jan Vesely	3b8b4eb64d	half_powr: Implement using powr v2: Use full precision implementation Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 323942	2018-02-01 03:00:35 +00:00
Jan Vesely	a75677c2b7	math.h: Use logical operations instead of bit operations for readability Trivial. Reported-by: Roman Lebedev <lebedev.ri@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 323920	2018-01-31 21:53:42 +00:00
Jan Vesely	0ecb5e511e	math.h: Set HAVE_HW_FMA32 based on compiler provided macro Fixes sin/cos piglits on non-FMA capable asics. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35983 Reviewer: Tom Stellard Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 323677	2018-01-29 19:05:08 +00:00
Jan Vesely	7013857f95	tanpi: Port from amd_builtins Passes piglit on turks and carrizo. Passes CTS on carrizo. Acked-By: Aaron Watry <awatry@gmail.com> Tested-By: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322980	2018-01-19 18:57:22 +00:00
Jan Vesely	03937bdec3	tan: Port from amd_builtins v2: fixup constant precision Passes piglit on turks and carrizo. Passes CTS on carrizo Fixes half_tan to pass CTS on carrizo Acked-By: Aaron Watry <awatry@gmail.com> Tested-By: Aaron Watry <awatry@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322979	2018-01-19 18:57:19 +00:00
Jan Vesely	44e0522c09	half_divide: Implement using x/y Passes CTS on carrizo v2: Use full precision implementation Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322899	2018-01-18 21:12:06 +00:00
Jan Vesely	2813b4f8d9	half_tan: Implement using tan v2: Use full precision implementation Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322898	2018-01-18 21:12:04 +00:00
Jan Vesely	bf38fae8de	half_sin: Implement using sin Passes CTS on carrizo v2: Use full precision implementation Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322897	2018-01-18 21:12:01 +00:00
Jan Vesely	398108b91e	half_recip: Implement using 1/x Passes CTS on carrizo v2: Use full precision implementation Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322896	2018-01-18 21:11:58 +00:00

156 Commits