llvm-project

Author	SHA1	Message	Date
Jan Vesely	a95db14461	half_rsqrt: Cleanup implementation Passes CTS on carrizo v2: Use full precision implementation Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322887	2018-01-18 21:11:35 +00:00
Jan Vesely	fe8e00bc3c	rootn: Port from amd_builtins Passes piglit on turks and carrizo fp64 passes CTS on carrizo v2: fix formatting check fp32 denormal support at runtime Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322763	2018-01-17 21:22:14 +00:00
Jan Vesely	c45ec604f5	powr: Port from amd_builtins Passes piglit on turks and carrizo fp64 passes cts on carrizo v2: fix formatting check fp32 denormal support at runtime Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322762	2018-01-17 21:22:06 +00:00
Jan Vesely	5efc8fe321	pown: Port from amd_builtins Passes piglit on turks and carrizo fp64 passes CTS on carrizo v2: fix formatting check fp32 denormal support at runtime Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322761	2018-01-17 21:22:03 +00:00
Jan Vesely	cc5c65b2c2	pow: Port from amd_builtins Passes piglit on turks and carrizo fp64 passes CTS on carrizo v2: fix formatting check fp32 denormal support at runtime Reviewer: Jeroen Ketema <j.ketema@xs4all.nl> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 322760	2018-01-17 21:21:35 +00:00
Jan Vesely	fe7c045753	math: Implement minmag Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318265	2017-11-15 04:10:39 +00:00
Jan Vesely	7ba243cc3d	math: Implement maxmag Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318264	2017-11-15 04:10:37 +00:00
Jan Vesely	383fbd050c	native_powr: Switch implementation to native_exp2 and native_log2 v2: don't use assume check only for x<0, the other conditions are handled transparently v3: don't check inputs at all, nan propagation works as expected Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318204	2017-11-14 21:55:41 +00:00
Jan Vesely	f38b40daf7	native_divide: provide function implementation instead of macro Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318067	2017-11-13 18:28:56 +00:00
Jan Vesely	1b9825f982	native_recip: provide function implementation instead of macro Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318066	2017-11-13 18:28:53 +00:00
Jan Vesely	a6758c94ef	native_rsqrt: Switch implementation to 1 / native_sqrt Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318065	2017-11-13 18:28:51 +00:00
Jan Vesely	541a3f0758	native_tan: Switch implementation to use native_sin/native_cos Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318064	2017-11-13 18:28:48 +00:00
Jan Vesely	79b7566210	math: Use precomputed constant for log2(10.0) exp10 CTS fails with or without this change Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318063	2017-11-13 18:28:45 +00:00
Jan Vesely	6b4a625438	native_exp10: Switch implementation to llvm intrinsic v2: Use native_log2 instead of wrong constant Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317941	2017-11-10 22:16:41 +00:00
Jan Vesely	4301e6d0c9	native_sqrt: Switch implementation to llvm intrinsic Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317940	2017-11-10 22:16:39 +00:00
Jan Vesely	1f34c851e0	native_sin: Switch implementation to llvm intrinsic Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317939	2017-11-10 22:16:36 +00:00
Jan Vesely	0750b7df51	native_cos: Switch implementation to llvm intrinsic Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317938	2017-11-10 22:16:33 +00:00
Jan Vesely	edbde58de0	native_exp2: Switch implementation to llvm intrinsic Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317937	2017-11-10 22:16:31 +00:00
Jan Vesely	504f85c551	native_exp: Switch implementation to llvm intrinsic Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317936	2017-11-10 22:16:28 +00:00
Jan Vesely	adc1eaedf8	native_log10: Switch to generic native intrinsic inc file Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317934	2017-11-10 22:16:22 +00:00
Jan Vesely	086e796053	native_log: Switch to generic native intrinsic inc file Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317933	2017-11-10 22:16:20 +00:00
Jan Vesely	f58dee9f3a	native_log2: Switch to generic native intrinsic inc file v2: Add __CLC_XCONCAT instead of function name redirection Use __CLC_XCONCAT for intrinsic functions as well Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317932	2017-11-10 22:16:15 +00:00
Jan Vesely	47e093da9b	math: Implement native_log10 Use llvm intrinsic by default Provide amdgpu workaround v2: drop old amd copyrights Reviewer: Aaron Watry Reviewed-by: Vedran Miletić <vedran@miletic.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 316588	2017-10-25 16:49:22 +00:00
Jan Vesely	9f7172965c	math: Implement sinh function mostly copied from amd_builtins llvm-svn: 296233	2017-02-25 02:46:53 +00:00
Aaron Watry	c606efabb7	math: Add logb builtin Ported from the amd-builtins branch. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 292335	2017-01-18 03:14:10 +00:00
Aaron Watry	900bd7eb7f	math: Add expm1 builtin function Ported from the amd-builtins branch. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 292334	2017-01-18 03:13:37 +00:00
Aaron Watry	af569547fa	math: Implement tgamma Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281566	2016-09-15 00:17:34 +00:00
Aaron Watry	e9009cdd21	math: Implement lgamma Just use lgamma_r and ignore the value returned in the second argument Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281565	2016-09-15 00:17:31 +00:00
Aaron Watry	0ab07e1bde	math: Implement lgamma_r Ported from the amd-builtins branch, which is itself based on the Sun Microsystems implementation. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281564	2016-09-15 00:17:28 +00:00
Matt Arsenault	fbfd828d2a	Replace nextafter implementation This one passes conformance. llvm-svn: 280961	2016-09-08 16:37:56 +00:00
Tom Stellard	d835b3f1af	Implement cbrt builtin This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 276497	2016-07-22 23:45:15 +00:00
Tom Stellard	9cb070f96a	Implement cosh builtin This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 276496	2016-07-22 23:45:13 +00:00
Jan Vesely	973c1fa5f5	math: Use single precision fmax in sp path Fixes fdim piglit on Turks v2: use CL fmax instead of __builtin Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom.stellard@amd.com> llvm-svn: 269807	2016-05-17 19:44:01 +00:00
Jan Vesely	c374cb76f4	math: Add erf ported from amd-builtins The scalar float/double function bodies are a direct copy/paste, aside from the removed (optional) code in float function body that requires subnormals. reviewers: jvesely Patch by: Vedran Miletić <rivanvx@gmail.com> llvm-svn: 268766	2016-05-06 18:02:30 +00:00
Aaron Watry	55a8e0fd6d	math: Add fdim implementation Based on the amd-builtin, but explicitly vectorized for all sizes (not just float4), and includes a vectorized double implementation. Passes piglit (float) tests on pitcairn. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 268708	2016-05-06 03:34:45 +00:00
Aaron Watry	09f3c99a86	math: Fix ilogb(double) return type Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 261714	2016-02-24 00:52:15 +00:00
Aaron Watry	d6d0454231	math: Add ilogb ported from amd-builtins The scalar float/double function bodies are a direct copy/paste with usage of the CLC wrappers to vectorize them. This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which are equal to the results of ilogb(0.0f) and ilogb(float nan) respectively. v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 261639	2016-02-23 14:43:09 +00:00
Jan Vesely	7fbb96b907	math: Fix log2 vectorization on non-fp64 hw reviewer: tstellard Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 260301	2016-02-09 22:17:42 +00:00
Aaron Watry	8872800eff	math: Add frexp ported from amd-builtins The float implementation is almost a direct port from the amd-builtins, but instead of just having a scalar and float4 implementation, it has a scalar and arbitrary width vector implementation. The double scalar is also a direct port from AMD's builtin release. The double vector implementation copies the logic in the float vector implementation using the values from the double scalar version. Both have been tested in piglit using tests sent to that project's mailing list. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 260114	2016-02-08 17:07:21 +00:00
Tom Stellard	37d19875fa	Implement modf math builtin V2: use the reference implementation as suggested by Matt Arsenault Patch By: Pavel Ondračka llvm-svn: 258933	2016-01-27 14:52:10 +00:00
Niels Ole Salscheider	f51df5ba8c	Implement tanh builtin This is a port from the AMD builtin library. llvm-svn: 248780	2015-09-29 06:39:09 +00:00
Tom Stellard	7a09e88b6e	Fix double implementation of log We need to use M_LOG2E instead of M_LOG2E_F. llvm-svn: 243132	2015-07-24 18:07:14 +00:00
Tom Stellard	44b6117dfd	Implement accurate log2 function Use the implementation ported from the AMD builtin library rather than LLVM Intrinsics. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 243131	2015-07-24 18:07:12 +00:00
Tom Stellard	f01ffa9ddc	Use llvm intrinsics for native_log and native_log2 llvm-svn: 243130	2015-07-24 18:07:06 +00:00
Tom Stellard	2ef5ec6b2b	Fix implementation of sqrt v2 Passing values less than 0 to the llvm.sqrt() intrinsic results in undefined behavior, so we need to check the input and return NaN if it is less than 0. v2: - Fix build failures. llvm-svn: 241906	2015-07-10 13:37:07 +00:00
Tom Stellard	a64bad8338	Use a more accurate implementation for exp Using exp2(x * M_LOG2E_F) does not give us accurate enough results for OpenCL. If you look at the new exp implementation you'll see that it does multiply the input by M_LOG2E_F, but it still uses the original input in part of the calculation. This exp implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237229	2015-05-13 03:55:09 +00:00
Tom Stellard	d538fdc217	Implement exp2 using OpenCL C rather than using an intrinsic Not all targets support the intrinsic, so it's better to have a generic implementation which does not use it. This exp2 implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237228	2015-05-13 03:55:07 +00:00
Tom Stellard	4294541290	Implement sin for double types This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237155	2015-05-12 17:18:47 +00:00
Tom Stellard	2e6ff0c66e	Implement cos for double types This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237154	2015-05-12 17:18:46 +00:00
Tom Stellard	37406a209c	Implement atan2pi builtin This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237138	2015-05-12 14:48:26 +00:00

148 Commits