20 Commits

Author SHA1 Message Date
Fraser Cormack
9705500582
[libclc] Move nextafter to the CLC library (#124097)
There were two implementations of this - one that implemented nextafter
in software, and another that called a clang builtin. No in-tree targets
called the builtin, so all targets build the software version. The
builtin version has been removed, and the software version has been
renamed to be the "default".

This commit also optimizes nextafter, to avoid scalarization as much as
possible. Note however that the (CLC) relational builtins still
scalarize; those will be optimized in a separate commit.

Since nextafter is used by some convert_type builtins, the diff to IR
codegen is not limited to the builtin itself.
2025-01-23 12:24:16 +00:00
Fraser Cormack
d2d1b5897e
[libclc] Move clcmacro.h to CLC library. NFC (#114845) 2024-11-04 22:00:01 +00:00
Jan Vesely
70a270da5f Add initial support for half precision builtins
v2: fix fmax implementation
    use consistent checks for __CLC_FP_SIZE
    add missing TODOs
    fix whitespace in definitions.h
v3: undef ZERO in modf.inc

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 332677
2018-05-17 22:55:30 +00:00
Jan Vesely
b424954682 amdgpu/half_recip: Switch implementation to native_recip
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325061
2018-02-13 22:09:46 +00:00
Jan Vesely
ed28c4458a amdgpu/half_log2: Switch implementation to native_log2
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325060
2018-02-13 22:09:44 +00:00
Jan Vesely
86cbf56a4b amdgpu/half_log10: Switch implementation to native_log10
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325059
2018-02-13 22:09:42 +00:00
Jan Vesely
65fd65efbf amdgpu/half_log: Switch implementation to native_log
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325058
2018-02-13 22:09:41 +00:00
Jan Vesely
2d3b6dfdca amdgpu/half_exp2: Switch implementation to native_exp2
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325057
2018-02-13 22:09:38 +00:00
Jan Vesely
021264c75a amdgpu/half_exp10: Switch implementation to native_exp10
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325056
2018-02-13 22:09:37 +00:00
Jan Vesely
4879dd7471 amdgpu/half_exp: Switch implementation to native_exp
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325055
2018-02-13 22:09:35 +00:00
Jan Vesely
bca92445ba amdgpu/half_sqrt: Switch implementation to native_sqrt
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325054
2018-02-13 22:09:33 +00:00
Jan Vesely
aad28681c2 amdgpu/half_rsqrt: Switch implementation to native_rsqrt
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325053
2018-02-13 22:09:31 +00:00
Jan Vesely
8dc6e98d47 amdgpu: Add workaround for unimplemented llvm.exp intrinsic
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317935
2017-11-10 22:16:25 +00:00
Jan Vesely
47e093da9b math: Implement native_log10
Use llvm intrinsic by default
Provide amdgpu workaround

v2: drop old amd copyrights

Reviewer: Aaron Watry
Reviewed-by: Vedran Miletić <vedran@miletic.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 316588
2017-10-25 16:49:22 +00:00
Jan Vesely
9fedbb9d8e amdgpu/math: Don't use llvm intrinsic for native_log
AMDGPU targets don't have an instruction for it,
so it'll be expanded to C * log2 anyway.

v2: use native_log2 instead of the more precise sw implementation
v3: move to amdgpu
v4: drop old AMD copyright

Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 316587
2017-10-25 16:49:17 +00:00
Jan Vesely
1de1444d62 Do not include clc_nextafter header globally
Drop unused clc/math/clc_nextafter.h header

Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 315190
2017-10-08 19:33:58 +00:00
Matt Arsenault
fbfd828d2a Replace nextafter implementation
This one passes conformance.

llvm-svn: 280961
2016-09-08 16:37:56 +00:00
Matt Arsenault
633d749da7 amdgpu: Use right builtin for rsq
The r600 path has never actually worked since double is not implemented
there.

llvm-svn: 276009
2016-07-19 19:02:01 +00:00
Matt Arsenault
b456c6dd56 Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp
It didn't really work on r600 to begin with, which should
get its own intrinsic.

llvm-svn: 275813
2016-07-18 16:42:50 +00:00
Matt Arsenault
a48e15c6cb Split sources for amdgcn and r600
Most files remain in a common amdgpu directory.

Also switches barriers to use convergent,
and use llvm.amdgcn.s.barrier.

This now requires 3.9/trunk to build amdgcn.

llvm-svn: 260777
2016-02-13 01:01:59 +00:00