433 Commits

Author SHA1 Message Date
Fraser Cormack
06789ccb16
[libclc] Optimize ceil/fabs/floor/rint/trunc (#119596)
These functions all map to the corresponding LLVM intrinsics, but the
vector intrinsics weren't being generated. The intrinsic mapping from
CLC vector function to vector intrinsic was working correctly, but the
mapping from OpenCL builtin to CLC function was suboptimally recursively
splitting vectors in halves.

For example, with this change, `ceil(float16)` calls `llvm.ceil.v16f32`
directly once optimizations are applied.

Now also, instead of generating LLVM intrinsics through `__asm` we now
call clang elementwise builtins for each CLC builtin. This should be a
more standard way of achieving the same result

The CLC versions of each of these builtins are also now built and
enabled for SPIR-V targets. The LLVM -> SPIR-V translator maps the
intrinsics to the appropriate OpExtInst, so there should be no
difference in semantics, despite the newly introduced indirection from
OpenCL builtin through the CLC builtin to the intrinsic.

The AMDGPU targets make use of the same `_CLC_DEFINE_UNARY_BUILTIN`
macro to override `sqrt`, so those functions also appear more optimal
with this change, calling the vector `llvm.sqrt.vXf32` intrinsics
directly.
2024-12-13 08:47:13 +00:00
Fraser Cormack
a55248789e
[libclc] Avoid using undefined vector3 components (#115857)
Using '.hi' on a vector3 is technically allowed by the spec and is
treated as a 4-element vector with an "undefined" w component. However,
it's more undef/poison code for the compiler to process and remove. We
can easily avoid it with a dedicated macro.
2024-11-12 16:23:52 +00:00
Fraser Cormack
0d2ef7af19
[libclc] Use builtin_convertvector to convert between vector types (#115865)
This keeps values in vectors, rather than scalarizing them and then
reconstituting the vector. The builtin is identical to performing a
C-style cast on each element, which is what we were doing by recursively
splitting the vector down to calling the "base" conversion function on
each element.
2024-11-12 16:18:33 +00:00
Fraser Cormack
6ca50a2593 [libclc] Correct use of CLC macro on two definitions
_CLC_DECL is for declarations and _CLC_DEF for definitions, as the names
imply.

No change to any bitcode module.
2024-11-07 17:47:52 +00:00
Fraser Cormack
b231647475
[libclc] Move relational functions to the CLC library (#115171)
The OpenCL relational functions now call their CLC counterparts, and the
CLC relational functions are defined identically to how the OpenCL
functions were defined.

As usual, clspv and spir-v targets bypass these.

No observable changes to any libclc target (measured with llvm-diff).
2024-11-06 19:28:44 +00:00
Fraser Cormack
b4263ddbe7 [libclc] Use __clc_max in CLC functions 2024-11-06 09:16:36 +00:00
Fraser Cormack
7be30fd533 [libclc] Move abs/abs_diff to CLC library 2024-11-06 09:16:35 +00:00
Fraser Cormack
d2d1b5897e
[libclc] Move clcmacro.h to CLC library. NFC (#114845) 2024-11-04 22:00:01 +00:00
Fraser Cormack
293c78ba0a
[libclc] Move ceil/fabs/floor/rint/trunc to CLC library (#114774)
These functions are all mapped to LLVM intrinsics.

The clspv and spirv targets don't declare or define any of these CLC
functions, and instead map these to their corresponding OpenCL symbols.
2024-11-04 16:35:14 +00:00
Fraser Cormack
b4ef43fc75 [libclc] Format clc_fma.cl. NFC 2024-11-04 11:55:42 +00:00
Fraser Cormack
e28d7f7134 [libclc] Format clc_tan.cl. NFC 2024-11-04 10:52:46 +00:00
Fraser Cormack
f1888e4029 [libclc] Add some include guards and format a file 2024-11-04 10:37:11 +00:00
Fraser Cormack
d12a8da1de
[libclc] Move min/max/clamp into the CLC builtins library (#114386)
These functions are "shared" between integer and floating-point types,
hence the directory name. They are used in several CLC internal
functions such as __clc_ldexp.

Note that clspv and spirv targets don't want to define these functions,
so pre-processor macros replace calls to __clc_min with regular min, for
example. This means they can use as much of the generic CLC source files
as possible, but where CLC functions would usually call out to an
external __clc_min symbol, they call out to an external min symbol. Then
they opt out of defining __clc_min itself in their CLC builtins library.

Preprocessor definitions for these targets have also been changed
somewhat: what used to be CLC_SPIRV (the 32-bit target) is now
CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or
CLC_SPIRV64. Same goes for CLC_CLSPV.

There are no differences (measured with llvm-diff) in any of the final
builtins libraries for nvptx, amdgpu, or clspv. Neither are there
differences in the SPIR-V targets' LLVM IR before it's actually lowered
to SPIR-V.
2024-10-31 16:45:37 +00:00
Fraser Cormack
86974e15f5 [libclc] Restore header order, which formatting broke 2024-10-31 10:33:47 +00:00
Fraser Cormack
fba9f05ff7 [libclc] Format clc_ldexp.cl and clc_hypot.cl. NFC 2024-10-31 10:18:29 +00:00
Fraser Cormack
b2bdd8bd39 [libclc] Create an internal 'clc' builtins library
Some libclc builtins currently use internal builtins prefixed with
'__clc_' for various reasons, e.g., to avoid naming clashes.

This commit formalizes this concept by starting to isolate the
definitions of these internal clc builtins into a separate
self-contained bytecode library, which is linked into each target's
libclc OpenCL builtins before optimization takes place.

The goal of this step is to allow additional libraries of builtins
that provide entry points (or bindings) that are not written in OpenCL C
but still wish to expose OpenCL-compatible builtins. By moving the
implementations into a separate self-contained library, entry points can
share as much code as possible without going through OpenCL C.

The overall structure of the internal clc library is similar to the
current OpenCL structure, with SOURCES files and targets being able to
override the definitions of builtins as needed. The idea is that the
OpenCL builtins will begin to need fewer target-specific overrides, as
those will slowly move over to the clc builtins instead.

Another advantage of having a separate bytecode library with the CLC
implementations is that we can internalize the symbols when linking it
(separately), whereas currently the CLC symbols make it into the final
builtins library (and perhaps even the final compiled binary).

This patch starts of with 'dot' as it's relatively self-contained, as
opposed to most of the maths builtins which tend to pull in other
builtins.

We can also start to clang-format the builtins as we go, which should
help to modernize the codebase.
2024-10-29 13:09:56 +00:00
Romaric Jodin
46223b5eae
libclc: add half version of 'sign' (#99841) 2024-07-22 11:08:56 +01:00
Romaric Jodin
d9cb65ff48
libclc: fix convert with half (#99481)
Fix following update of libclc introducing more fp16 support:
7e6a73959a
2024-07-18 15:28:58 +02:00
Romaric Jodin
7e6a73959a
libclc: increase fp16 support (#98149)
Increase fp16 support to allow clspv to continue to be OpenCL compliant
following the update of the OpenCL-CTS adding more testing on math
functions and conversions with half.

Math functions are implemented by upscaling to fp32 and using the fp32
implementation. It garantees the accuracy required for half-precision
float-point by the CTS.
2024-07-18 12:00:41 +01:00
Romaric Jodin
932ca85680
libclc: remove __attribute__((assume)) for clspv targets (#92126)
Instead add a proper attribute in clang, and add convert it to function
metadata to keep the information in the IR. The goal is to remove the
dependency on __attribute__((assume)) that should have not be there in
the first place.

Ref https://github.com/llvm/llvm-project/pull/84934
2024-05-17 06:13:32 -07:00
Youngsuk Kim
e60b83a645 [libclc] Clarify condition expression (NFC)
Closes #91188
2024-05-14 08:51:56 -05:00
luolent
a98a6e95be
Add clarifying parenthesis around non-trivial conditions in ternary expressions. (#90391)
Fixes [#85868](https://github.com/llvm/llvm-project/issues/85868)

Parenthesis are added as requested on ternary operators with non trivial conditions.

I used this [precedence table](https://en.cppreference.com/w/cpp/language/operator_precedence) for reference, to make sure we get the expected behavior on each change.
2024-05-04 18:38:45 +01:00
Fraser Cormack
72f9881c3f
[libclc] Refactor build system to allow in-tree builds (#87622)
The previous build system was adding custom "OpenCL" and "LLVM IR"
languages in CMake to build the builtin libraries. This was making it
harder to build in-tree because the tool binaries needed to be present
at configure time.

This commit refactors the build system to use custom commands to build
the bytecode files one by one, and link them all together into the final
bytecode library. It also enables in-tree builds by aliasing the
clang/llvm-link/etc. tool targets to internal targets, which are
imported from the LLVM installation directory when building out of tree.

Diffing (with llvm-diff) all of the final bytecode libraries in an
out-of-tree configuration against those built using the current tip
system shows no changes. Note that there are textual changes to metadata
IDs which confuse regular diff, and that llvm-diff 14 and below may show
false-positives.

This commit also removes a file listed in one of the SOURCEs which
didn't exist and which was preventing the use of
ENABLE_RUNTIME_SUBNORMAL when configuring CMake.
2024-04-11 17:09:07 +01:00
Romaric Jodin
b6193a2dc2
libclc: clspv: update gen_convert.cl for clspv (#66902)
Add a clspv switch in gen_convert.cl
This is needed as Vulkan SPIR-V does not respect the assumptions
needed to have the generic convert.cl compliant on many platforms.

It is needed because of the conversion of TYPE_MAX and
TYPE_MIN. Depending on the platform the behaviour can vary, but most
of them just do not convert correctly those 2 values.

Because of that, we also need to avoid having explicit function for
simple conversions because it allows llvm to optimise the code, thus
removing some of the added checks that are in fact needed.
2024-03-14 19:56:34 +00:00
Romaric Jodin
9160f49e08
libclc: generic: add half implementation for erf/erfc (#66901)
libclc does not have a half implementation for erf/erfc
Add one based on the float implementation by extending the input and
truncating the output.
2024-01-09 16:47:53 +00:00
Fraser Cormack
37a3de1e2e libclc: Fix signed integer underflow in abs_diff
We noticed this same issue in our own implementation of abs_diff, and
the same issue also came up in the abs_diff reference function in the
OpenCL CTS.

Reviewed By: rjodinchr

Differential Revision: https://reviews.llvm.org/D159275
2023-08-31 14:28:16 +01:00
Tobias Hieta
f98ee40f4b
[NFC][Py Reformat] Reformat python files in the rest of the dirs
This is an ongoing series of commits that are reformatting our
Python code. This catches the last of the python files to
reformat. Since they where so few I bunched them together.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: jhenderson, #libc, Mordante, sivachandra

Differential Revision: https://reviews.llvm.org/D150784
2023-05-25 11:17:05 +02:00
Kévin Petit
21508fa769 libclc: clspv: fix fma, add vstore and fix inlining issues
https://reviews.llvm.org/D147773

Patch by Romaric Jodin <rjodin@google.com>
2023-05-09 16:52:13 +01:00
Kévin Petit
1da2085a51 libclc: add clspv to targets exempt from alwaysinline
https://reviews.llvm.org/D132362

Patch by: Aaron Greig <aaron.greig@codeplay.com>
2023-02-14 18:26:42 +00:00
Matt Arsenault
4ddba3a706 libclc: Add parentheses to silence warning
Fixes #59209
2022-12-29 18:19:55 -05:00
Daniel Stone
59510c4212 libclc: Fix rounding during type conversion
The rounding during type conversion uses multiple conversions, selecting
between them to try to discover if rounding occurred. This appears to
not have been tested, since it would generate code of the form:
    float convert_float_rtp(char x)
    {
      float r = convert_float(x);
      char y = convert_char(y);
      [...]
    }

which will access uninitialised data. The idea appears to have been to
have done a char -> float -> char roundtrip in order to discover the
rounding, so do this.

Discovered by inspection.

Signed-off-by: Daniel Stone <daniels@collabora.com>

Reviewed By: jvesely

Differential Revision: https://reviews.llvm.org/D81999
2021-08-19 22:24:19 -07:00
Aaron Puchert
1c1a810558 libclc: Use find_package to find Python 3 and require it
The script's shebang wants Python 3, so we use FindPython3. The
original code didn't work when an unversioned python was not available.
This is explicitly allowed in PEP 394. ("Distributors may choose to set
the behavior of the python command as follows: python2, python3, not
provide python command, allow python to be configurable by an end user
or a system administrator.")

Also I think it's actually required, so let the configuration fail if we
can't find it.

Lastly remove the shebang, since the script is only run via interpreter
and doesn't have the executable bit set anyway.

Reviewed By: jvesely

Differential Revision: https://reviews.llvm.org/D88366
2020-10-01 22:31:33 +02:00
Daniel Stone
291bfff5db libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA
Reviewer: jenatali jvesely
Differential Revision: https://reviews.llvm.org/D85910
2020-09-16 01:37:22 -04:00
Dave Airlie
c37145cab1 libclc: Add Mesa/SPIR-V target
Add targets to emit SPIR-V targeted to Mesa's OpenCL support, using
SPIR-V 1.1.

Substantially based on Dave Airlie's earlier work.

libclc: spirv: remove step/smoothstep apis not defined for SPIR-V

libclc: disable inlines for SPIR-V builds

Reviewed By: jvesely, tstellar, jenatali

Differential Revision: https://reviews.llvm.org/D77589
2020-08-17 14:01:46 -07:00
Daniel Stone
3d21fa56f5 libclc: Make all built-ins overloadable
The SPIR spec states that all OpenCL built-in functions should be
overloadable and mangled, to ensure consistency.

Add the overload attribute to functions which were missing them:
work dimensions, memory barriers and fences, and events.

Reviewed By: tstellar, jenatali

Differential Revision: https://reviews.llvm.org/D82078
2020-08-17 13:55:48 -07:00
Boris Brezillon
3a7051d9c2 libclc: Fix FP_ILOGBNAN definition
Fix FP_ILOGBNAN definition to match the opencl-c-base.h one and
guarantee that FP_ILOGBNAN and FP_ILOGB0 are different. Doing that
implies fixing ilogb() implementation to return the right value.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Reviewed By: jvesely

Differential Revision: https://reviews.llvm.org/D83473
2020-08-17 13:45:43 -07:00
Jan Vesely
efeafa1bda libclc: Use acos implementation from amd_builtins
Fixes acos CTS (1 thread, scalar) on AMD Turks.
Reviewer: tstellar
Differential Revision: https://reviews.llvm.org/D74011
2020-02-20 23:36:14 -05:00
Jan Vesely
4b23a2e8e9 libclc: Move rsqrt implementation to a .cl file
Reviewer: awatry
Differential Revision: https://reviews.llvm.org/D74013
2020-02-09 14:42:09 -05:00
Aaron Watry
64a8e1b83e libclc/asin: Switch to amd builtins version of asin
Fixes a wimpy-mode CTS failure for asin(float).

Passes non-wimpy for both float/double on RX580.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2020-02-04 14:29:20 -05:00
Jan Vesely
4a725996e5 sincos: Simplify declaration headers.
This follows the same pattern as modf and fract.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 356028
2019-03-13 07:13:34 +00:00
Jan Vesely
e7c0c37a31 fdim: Use binary_decl_tt.inc instead of custom inc file.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356027
2019-03-13 07:13:32 +00:00
Jan Vesely
5b0600c277 nextafter: Use binary_decl_tt.inc instead of custom inc file.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356026
2019-03-13 07:13:30 +00:00
Jan Vesely
e438b58cd0 copysign: Use binary_decl_tt.inc instead of custom inc file.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356025
2019-03-13 07:13:28 +00:00
Jan Vesely
81bc9ee81c atan2pi: Use binary_decl_tt.inc instead of custom inc file.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356024
2019-03-13 07:13:26 +00:00
Jan Vesely
9526e02021 atan2: Use binary_decl_tt.inc instead of custom inc file.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356023
2019-03-13 07:13:24 +00:00
Jan Vesely
8985c9c212 hypot: Use binary_decl_tt.inc instead of custom inc file
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356022
2019-03-13 07:13:22 +00:00
Jan Vesely
5b136ca125 Move unary_instrinsic.inc to private headers.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356021
2019-03-13 07:06:19 +00:00
Jan Vesely
2aa333f3d1 Move binary_intrinsic.h to private headers.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356020
2019-03-13 07:06:15 +00:00
Jan Vesely
1f4a8a9158 Move ternary_intrinsic.h to private headers.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356019
2019-03-13 07:06:13 +00:00
Jan Vesely
ee555aa992 trunc: Remove llvm intrinsic from the header.
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 356018
2019-03-13 07:06:10 +00:00