When SPIRV-LLVM-Translator is built in-tree (i.e., placed in
llvm/projects folder), llvm-spirv target exists.
Drop legacy llvm-spirv_target dependency (was for non-runtime build) and
add llvm-spirv to runtimes dependencies.
When LLVM_TARGETS_TO_BUILD contains host target, runtime build sets
CMAKE_C_COMPILER to clang-cl on Windows.
Changes to fix build on Windows:
- libclc struggles to pass specific flags to clang-cl MSVC-like interface.
- compile flag handling will be consistent across all host systems.
- libclc build is cross-compilation for offloading targets.
Fix "unknown target triple" errors when LLVM_TARGETS_TO_BUILD is empty.
Adding -disable-llvm-passes reduces this to a very basic sanity check
of Clang frontend. This allows the test to pass even if SPIR-V backend
is not enabled, as the frontend can still generate IR for the target.
They were dropped in e20ae16ce672.
OUTPUT_FILENAME is helpful for customizing library name. PARENT_TARGET
could be helpful for customizing dependency control.
Summary:
This PR uses https://github.com/llvm/llvm-project/pull/185243 to
overhaul compilation of libclc. This brings libclc to the same kind of
compilation flow that the other GPU libraries use (compiler-rt, libc,
libc++, openmp, flang-rt).
The main brunt of this change is simply changing the SOURCES files to
CMake variables and altering the compilation. Now that these are
standard CMake libraries we do not need to bother redefining custom
library handling and targets.
This builds as a static library, which we then consume with `llvm-link`
which converts it into a single `.bc` bitcode file similarly to before.
The final result is then optimized all together.
Hopefully this doesn't break anything.
Summary:
The current handling of `libclc` is to use custom clang jobs to compile
source files. This is the way we used to compile GPU libraries, but we
have largely moved away from this. The eventual goal is to simply be
able to use `clang` like a normal project. One barrier to this is the
lack of language support for OpenCL in CMake.
This PR uses CMake's custom language support to define a minimal OpenCL
language, largely just a wrapper around `clang -x cl`. This does
nothing, just enables the support. Future PRs will need to untangle the
mess of dependencies.
The 'link+opt' steps that we now do should be able to simply be done as
a custom `llvm-link` and `opt` job on the library, which isn't ideal but
it is still better than the current state.
These were quite out of date and broken. These were originally
implemented for clover, which at one point was aiming for HSA v2 ABI
near compatibility. Since clover has been removed, that path is dead.
This was also broken for the modern HSA ABIs. Update to assume the
v5 ABI.
Toolchain can specify the component to selectively install libclc to a
deploy folder. E.g. our downstream SYCL toolchain deploy:
https://github.com/intel/llvm/blob/e7b423fd517d/sycl/CMakeLists.txt#L531
Also check ARG_PARENT_TARGET is defined and non-empty.
Co-authored-by: Jinsong Ji <jinsong.ji@intel.com>
Implement generic __clc_fma with __builtin_elementwise_fma for all
targets except for r600.
Add --spirv-ext=+SPV_KHR_fma flag to SPIR-V generation. SPIR-V target
supports @llvm.fma since SPV_KHR_fma was implemented in llvm-spirv
(https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/3467) and
SPIR-V backend (8f8dfbf8c9f0).
This PR assumes SPIR-V consumer with modern hardware supports fma.
Fix downstream build warning:
redefinition of typedef 'ushort8' is a C11 feature [-Wtypedef-redefinition]
clctypes.h re-defines typedef from opencl-c-base.h. Both files are
included in libclc/opencl folder.
This PR deletes clctypes.h and includes opencl-c-base.h for both CLC and
OpenCL libraries.
Previously opencl-c-base.h was only included in OpenCL library.
This PR relates to c5cb48c39701.
Pass `OUTPUT_FILENAME` to `add_libclc_builtin_set` to allow downstream
output naming (e.g. libspirv in
https://github.com/intel/llvm/tree/sycl/libclc).
Rename top-level target to `libclc-${ARG_TRIPLE}` to avoid collision
with `library-${ARG_ARCH_SUFFIX}` in our downstream when libclc TRIPLE
matches libspirv ARCH_SUFFIX.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Summary:
Right now all these libraries are installed in the `libclc/` directory
in the compiler's resource directory. We should instead follow the
per-target approach and install it according to the triple. The
sub-architectures will be put in a subdirectory as well.
I will do refactorings on this later to remove all the redundant targets
and pull this into common handling.
Also we should accept `--libclc-lib` without an argument to just find it
by default. I don't know what the plan here is since AMDGCN is the only
triple that uses this flag.
Revert --override flag added in 28d9255aa7c0 and avoid defining the same
symbol across multiple files of a target, simplifying the build and
easing the transition to CMake add_library for libclc.
amdgcn ldexp now uses __builtin_elementwise_ldexp.
No functional changes to clc_sqrt or clc_rsqrt.
Summary:
This utility is unnecessary with the current usage. Right now it sets
linkage to linkonce_odr and deduplicates metadata nodes. The former is
not required as `-mlink-builtin-bitcode` will internalize all functions
anyway. The deduplication is no longer necessary as `llvm-link` handles
that. Removing this reduces complexity and makes it easier to
cross-build this utility as it no longer depends on host LLVM utilities
to be built in the project itself.
Summary:
The standard `nvptx` target is not supported and has been completely
removed following the CUDA 12 release. It should be safe to drop support
for it in the default build. Additionally we add the `amd` vendor to the
amdgcn triple as this is the more canonical form and builds the same IR.
Remove the Python dependency for generating convert builtins, aligning
with how other builtins are defined.
In addition, our downstream target relies on this PR to override convert
implementations.
llvm-diff shows no changes to any of the bitcode files:
amdgcn--amdhsa.bc, barts-r600--.bc, cayman-r600--.bc, cedar-r600--.bc,
clspv64--.bc, clspv--.bc, cypress-r600--.bc, nvptx64--.bc,
nvptx64--nvidiacl.bc, nvptx--.bc, nvptx--nvidiacl.bc, tahiti-amdgcn--.bc
and tahiti-amdgcn-mesa-mesa3d.bc.
Implement atomic_*_explicit (e.g. atomic_store_explicit) with
memory_order plus optional memory_scope.
OpenCL memory_order maps 1:1 to Clang (e.g. OpenCL memory_order_relaxed
== Clang __ATOMIC_RELAXED), so we pass it unchanged to clc_atomic_*
function which forwards to Clang _scoped_atomic* builtins.
Other changes:
* Add __opencl_get_clang_memory_scope helper in opencl/utils.h (OpenCL
scope -> Clang scope).
* Correct atomic_compare_exchange return type to bool.
* Fix atomic_compare_exchange to return true when value stored in the
pointer equals expected value.
* Remove volatile from CLC functions so that volatile isn't present in
LLVM IR.
* Add '-fdeclare-opencl-builtins -finclude-default-header' flag to include
declaration of memory_scope. Some constants in libclc are already provided
by Clang’s OpenCL header; disable those in OpenCL library build and
enable them only for CLC library build.
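The compare-exchange fix above can be illustrated with a small
single-threaded model. This is only a sketch of C11-style semantics, which
the message appears to follow; it is not libclc code, and the name
compare_exchange and the one-element-list "pointers" are hypothetical.

```python
# Single-threaded model of compare-exchange semantics (an assumption based on
# C11-style behaviour; not the libclc implementation).

def compare_exchange(obj, expected, desired):
    """obj and expected are one-element lists standing in for pointers.

    Returns True and stores desired when *obj equals *expected.
    Otherwise copies *obj into *expected and returns False.
    """
    if obj[0] == expected[0]:
        obj[0] = desired
        return True
    expected[0] = obj[0]
    return False


cell = [5]
guess = [5]
swapped = compare_exchange(cell, guess, 9)   # values match, so 9 is stored
print(swapped, cell[0])
```

The boolean result reflects whether the stored value matched the expected
one, which is the behaviour the bullet describes.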
Commit df7473673214b placed libclc libraries into clang resource dir
<resource-dir>/lib/libclc at build stage.
This PR does it at install stage as well.
Note that in standalone (not in-tree) build, libclc is still installed
to old ${CMAKE_INSTALL_DATADIR}/clc dir.
The flag was added in 8ef48d07efa3 to suppress build warning and is no
longer needed.
It adds "no-builtins" attribute, which prevents libclc functions from
being inlined into callers that don't have the attribute.
The flag is meant to prevent folding standard library calls into
optimized implementations. For libclc device targets, however, such
target‑driven folding is desirable.
llvm-diff shows no change to amdgcn--amdhsa.bc and nvptx--nvidiacl.bc.
Co-authored-by: Mészáros Gergely <gergely.meszaros@intel.com>
* Replace call-site check with external declaration scan (grep declare)
to avoid false positives for not-inlined __clc_* functions.
* _clc_get_el* helpers are defined as inline in clc_shuffle2.cl, so they
have available_externally attribute. When they fail to inline they are
deleted by EliminateAvailableExternallyPass and become unresolved in
cedar-r600--.bc. Mark them static to resolve the issue.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Before this PR, weak linkage is applied to a few CLC generic functions
to allow target specific implementation to override generic one.
However, adding weak linkage has a side effect of preventing
inter-procedural optimization, such as PostOrderFunctionAttrsPass,
because weak function doesn't have exact definition (as determined by
hasExactDefinition in the pass).
This PR resolves the issue by adding --override flag for every
non-generic bitcode file in llvm-link run. This approach eliminates the
need for weak linkage while still allowing target-specific
implementation to override generic one.
llvm-diff shows improved attribute deduction for some functions in
amdgcn--amdhsa.bc, e.g.
%23 = tail call half @llvm.sqrt.f16(half %22)
=>
%23 = tail call noundef half @llvm.sqrt.f16(half %22)
libclc sequential build issue addressed in commit 0c21d6b4c8ad is
specific to cmake MSVC generator. Therefore, this PR avoids creating a
large number of targets when a non-MSVC generator is used, such as the
Ninja generator, which is used in pre-merge CI on Windows in
llvm-project repo. We plan to migrate from MSVC generator to Ninja
generator in our downstream CI to fix flaky cmake bug `Cannot restore
timestamp`, which might be related to the large number of targets.
Fix a regression of df7473673214.
The CMake MSVC generator is multi-config. Build type is not known
at configure time and CMAKE_CFG_INTDIR is evaluated to $(Configuration)
at configure time. libclc install fails since $(Configuration) in
bitcode file path is unresolved in libclc/cmake_install.cmake at install time.
We need a solution that resolves libclc bitcode file path at install
time. This PR fixes the issue using CMAKE_INSTALL_CONFIG_NAME which can
be evaluated at install time. This is the same solution as in
https://reviews.llvm.org/D76827
The target's output bitcode `libclc_builtins_lib` is located in a
sub-directory in clang resource directory since df7473673214. Setting
TARGET_FILE property can allow targets in non-libclc project to obtain
the path to `libclc_builtins_lib`.
This removes the dependency on an external tool to build the SPIR-V
files. It may be of interest to projects such as Mesa.
Note that the option is off by default as using the SPIR-V backend, at
least on my machine, uses a *lot* of memory and the process is often
killed in a parallelized build. It does complete, however.
Fixes #135327.
With libclc being a 'runtime', the top-level build assumes that there is
a corresponding 'libclc' target. We previously weren't providing this,
leading to a build failure if the user tried to build it.
This commit remedies this by adding support for building the 'libclc'
target. It does so by adding dependencies from the OpenCL builtins to
this target. It uses a configurable in-between target -
libclc-opencl-builtins - to ease the possibility of adding non-OpenCL
builtin libraries in the future.
Fix the symlink creation logic to use relative paths instead of
absolute, in order to ensure that the installed symlinks actually refer
to the installed .bc files rather than the ones from the build
directory. This was broken in #146833. The change is a bit roundabout
but it attempts to preserve the spirit of #146833, that is the ability
to use multiple output directories (provided they all reside in
`${LIBCLC_OUTPUT_LIBRARY_DIR}` and preserve the same structure in the
installed tree).
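Why a relative link survives installation can be sketched as follows. This
is a hedged illustration only: the file names example-target.bc and
example-alias.bc and the directory layout are hypothetical, and the real
logic lives in CMake rather than Python.

```python
import os
import shutil
import tempfile

def relative_link_target(bc_path, link_path):
    """Return bc_path expressed relative to the directory holding link_path."""
    return os.path.relpath(bc_path, start=os.path.dirname(link_path))


root = tempfile.mkdtemp()
build = os.path.join(root, "build", "clc")
os.makedirs(build)

bc = os.path.join(build, "example-target.bc")    # hypothetical file name
open(bc, "w").close()
link = os.path.join(build, "example-alias.bc")   # hypothetical alias name
os.symlink(relative_link_target(bc, link), link)

# Copy the tree as an "install" step, preserving symlinks as symlinks.
install = os.path.join(root, "install", "clc")
shutil.copytree(build, install, symlinks=True)

# The copied link resolves inside the install tree, not the build tree.
resolved = os.path.realpath(os.path.join(install, "example-alias.bc"))
print(resolved)
```

An absolute link target would have kept pointing at the build directory
after the copy, which is the failure the message describes.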
Signed-off-by: Michał Górny <mgorny@gentoo.org>
The prepare target was depending on the output of a custom command, but
did not use the full path to that file. This tripped up CMake if the file was
removed as it didn't know how to rebuild that file.
These changes were split off from #146503.
This commit makes the output directories of libclc artefacts explicit.
It creates a variable for the final output directory -
LIBCLC_OUTPUT_LIBRARY_DIR - which has not changed. This allows future
changes to alter the output directory more simply, such as by pointing
it to somewhere inside clang's resource directory.
This commit also changes the output directory of each target's
intermediate builtins.*.bc files. They are now placed into each
respective libclc target's object directory, rather than the top-level
libclc binary directory. This should help keep the binary directory a
bit tidier.
This target provides a unified build target for all devices under the
single triple. This way a user doesn't have to know device names to
build a specific target's bitcode libraries.
Device names may be considered as internal implementation details as
they are not exposed to users of CMake; users only specify triples to
build. Now, instead of `prepare-{barts,cayman,cedar,cypress}-r600--.bc`,
for example, a user may now build simply `prepare-r600--` and have all
four of those libraries built.
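The grouping can be pictured with a short sketch. It only models the naming
relationship described above; the function per_triple_targets is
hypothetical and the actual targets are created in CMake.

```python
# Model of one unified per-triple target depending on its per-device targets.
# Only the target names come from the message; the function is illustrative.

def per_triple_targets(device_libs):
    """Map 'prepare-<triple>' to the per-device prepare targets it depends on."""
    grouped = {}
    for device, triple in device_libs:
        grouped.setdefault(f"prepare-{triple}", []).append(
            f"prepare-{device}-{triple}.bc"
        )
    return grouped


libs = [
    ("barts", "r600--"),
    ("cayman", "r600--"),
    ("cedar", "r600--"),
    ("cypress", "r600--"),
]
targets = per_triple_targets(libs)
for name, deps in targets.items():
    print(name, "->", ", ".join(deps))
```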
This commit also refactors the CMake somewhat. We were previously
diverging between the SPIR-V and other targets, and duplicating a bit of
logic like the creation of the 'prepare' targets, the targets'
properties, and the installation directory. It's cleaner and hopefully
more robust to share this code between all targets. This commit also
takes this opportunity to improve some comments around this code.
This commit moves all OpenCL builtins under a top-level 'opencl'
directory, akin to how the CLC builtins are organized. This new
structure aims to better convey the separation of the two layers and
that 'CLC' is not a subset of OpenCL or a libclc target.
In doing so this commit moves the location of the 'lib' directory to
match CLC: libclc/generic/lib/ becomes libclc/opencl/lib/generic/. This
allows us to remove some special casing in CMake and ensure a common
directory structure.
It also tries to better communicate that the OpenCL headers are
libclc-specific OpenCL headers and should not be confused with or used
as standard OpenCL headers. It does so by ensuring includes are of the
form <clc/opencl/*>. It might be that we don't specifically need the
libclc OpenCL headers and we simply could use clang's built-in
declarations, but we can revisit that later.
Aside from the code move, there is some code formatting and updating a
couple of OpenCL builtin includes to use the readily available gentype
helpers. This allows us to remove some '.inc' files.
This enables file_specific_compile_options to take precedence over
ARG_COMPILE_FLAGS. For example, if we add -fno-slp-vectorize to
COMPILE_OPTIONS of a file, the behavior changes as follows:
* Before this PR: -fno-slp-vectorize is overwritten by -O3, resulting in
SLP vectorizer remaining enabled.
* After this PR: -fno-slp-vectorize overwrites -O3, effectively
disabling SLP vectorizer.
llvm-diff shows this PR has no changes to amdgcn--amdhsa.bc.
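The ordering effect can be shown with a toy model. It assumes a simple
last-one-wins rule, as the message describes, and does not reproduce
clang's real option parsing; build_flags and slp_vectorizer_enabled are
hypothetical names.

```python
# Toy last-one-wins model of the flag ordering described in the message.
# This is an illustration, not clang's option handling or libclc's CMake.

def build_flags(arg_compile_flags, file_specific_options):
    """Place per-file options after the common flags so they are seen last."""
    return list(arg_compile_flags) + list(file_specific_options)


def slp_vectorizer_enabled(flags):
    """-O3 turns SLP on, -fno-slp-vectorize turns it off; the last one wins."""
    enabled = False
    for flag in flags:
        if flag == "-O3":
            enabled = True
        elif flag == "-fno-slp-vectorize":
            enabled = False
    return enabled


common = ["-O3"]
per_file = ["-fno-slp-vectorize"]

before = per_file + common              # old order: -O3 comes last
after = build_flags(common, per_file)   # new order: per-file option comes last
print(slp_vectorizer_enabled(before), slp_vectorizer_enabled(after))
```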
Motivation is that in our downstream the same category of target
built-ins, e.g. math, are organized in several different folders. For
example, in target SOURCES we have math-common/cos.cl, while in generic
SOURCES it is math/cos.cl. Based on current check rule that compares
both folder name and base filename, target math-common/cos.cl won't
override math/cos.cl when collecting source files from SOURCES files in
cmake function libclc_configure_lib_source.
With this PR, we allow folder name to be different in the process.
A notable change of this PR is that two entries in SOURCES with the same
base filename must not implement the same built-in.
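A minimal sketch of the relaxed matching rule, assuming the override is
keyed on the base filename alone. The function collect_sources is
hypothetical; the real logic is in the CMake function
libclc_configure_lib_source.

```python
import posixpath

def collect_sources(generic_sources, target_sources):
    """Target entries replace generic entries that share a base filename."""
    chosen = {}
    for path in generic_sources:
        chosen[posixpath.basename(path)] = path
    for path in target_sources:
        # Folder names may differ; only the base filename is compared.
        chosen[posixpath.basename(path)] = path
    return sorted(chosen.values())


generic = ["math/cos.cl", "math/sin.cl"]
target = ["math-common/cos.cl"]
result = collect_sources(generic, target)
print(result)
```

Under the previous rule, which also compared folder names,
math-common/cos.cl would not have replaced math/cos.cl.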
In libclc, we observe that compiling OpenCL source files to bitcode is
executed sequentially on Windows, which increases debug build time by
about an hour.
add_custom_command may introduce additional implicit dependencies, see
https://gitlab.kitware.com/cmake/cmake/-/issues/17097
This PR adds a target for each command, enabling parallel builds of
OpenCL source files.
CMake 3.27 has fixed the above issue with DEPENDS_EXPLICIT_ONLY. When LLVM
upgrades its CMake version to 3.27, we can switch to DEPENDS_EXPLICIT_ONLY.
The libclc build system isn't well set up to pass arbitrary options to
arbitrary source files in a non-intrusive way. There isn't currently any
other motivating example to warrant rewriting the build system just to
satisfy this requirement. So this commit uses a filename-based approach
to inserting this option into the list of compile flags.
Currently link_bc command depends on the bitcode file that is associated
with custom target builtins.link.clc-arch_suffix.
On Windows we randomly see the following error:
`
Generating builtins.link.clc-${ARCH}--.bc
Generating builtins.link.libspirv-${ARCH}.bc
error : The requested operation cannot be performed on a file with a
user-mapped section open.
`
I suspect that builtins.link.clc-${ARCH}--.bc file is being generated
while it is being used in link_bc.
This PR adds target-level dependency to ensure
builtins.link.clc-${ARCH}--.bc is generated first.
When -internalize flag is passed to llvm-link, we only need to link in
needed symbols. This PR reduces size of linked bitcode, e.g. by removing
following symbols:
_Z12__clc_sw_fmaDv16_fS_S_
_Z12__clc_sw_fmaDv2_fS_S_
_Z12__clc_sw_fmaDv3_fS_S_
_Z12__clc_sw_fmaDv4_fS_S_
_Z12__clc_sw_fmaDv8_fS_S_
_Z12__clc_sw_fmafff
During a recent change, the build system accidentally dropped the
(theoretical) support for the CLC builtins library to build
target-specific builtins from the 'amdgpu' directory, due to a change in
variable names. This functionality wasn't being used but was spotted
during another code review.
This commit takes the opportunity to clean up and better document the
code that manages the list of directories to search for builtin
implementations.
While fixing this, some references to now-removed SOURCES files were
discovered which have been cleaned up.