llvm-project

Author	SHA1	Message	Date
Matt Arsenault	090c40545f	libclc: Replace flush_if_daz implementation (#187569 ) The fallback non-canonicalize path didn't work. Use a more straightforward implementation. Eventually this should use the pattern from #172998	2026-03-20 08:09:16 +01:00
Wenju He	366da1252b	[libclc] Restore previous generic fmod implementation (#187470 ) Restore from before 3c7f70bb9cee for targets that do not yet implement frem. Keep the __builtin_elementwise_fmod-based implementation for AMDGPU.	2026-03-20 07:42:36 +08:00
Matt Arsenault	1f8da27714	libclc: Really implement half trig functions (#187457 ) Previously these just cast to float.	2026-03-19 09:06:28 +00:00
Matt Arsenault	1ba5b6e875	libclc: Stop implementing sincos as separate sin and cos (#187456 )	2026-03-19 09:52:30 +01:00
Matt Arsenault	6e8ca5edde	libclc: Fix nextafter with -cl-denorms-are-zero (#187358 ) Follow the suggested behavior of returning +/-FLT_MIN for logical zeros.	2026-03-19 09:43:58 +01:00
Matt Arsenault	85e9ac5898	libclc: Add canonicalize utility functions (#187357 ) This is mostly to work around spirv's canonicalize still being broken.	2026-03-19 09:43:35 +01:00
Matt Arsenault	9b7c437033	libclc: Update f64 trig functions (#187455 ) Most of this was originally ported from rocm device libs in 2e6ff0c66e180998425776a27579559dc099732f. Merge in more recent changes.	2026-03-19 08:34:59 +00:00
Matt Arsenault	0960f0b8fe	libclc: Really implement denormal config checks (#187356 ) These should be implementable by checking the behavior of the canonicalize intrinsic. Hack around spirv still failing on canonicalize by overriding and assuming DAZ for float.	2026-03-19 08:34:43 +00:00
Matt Arsenault	a54c149061	libclc: Invert subnormal checks (#187355 ) The base case is correct denormal handling, not flushing. This also matches the spec controls, which start at IEEE, with flushing enabled by -cl-denorms-are-zero. Also fix wrong defaults for half and double. Denormal support is not optional for these.	2026-03-19 08:25:16 +00:00
Matt Arsenault	bdfd9725af	libclc: Move subnormal config file to clc (#187354 )	2026-03-19 08:26:57 +01:00
Matt Arsenault	e3198dbe59	libclc: Move FLT_MIN gentype macros (#187272 )	2026-03-19 08:16:52 +01:00
Matt Arsenault	9e6ce65962	libclc: Fix vector float tan (#187387 )	2026-03-19 08:16:10 +01:00
Matt Arsenault	b15fa374ff	libclc: Improve float trig function handling (#187264 ) Most of this was originally ported from rocm device libs in c0ab2f81e3ab5c7a4c2e0b812a873c3a7f9dca8b, so merge in more recent changes.	2026-03-18 13:10:58 +00:00
Matt Arsenault	9b8532dd2a	libclc: Clean up sincos macro usage (#187260 ) Handle this more like fract, and implement other address spaces on top of the private overload with a temporary variable.	2026-03-18 13:56:58 +01:00
Matt Arsenault	2ecd001215	libclc: Use select function instead of ?: for some fp selects (#187253 ) It seems that ?: is not quite equivalent to select for floating-point vectors. With ?:, the resulting IR involves integer bitcasts and integer vector typed select. Use select so this is an fp-select. This enables finite math only contexts to optimize out the select. This feels like it's a clang bug though.	2026-03-18 13:44:35 +01:00
Wenju He	350385e792	[libclc][NFC] Change include style from <...> to "..." (#186537 ) project-specific headers should use "". Keep #include <amdhsa_abi.h> llvm-diff shows no change to libclc.bc for spir--, spir64--, nvptx64--, nvptx64--nvidiacl, nvptx64-nvidia-cuda and amdgcn-amd-amdhsa-llvm when LIBCLC_TARGETS_TO_BUILD is "all". Verified that reversing spir64--/libclc.spv and spir--/libclc.spv to LLVM bitcode shows no diff. Also fix `__CLC_INTEGER_CLC_BITFIELD_EXTRACT_SIGNED_H__` guard per copilot review. --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-03-18 10:29:54 +08:00
Wenju He	b14eea0b23	[libclc] Fix check-libclc dependency on llvm-dis (#186978 ) Add llvm-dis to libclc runtime dependencies.	2026-03-17 18:09:36 +08:00
Matt Arsenault	527496bb10	libclc: Improve large float trig reduction (#186984 )	2026-03-17 10:36:19 +01:00
Matt Arsenault	107b113b67	libclc: Use small trig reduction for nan (#186983 ) Nan should work on either path, but the small reduction path is smaller. There are also possible codegen benefits to knowing the large reduction will not need to handle nans.	2026-03-17 10:35:01 +01:00
Matt Arsenault	a0d6e97142	libclc: Use frexp and ldexp in trig reduction instead of bit hacking (#186982 )	2026-03-17 10:30:40 +01:00
Matt Arsenault	77ba0d9e24	libclc: Update pow functions (#186890 ) The 4 flavors of pow were originally ported from rocm device libs between c45ec604f593fcb03d770f4398142d2446017f68, cc5c65b2c25e0a82fbad95f0ce3bb5262e29eeee, and fe8e00bc3c65115b2e3d2a43cf3d0d756a934a52. Update to a newer version. Additionally expose fast variants for use by the libcall optimizer (e.g., __pow_fast) for float types.	2026-03-17 10:25:46 +01:00
Matt Arsenault	fae024aca9	libclc: Move edge case handling of trig functions (#186429 ) The explicit handling of nan is unnecessary. Clamp infinities to nan at the input. This allows optimizations of the following implementation code to take advantage of the knowledge that it does not need to handle infinities.	2026-03-17 10:16:01 +01:00
Matt Arsenault	19460ff859	libclc: Use fshr builtin in sincos helpers (#186427 )	2026-03-17 09:57:08 +01:00
Matt Arsenault	096371b7e3	libclc: Use struct for ep pair (#186973 ) This will enable use with vector types	2026-03-17 09:30:37 +01:00
Wenju He	4abb927bac	[libclc][CMake] Use clang/llvm-ar on Windows (#186726 ) When LLVM_TARGETS_TO_BUILD contains host target, runtime build sets CMAKE_C_COMPILER to clang-cl on Windows. Changes to fix build on Windows: - libclc struggles to pass specific flags to clang-cl MSVC-like interface. - compile flag handling will be consistent across all host systems. - libclc build is cross-compilation for offloading targets.	2026-03-17 09:45:52 +08:00
Wenju He	1c04e7fada	[libclc] fix compiler check with --target=spirv64 and -disable-llvm-passes (#185376 ) Fix "unknown target triple" errors when LLVM_TARGETS_TO_BUILD is empty. Adding -disable-llvm-passes reduces this to a very basic sanity check of Clang frontend. This allows the test to pass even if SPIR-V backend is not enabled, as the frontend can still generate IR for the target.	2026-03-17 07:59:14 +08:00
Joseph Huber	50f471fc62	[libclc] Add generic clc_mem_fence instruction (#185889 ) Summary: This can be made generic, which works as expected on NVPTX and SPIR-V. We do not replace this for AMDGPU because the dedicated built-in has an extra argument that controls whether or not local memory or global memory will be invalidated. It would be correct to use this generic operation there, but we'd lose that minor optimization so we likely should not regress.	2026-03-16 08:15:49 -05:00
Matt Arsenault	524b0b8b84	libclc: Remove attempt at subnormal flush from trig functions (#186424 )	2026-03-14 08:29:09 +01:00
Matt Arsenault	df4df088d8	libclc: Disable contract in trig reductions (#186432 )	2026-03-14 08:28:40 +01:00
Wenju He	e945f7afbe	[libclc][CMake] Rename opencl to clc in add_libclc_library, update comment (#186544 ) Align with cmake function name.	2026-03-14 10:04:19 +08:00
Wenju He	8175bd92ea	[libclc][CMake] Check SOURCES and LIBRARIES arguments are not empty (#186542 )	2026-03-14 08:52:32 +08:00
Wenju He	5d3aae962d	[libclc][NFC] Rename three .inc files to avoid name conflicts (#186384 ) Follow-up of 9b96ebc. There are binary_def.inc and unary_def.inc in header directory. - clc_ep.inc -> clc_ep_decl.inc - relational/binary_def.inc -> relational/relational_binary_def.inc - relational/unary_def.inc -> relational/relational_unary_def.inc	2026-03-14 07:44:11 +08:00
Wenju He	9b96ebcba5	[libclc] Rename declaration .inc files to *_decl.inc (#186340 ) These .inc files in the header directory have the same name as .inc files in implementation directory. Rename them to avoid name conflict and avoid wrong file being used in implementation. This fixes bitcode change when changing `#include <>` to `#include ""`.	2026-03-13 19:33:17 +08:00
Matt Arsenault	ecc4d3edc9	libclc: Fix mismatch in declared and defined function name (#186227 )	2026-03-12 20:21:20 +00:00
Wenju He	d352aac32c	[libclc][CMake] Add check-libclc umbrella test target (#186053 ) This allows running the full test suite using `ninja check-libclc`.	2026-03-12 19:55:18 +08:00
Matt Arsenault	a372eca60d	libclc: Improve minmag and maxmag (#186092 ) Gives slightly better codegen.	2026-03-12 12:24:07 +01:00
Matt Arsenault	85e542fff3	libclc: Improve fdim handling (#186085 ) The maxnum is somewhat overconstraining. This gives slightly better codegen and avoids the noise from the select and convert, and saves the cost of materializing the nan literal.	2026-03-12 11:52:51 +01:00
Matt Arsenault	ea86511528	libclc: Replace nextafter implementation (#186082 ) Use a more straightforward version which allows optimizations to delete the edge case checks, and also codegens better. Implement in terms of new nextup and nextdown helper functions, which are IEEE functions, and usable in other functions.	2026-03-12 11:52:34 +01:00
Matt Arsenault	3c7f70bb9c	libclc: Replace fmod implementation with elementwise builtin (#186083 ) This corresponds to frem, which for whatever reason is a first class IR instruction. The backend has a heroic freestanding implementation that should be nearly identical to what was here.	2026-03-12 11:47:39 +01:00
Matt Arsenault	d2c9ebf369	libclc: Update f64 log implementations (#186048 ) The log implementation was originally ported from rocm device libs way back in 44b6117dfde30d6cc292fabca8ecb0cef4657f7a. Update this to a version derived from the latest. Leaves the float and half cases alone.	2026-03-12 09:08:09 +00:00
Matt Arsenault	285f2debff	libclc: Add ep utility (#186047 ) Add utility for compensated arithmetic, which should be used by a number of the large functions.	2026-03-12 10:05:25 +01:00
Matt Arsenault	acc2285d18	libclc: Update logb implementation (#185881 ) Similar to the previous logb change, use a common bithacking free implementation.	2026-03-12 06:57:38 +00:00
Matt Arsenault	5365c57e14	libclc: Update ilogb implementation (#185877 ) This was originally ported from rocm device libs in d6d0454231ac489c50465d608ddf3f5d900e1535. Update for more recent changes that were made there. This avoids bithacking and improves value tracking. This also allows using a common code path for all types.	2026-03-12 06:50:02 +00:00
Matt Arsenault	3da14eeacb	libclc: Update hypot implementation (#185873 ) This avoids bithacking on the values and improves value tracking.	2026-03-12 06:43:18 +00:00
Wenju He	aa16859162	[libclc][amdgpu] Add clc qualifier functions with non-const pointer (#186015 ) Fix unresolved external functions: _Z15__clc_get_fencePv, _Z15__clc_to_globalPv, _Z14__clc_to_localPv, _Z16__clc_to_privatePv	2026-03-12 14:08:34 +08:00
Matt Arsenault	7265d89ed6	libclc: Add frexp_exp utility function (#185872 )	2026-03-12 06:54:55 +01:00
Michał Górny	eeef27e8a7	Revert "[libclc][NFC] Change include style from <...> to "..."" (#185888 ) Reverts llvm/llvm-project#185788. This change is causing test regressions in libclc, so it's definitely not "NFC", and with its size it's hard to figure out what exactly went wrong.	2026-03-12 06:08:01 +08:00
Matt Arsenault	dc93e6e3a9	libclc: Add gentype infinity macro (#185864 )	2026-03-11 13:09:28 +01:00
Wenju He	c0c8b992ef	[libclc][NFC] Change include style from <...> to "..." (#185788 ) project-specific headers should use "". Keep #include <amdhsa_abi.h>	2026-03-11 19:04:24 +08:00
Matt Arsenault	9aba26bf47	libclc: Use frexp builtins to implement frexp for amdgpu (#185637 ) This should really be the default implementation.	2026-03-11 11:17:39 +01:00
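
Commit 285f2debff above adds an "ep utility" for compensated arithmetic, and 096371b7e3 moves the pair into a struct. As a rough illustration of the general technique only (this is not the libclc code, and the names `ep_pair` and `two_sum` are hypothetical), a classic error-free addition in plain C might look like this:

```c
/* Hypothetical sketch of compensated addition.
   Not taken from libclc; all names here are illustrative only. */
typedef struct {
    double hi; /* the rounded sum */
    double lo; /* the rounding error of that sum */
} ep_pair;

/* Knuth's TwoSum: hi + lo equals a + b exactly, barring overflow.
   Assumes round-to-nearest and no value-changing floating-point
   optimizations (for example, no -ffast-math). */
static ep_pair two_sum(double a, double b) {
    ep_pair r;
    r.hi = a + b;
    double b_virtual = r.hi - a;          /* the part of b that reached hi */
    double a_virtual = r.hi - b_virtual;  /* the part of a that reached hi */
    r.lo = (a - a_virtual) + (b - b_virtual);
    return r;
}
```

For instance, `two_sum(1.0, 1e-16)` gives `hi == 1.0`, while the `1e-16` that a plain addition would discard is retained in `lo`.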


1059 Commits