When compiling on aarch64 some `LDBL_MANT_DIG == 113` entries
end up trying to use `complex<long double>` for which there are
no certain specializations in `libcudacxx`. This change-set
includes a clean-up for `LDBL_MANT_DIG == 113` usage, which is replaced
with `HAS_LDBL128` that is set in `float128.h`.
`std::complex` operators do not work for the CUDA device compilation
of F18 runtime. This change makes use of `cuda::std::complex` from
`libcudacxx`.
`cuda::std::complex` does not have specializations for `long double`,
so the change is accompanied with a clean-up for `long double` usage.
Additional change on top of #109078 is to use `cuda::std::complex`
only for the device compilation, otherwise the host compilation
fails because `libcudacxx` may not support `long double` specialization
at all (depending on the compiler).
`std::complex` operators do not work for the CUDA device compilation
of F18 runtime. This change makes use of `cuda::std::complex` from
`libcudacxx`.
`cuda::std::complex` does not have specializations for `long double`,
so the change is accompanied with a clean-up for `long double` usage.
Ensure that the runtime implementations of floating-point reductions use
intermediate results of the same precision as the operands, so that
results match those from constant folding. (SUM reduction uses Kahan
summation in both cases.)
The reductions implementations rely on trivial operations that
are supported by the build compiler runtime, so they can be enabled
whenever the build compiler provides 128-bit float support.
std::conj used by DOT_PRODUCT is a template implementation
in most environments, so it should not introduce a dependency
on any 128-bit float support library. I am not goind to
test it in all the build environments before merging.
If it fails for someone, I will deal with it.
This update allows constant folding for many 128 bit floating point intrinsics
through the library quadmath, which is only available on some platforms.
Differential Revision: https://reviews.llvm.org/D156435
On targets with __float128 available and distinct from long double,
use it to support more kind=16 entry points. This affects mostly
x86-64 targets. This means that more runtime entry points are
defined for lowering to call.
Delete Common/long-double.h and its LONG_DOUBLE macro in favor of
testing the standard macro LDBL_MANT_DIG.
Differential Revision: https://reviews.llvm.org/D127025
Move the closure of the subset of flang/runtime/*.h header files that
are referenced by source files outside flang/runtime (apart from unit tests)
into a new directory (flang/include/flang/Runtime) so that relative
include paths into ../runtime need not be used.
flang/runtime/pgmath.h.inc is moved to flang/include/flang/Evaluate;
it's not used by the runtime.
Differential Revision: https://reviews.llvm.org/D109107
The single source file reduction.cpp is a little large in
terms of both source lines and generated text bytes, so
split SUM, PRODUCT, FINDLOC, and MAXLOC/MAXVAL/MINLOC/MINVAL
off into their own C++ source files that share a set of
implementation function templates now in a common header.
Differential Revision: https://reviews.llvm.org/D101111