Add an optional flag to disable constant-folding for function calls.
This applies to both intrinsics and libcalls.
This is not necessary in most cases, so is disabled by default, but in
cases that require bit-exact precision between the result from
constant-folding and run-time execution, having this flag can be useful,
and may help with debugging. Cases where mismatches can occur include
GPU execution vs host-side folding, cross-compilation scenarios, or
compilation vs execution environments with different math library
versions.
This applies only to calls, rather than all FP arithmetic. Methods such
as fast-math-flags can be used to limit reassociation, fma-fusion etc,
and basic arithmetic operations are precisely defined in IEEE 754.
However, other math operations such as sqrt, sin, pow etc. represented
by either libcalls or intrinsics are less well defined, and may vary
more between different architectures/library implementations.
As this option is not intended for most common use-cases, this patch
takes the more conservative approach of disabling constant-folding even
for operations like fmax, copysign, fabs etc. in order to keep the
implementation simple, rather than sprinkling checks for this flag
throughout.
The use-cases for this option are similar to StrictFP, but it is only
limited to FP call folding, rather than all FP operations, as it is
about precise arithmetic results, rather than FP environment behaviours.
It also can be used to when linking .bc files compiled with different
StrictFP settings with llvm-link.
This is a follow up from #141845.
TargetTransformInfo::getOperandInfo needs to be updated to check for
undef values as otherwise a splat is considered a constant, and some
RISC-V cost model tests will start adding a cost to materialize the
constant.
As noted in
https://github.com/llvm/llvm-project/pull/141821#issuecomment-2917328924,
whilst we currently constant fold intrinsics of fixed-length vectors via
their scalar counterpart, we don't do the same for scalable vectors.
This handles the scalable vector case when the operands are splats.
One weird snag in ConstantVector::getSplat was that it produced a undef
if passed in poison, so this also contains a fix by checking for
PoisonValue before UndefValue.
Add constant-folding support for the maximumnum and minimumnum
intrinsics, and extend the tests to show the qnan vs snan behavior
differences between maxnum/maximum/maximumnum.
I tried to use these with a const reference in a separate patch, but the
pointers weren't marked as const. It turns out that these don't mutate
the instruction.
Per LangRef:
> The offsets are then added to the low bits of the base address up to
the index type width, with silently-wrapping two’s complement
arithmetic. If the pointer size is larger than the index size, this
means that the bits outside the index type width will not be affected.
The transform as implemented was doubly wrong, because it just truncated
the original base pointer to the index width, losing the top bits
entirely. Make sure we preserve the bits and use wrapping arithmetic
within the low bits.
These are unneeded even on AIX, PURE_WINDOWS, and ZOS (per #104706)
* HAVE_ERRNO_H: introduced by 1a93330ffa2ae2aa0b49461f05e6f0d51e8443f8 (2009) but unneeded.
The guarded ABI is unconditionally used by lldb.
* HAVE_FCNTL_H
* HAVE_FENV_H
* HAVE_SYS_STAT_H
Pull Request: https://github.com/llvm/llvm-project/pull/123087
Add constant-folding support for the NVVM intrinsics for converting
float/double to signed/unsigned int32/int64 types, including all
rounding-modes and ftz modifiers.
- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specifiction of target specific intrinsics
- add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api
- update TTI api to provide `isTargetIntrinsicWith...` functions and
consistently name them
- move `isTriviallyScalarizable` to VectorUtils
- update all uses of the api and provide the TTI parameter
Resolves#117030
This calls into the existing constant folding for `llvm.sin` and
`llvm.cos`, which currently does not fold for any non-finite values, so
most tests are negative tests at the moment.
Note: The constant folding does not consider the `afn` fast-math flag
and will produce the same result regardless of if the flag is set.
This is a reland of #114527 that updates the syntax of one of the tests
from: `<float 1.000000e+00, float 1.000000e+00>` to `splat (float
1.000000e+00)`.
This calls into the existing constant folding for `llvm.sin` and
`llvm.cos`, which currently does not fold for any non-finite values, so
most tests are negative tests at the moment.
Note: The constant folding does not consider the `afn` fast-math flag
and will produce the same result regardless of if the flag is set.
Calls to `@llvm.abs(undef, i1 true)` and `@llvm.abs(INT_MIN, i1 true)`
can be optimized to `poison` instead of `undef`.
[Alive2](https://alive2.llvm.org/ce/z/Hg-2ug)
C's Annex F specifies that log1p +/-0.0 returns the input value;
however, this behavior is optional and host C libraries may behave
differently. This change applies the Annex F behavior to constant
folding by LLVM.
GEP offsets have sext_or_trunc semantics. We were already doing
this for the outer-most GEP, but not for the inner ones.
I believe one of the sanitizer buildbot failures was due to this,
but I did not manage to reproduce the issue or come up with a
test case. Usually the problematic case will already be folded
away due to index type canonicalization.
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.
The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
ConstantFolding behaves differently depending on host's `HAS_IEE754_FLOAT128`.
LLVM should not change the behavior depending on host configurations.
This reverts commit 14c7e4a1844904f3db9b2dc93b722925a8c66b27.
(llvmorg-20-init-3262-g14c7e4a18449 and llvmorg-20-init-3498-g001e423ac626)
This is a reland of (#96287). This patch attempts to reduce the reverted
patch's clang compile time by removing #includes of float128.h and
inlining convertToQuad functions instead.
This is a reland of #96287. This change makes tests in logf128.ll ignore
the sign of NaNs for negative value tests and moves an #include <cmath>
to be blocked behind #ifndef _GLIBCXX_MATH_H.
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.
This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.
This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.
Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus
have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)
Works towards issue #98272.
Uses of __float128 in (#94944) should be float128. Although ConstantFoldFP128
is not reliant on HAS_LOGF128, it is only used by conditional code controlled
by HAS_LOGF128, and will cause unused errors on buildbots.
This is a follow up patch from #90611 which folds logl calls in the same
manner as log.f128 calls. logl suffers from the same problem as logf128
of having slow calls to fp128 log functions which can be constant
folded. However, logl is emitted with -fmath-errno and log.f128 is
emitted by -fno-math-errno by certain intrinsics.