This optimization already exists, but for the libcall versions of these
functions and not for their intrinsic form.
Solves https://github.com/llvm/llvm-project/issues/139044.
There are probably more opportunities for other intrinsics, because the
switch-case in `LibCallSimplifier::optimizeCall` covers only `pow`,
`exp2`, `log`, `log2`, `log10`, `sqrt`, `memset`, `memcpy` and
`memmove`.
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.
Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
make it easier to use old IR files and somewhat reduce the test churn in
this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
attribute. The representation in the LLVM IR dialect should be updated
separately.
When using non-integral pointer types, such as on CHERI targets, size_t
is equivalent
to the index size, which is allowed to be smaller than the size of the
pointer.
Writing a test for this transitively exposed a number of places in
BuildLibCalls where
we were failing to propagate address spaces properly, which are
additionally fixed.
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.
The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr assosiated it will cause an
error.
Fixes#112633
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
Similar to 112aac4e8961b9626bb84f36deeaa5a674f03f5a, this converts log
libcalls to llvm.log.f64 intrinsics if we know they do not set errno, as
the input is not zero and not negative. As log will produce errno if the
input is 0 (returning -inf) or if the input is negative (returning nan),
we also perform the conversion when we have noinf and nonan.
This attempts to fix the ASan buildbot, which is detecting that CI is used
after it is removed in substituteInParent. The idea was to make sure it was
removed even if it had side-effects writing errno, but that appears to happen
if we return FRem directly as usual.
fmod will be folded to frem in clang under -fno-math-errno and can be constant
folded in llvm if the operands are known. It can be relatively common to have
fp code that handles special values before doing some calculation:
```
if (isnan(f))
return handlenan;
if (isinf(f))
return handleinf;
..
fmod(f, 2.0)
```
This patch enables the folding of fmod to frem in instcombine if the first
parameter is not inf and the second is not zero. Other combinations do not set
errno.
The same transform is performed for fmod with the nnan flag, which implies the
input is known to not be inf/zero.
The `ch` argument of memcmp should be truncated to `unsigned char`
before using it in comparisons. This didn't happen on all code paths.
The following program miscompiled at -O1 and higher:
```C++
#include <cstring>
#include <iostream>
char ch = '\x81';
int main() {
bool found = std::strchr("\x80\x81\x82", ch) != nullptr;
std::cout << std::boolalpha << found << '\n';
}
```
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
Do not require a libm ldexp libcall to emit the ldexp intrinsic when
transforming pow(2, x) -> ldexp(1, x)
This enables the half intrinsic case to fold.
When folding exp2(itofp(x)) to ldexp(1, x), don't require an ldexp
libcall to emit the intrinsic.
The intrinsic needs to be handled regardless of whether the system has a
libcall, and we have an inline implementation of ldexp already. This
fixes the instance in the exp2->ldexp fold. Another instance exists for
the pow(2) -> ldexp case
The LTO test change isn't ideal, since it's just moving the problem to
another instance where we're relying on implied libm behavior for an
intrinsic transform. Use exp10 since that's a harder case to solve in
the libcall house of cards we have.
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
optimizeTan has been renamed to optimizeTrigInversionPairs as a result.
Sadly, this is not mathematically true that all inverse pairs fold to x.
For example, asin(sin(x)) does not fold to x if x is over 2pi.
Under fast-math flags it's possible to convert `sqrt(exp(X)) `into
`exp(X * 0.5)`. I suppose that this transformation is always profitable.
This is similar to the optimization existing in GCC.
This patch refactors the interface of the `computeKnownFPClass` family
to pass `SimplifyQuery` directly.
The motivation of this patch is to compute known fpclass with
`DomConditionCache`, which was introduced by
https://github.com/llvm/llvm-project/pull/73662. With
`DomConditionCache`, we can do more optimization with context-sensitive
information.
Example (extracted from
[fmt/format.h](e17bc67547/include/fmt/format.h (L3555-L3566))):
```
define float @test(float %x, i1 %cond) {
%i32 = bitcast float %x to i32
%cmp = icmp slt i32 %i32, 0
br i1 %cmp, label %if.then1, label %if.else
if.then1:
%fneg = fneg float %x
br label %if.end
if.else:
br i1 %cond, label %if.then2, label %if.end
if.then2:
br label %if.end
if.end:
%value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
%ret = call float @llvm.fabs.f32(float %value)
ret float %ret
}
```
We can prove the signbit of `%value` is always zero. Then the fabs can
be eliminated.
Fixes https://github.com/llvm/llvm-project/issues/76549
The cause of the optimization miss was -
1. `optimizePow` converting almost integer FP exponents to integer, and
turning `pow` to `powi`.
2. `optimizeLog` not accepting `Intrinsic::powi` as a target.
This patch converts constantInt back to constantFP where applicable and
adds a test.