LLVM converts a sqrt libcall to an intrinsic call if the argument is
known to be in range (greater than or equal to 0.0). In this case the
compiler is not able to deduce the non-negativity on its own. Extend
ValueTracking to understand such loops.
Fixes llvm/llvm-project#174813
Switch to using computeKnownBits instead of computeConstantRange
in computeKnownFPClass's ldexp handling. This is preparation to
move the handling into KnownFPClass. Since KnownFPClass is in Support,
it can make use of KnownBits as the input argument. ConstantRange is in
IR, so it cannot be used from Support.
This functionally reverts fd5cfcc41311c6287e9dc408b8aae499501660e1 and
35ce17b6f6ca5dd321af8e6763554b10824e4ac4.
The original changes were correct and necessary, but they are causing
performance regressions since isGuaranteedNotToBeUndef is apparently not
smart enough to see through recurrences. Revert this for the release branch.
Also the test coverage was inadequate for the fma case, so add a new
case which changes with and without the check.
The handling for fma was very basic and only handled the
repeated input case. Re-use the fmul and fadd handling for more
accurate sign bit and nan handling.
This adds the analogous metadata to the nofpclass attribute
to assert values are not a certain set of floating-point classes.
This allows the same information to be expressed if a function
argument is passed indirectly. This matches the bitmask encoding
of nofpclass.
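As a rough illustration of the bitmask encoding referred to above, the
sketch below reproduces the class bits documented for nofpclass in the
LangRef. The enum, constant, and function names are local to this example
and are not LLVM's own declarations.

```cpp
#include <cassert>
#include <cstdint>

// Local copy of the nofpclass class bits (names are illustrative only).
enum FPClassBit : uint32_t {
  SNan = 1,
  QNan = 2,
  NegInf = 4,
  NegNormal = 8,
  NegSubnormal = 16,
  NegZero = 32,
  PosZero = 64,
  PosSubnormal = 128,
  PosNormal = 256,
  PosInf = 512,
};

constexpr uint32_t AllFPClasses = 1023;       // all ten class bits set
constexpr uint32_t NanMask = SNan | QNan;     // the "nan" group
constexpr uint32_t InfMask = NegInf | PosInf; // the "inf" group

// Given a mask of excluded classes, return the classes that remain possible.
uint32_t possibleClasses(uint32_t ExcludedMask) {
  assert((ExcludedMask & ~AllFPClasses) == 0 && "unknown class bit");
  return ~ExcludedMask & AllFPClasses;
}
```

For example, excluding both NaN classes and both infinities gives the mask
3 | 516 = 519, leaving only the zero, subnormal, and normal classes.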
I also think this should be allowed for stores to symmetrically handle
sret, but leave that for later.
Alternatively we could add a more expressive !fprange metadata,
but that would be much more complex. It's useful to match the attribute,
and more annotations can always be added.
Fixes #133560
`dereferenceable(<n>)` with n being potentially zero can come up when
using an operand bundle with a variable size. Currently this implies
that the pointer is non-null, even though `[nullptr, nullptr)` is a
valid range in any programming language I'm aware of. This patch removes
this implication and updates the language reference to reflect that
`dereferenceable` with a zero argument is valid.
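To illustrate the point about `[nullptr, nullptr)` being a valid empty
range, here is a small C++ sketch. It is only an analogy at the source
level and says nothing about how the attribute is implemented; `sumRange`
is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Sums the half-open range [Begin, Begin + N).
// A null Begin is fine as long as N is zero: no element is ever read.
int sumRange(const int *Begin, std::size_t N) {
  assert((Begin != nullptr || N == 0) && "null is only valid for an empty range");
  return std::accumulate(Begin, Begin + N, 0);
}
```

Calling `sumRange(nullptr, 0)` is well defined and returns 0, which is the
situation a zero-sized `dereferenceable` describes: zero bytes are
accessible, so nothing can be concluded about the pointer being non-null.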
For every type other than i1, ssub.sat x, y = 0 implies x == y. But
ssub.sat.i1 0, -1 = 0 (because the result of 1 saturates to 0).
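The i1 counterexample can be checked with a small model of signed
saturating subtraction. This is a sketch that does the arithmetic in
int64_t for narrow widths; `ssubSat` and `zeroImpliesEqual` are names
invented for this example, not LLVM functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Signed saturating subtraction on a Bits-wide two's complement integer.
// Operands are held in int64_t, so the exact difference cannot overflow.
int64_t ssubSat(int64_t X, int64_t Y, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32 && "model only covers narrow widths");
  const int64_t Min = -(int64_t{1} << (Bits - 1));
  const int64_t Max = (int64_t{1} << (Bits - 1)) - 1;
  return std::clamp(X - Y, Min, Max);
}

// Exhaustively checks whether ssub.sat(X, Y) == 0 implies X == Y.
bool zeroImpliesEqual(unsigned Bits) {
  const int64_t Min = -(int64_t{1} << (Bits - 1));
  const int64_t Max = (int64_t{1} << (Bits - 1)) - 1;
  for (int64_t X = Min; X <= Max; ++X)
    for (int64_t Y = Min; Y <= Max; ++Y)
      if (ssubSat(X, Y, Bits) == 0 && X != Y)
        return false;
  return true;
}
```

For i1 the representable values are -1 and 0, so 0 - (-1) = 1 is clamped
down to 0 and the implication fails. For wider types the maximum is at
least 1, so a nonzero difference never saturates to 0.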
The changes to instcombine are not strictly necessary. Instcombine
canonicalizes the ssub.sat.i1 before we arrive at these pattern-matches.
The real fix is in ValueTracking.
Nonetheless we agreed in review it makes sense to add these checks to
instcombine, even though they're currently unreachable:
https://github.com/llvm/llvm-project/pull/173742#issuecomment-3696631396
This was found by a fuzzer I'm working on!
Match the structure of ComputeKnownBits. Expose the condition
handling as a utility function so SimplifyDemandedFPClass can make
use of this. Avoids some redundant code and improves accuracy in
at least one case.
Avoid depending on the SimplifyQuery's context instruction, which may
be null, to find the function whose denormal mode should be used.
This avoids crashes in future patches.
Reapply the zero handling, reverted in
108a22ed5fa1836b4cfcd05e9d96f98a533068d5
The failing libc test should have been fixed by
e25eacf10c0d6718bad4e18e63757f97be9f9596
ptrtoaddr can be handled the same as ptrtoint here. The pointer known
bits cover the full pointer width, and ptrtoaddr either passes those
through directly or truncates to the address size.
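The pass-through-or-truncate behavior can be sketched with a toy known-bits
record. This is a simplified stand-in written for this example, assuming
values of at most 64 bits; it is not LLVM's KnownBits class, and
`ptrToAddrKnownBits` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Toy known-bits record: a set bit in Zero/One means that bit of the
// value is known to be 0/1. Width is the bit width of the value.
struct ToyKnownBits {
  uint64_t Zero;
  uint64_t One;
  unsigned Width;
};

// Known bits of the address, given known bits of the full pointer.
ToyKnownBits ptrToAddrKnownBits(ToyKnownBits Ptr, unsigned AddrWidth) {
  assert(Ptr.Width <= 64 && AddrWidth >= 1 && "toy model limit");
  if (AddrWidth >= Ptr.Width)
    return Ptr; // address covers the whole pointer: pass through
  // AddrWidth < 64 here, so the shift below is well defined.
  const uint64_t Mask = (uint64_t{1} << AddrWidth) - 1;
  return {Ptr.Zero & Mask, Ptr.One & Mask, AddrWidth};
}
```

Truncation only discards information about the high bits; whatever was
known about the low bits of the pointer carries over unchanged.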
This partially reverts commit 108a22ed5fa1836b4cfcd05e9d96f98a533068d5.
Restore the sign-bit tracking for the case where both inputs are known
negative, and leave the 0 handling for later. There is a libc test improperly
relying on running code compiled for IEEE behavior that changed
the output denormal mode.
Reverts llvm/llvm-project#174123
This caused test failures within LLVM libc. They can be reproduced by
doing a libc build against a clang with this commit included and running
`ninja -k 0 libc.test.src.math.smoke.log1p_test.__unit__
libc.test.src.math.smoke.log1p_test.__unit__.__NO_FMA_OPT`.
This already recognized that if both inputs are positive, the
result is positive. Extend this to the mirror situation with
negative inputs.
Also special case fadd x, x. Canonically, fmul x, 2 is fadd x, x.
We can tell the sign bit won't change, and 0 will propagate.
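The sign-bit reasoning can be spot-checked on the host with ordinary
doubles. This is a sketch assuming default IEEE round-to-nearest arithmetic
with no fast-math and no denormal flushing; the function names are made up
for this example.

```cpp
#include <cassert>
#include <cmath>

// Returns true if X + X has the same sign bit as X (non-NaN inputs only).
bool faddSelfKeepsSign(double X) {
  assert(!std::isnan(X) && "sketch only covers non-NaN inputs");
  return std::signbit(X + X) == std::signbit(X);
}

// Returns true if A + B has its sign bit set, for two inputs that both do.
bool negativeSumIsNegative(double A, double B) {
  assert(std::signbit(A) && std::signbit(B) && "both inputs must be negative");
  assert(!std::isnan(A) && !std::isnan(B) && "sketch only covers non-NaN inputs");
  return std::signbit(A + B);
}
```

Both helpers return true for zeros as well: -0.0 + -0.0 is -0.0, so a
negative zero input stays a negative zero, and overflow only produces an
infinity of the same sign.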
I'm working on optimizing out the tail sequences in the
implementations of the 4 different flavors of pow. These
include chains of selects on the various edge cases.
Related to #64870