Move the logic for inferring `KnownFPClass` from known bits into the Support
library so that it can be reused, e.g. by the analogous value tracking
functions in SelectionDAG.
BasicBlock::getTerminator() is frequently called on valid IR, yet the
function has to check that the last instruction is in fact a terminator,
even in release builds. This check can only be optimized away when the
instruction is dereferenced.
Therefore, introduce the functions hasTerminator() and
getTerminatorOrNull() as replacements and require (assert) that
getTerminator() always returns a valid terminator. As a side effect,
this forces explicit expression of intent at call sites when unfinished
basic blocks should be supported.
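As a rough illustration of the intended split, here is a toy model. `Inst` and `Block` are hypothetical stand-ins written for this sketch, not LLVM's classes; only the three method names come from the description above.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are not LLVM's classes.
struct Inst {
  bool IsTerminator;
};

class Block {
  std::vector<Inst> Insts;

public:
  void append(Inst I) { Insts.push_back(I); }

  // True if the block is non-empty and its last instruction is a terminator.
  bool hasTerminator() const {
    return !Insts.empty() && Insts.back().IsTerminator;
  }

  // For callers that must tolerate unfinished blocks: performs the check and
  // returns nullptr when there is no terminator.
  const Inst *getTerminatorOrNull() const {
    return hasTerminator() ? &Insts.back() : nullptr;
  }

  // For callers that know the block is well formed: the check is only an
  // assertion, so it disappears when NDEBUG is defined.
  const Inst *getTerminator() const {
    assert(hasTerminator() && "block has no terminator");
    return &Insts.back();
  }
};
```

A call site that may see a half-built block would use `getTerminatorOrNull()` and test the result; everything else would call `getTerminator()` and pay for no check in release builds.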
This removes dyn_cast invocations where the argument is already of the
target type (including through subtyping). This was created by adding a
static assert in dyn_cast and letting an LLM iterate until the code base
compiled. I then went through each example and cleaned it up. This does
not commit the static assert in dyn_cast, because it would prevent a lot
of uses in templated code. To prevent backsliding we should instead add
an LLVM-aware version of
https://clang.llvm.org/extra/clang-tidy/checks/readability/redundant-casting.html
(or expand the existing one).
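For reference, a minimal sketch of the kind of static assert described above. It uses a toy class hierarchy and `dynamic_cast`; the names `isRedundantCast` and `checkedDynCast` are hypothetical, and LLVM's real `dyn_cast` is built on `classof` rather than RTTI.

```cpp
#include <type_traits>

// Toy hierarchy for illustration only.
struct Value {
  virtual ~Value() = default;
};
struct Instruction : Value {};
struct CallInst : Instruction {};

// Hypothetical trait: casting From* to To* is redundant when From already is
// To or derives from it, because the implicit conversion always succeeds.
template <typename To, typename From>
constexpr bool isRedundantCast() {
  return std::is_base_of<To, From>::value;
}

// Toy dyn_cast that rejects redundant casts at compile time.
template <typename To, typename From>
To *checkedDynCast(From *V) {
  static_assert(!isRedundantCast<To, From>(),
                "dyn_cast to the same type or a base type is redundant");
  return dynamic_cast<To *>(V);
}
```

With this in place, `checkedDynCast<CallInst>(SomeValuePtr)` compiles, while `checkedDynCast<Value>(SomeCallInstPtr)` fails the static assert. In templated code the source type is often only sometimes the target type, which is why such an assert is too strict to commit.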
`frem` only produces finite numbers or NaN, never +/-Inf. Before the
patch `computeKnownFPClass` failed to clear the `fcInf` mask for
`Instruction::FRem`, causing potential missed optimizations.
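The property can be checked outside the compiler with `std::fmod`, which the LangRef describes as producing the same output as `frem`. The helper name `checkedRem` below is made up for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Computes the floating-point remainder and asserts the property the patch
// relies on: the result is finite or NaN, never +/-Inf.
inline double checkedRem(double X, double Y) {
  double R = std::fmod(X, Y);
  assert(!std::isinf(R) && "remainder should never be infinite");
  return R;
}
```

An infinite dividend or a zero divisor gives NaN, and an infinite divisor returns the (finite) dividend unchanged, so no combination of inputs yields an infinity.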
Fixes #186746.
This was only called on CondBr instructions, where it is always faster
to access the successors directly than to use successors().
Multi-edges don't dominate anything, so this rare case is often already
handled by dominates().
There is also a very small (hardly measurable) performance
improvement here (it did show up in profiles at 0.03% or so).
BranchInst currently represents both unconditional and conditional
branches. However, these are quite different operations that are often
handled separately. Therefore, split them into separate opcodes and
classes to allow distinguishing these operations in the type system.
Additionally, this also slightly improves compile-time performance.
While investigating #156233, it came up that select folds like the one at
https://alive2.llvm.org/ce/z/Y6jzj6 cannot be carried out, or easily
fixed for now, because integer reductions do not propagate noundef, even
if their arguments are noundef. This patch adds this propagation.
This helps canonicalize some address calculations, which would further
help fold immediates into memory load instructions in the backend.
The order changes to v_mad_u32_u24 are only because
@llvm.amdgcn.mul.u24.i32 was used in codegen prepare after this change.
They do not really change anything important.
LLVM converts a sqrt libcall to an intrinsic call if the argument is known
to be within range (greater than or equal to 0.0). In the case from the
linked issue the compiler is not able to deduce the non-negativity on its
own. Extend ValueTracking to understand such loops.
Add new APIs for matching intrinsics with three operands (previously these
existed only for two operands):
`matchSimpleTernaryIntrinsicRecurrence` and `matchThreeInputRecurrence`.
Fixes https://github.com/llvm/llvm-project/issues/174813
This is another instance of the logic from #183159. If we know
one source is not-infinity, and the other source is less than or
equal to 1, this cannot overflow. Special case llvm.amdgcn.trig.preop,
as a substitute for proper range tracking. This almost enables pruning
edge case handling in trig function implementations, if not for the
recursion depth limit (but that's a problem for another day).
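The arithmetic fact behind this can be sketched as follows, assuming the operation is a multiplication and that "less than or equal to 1" refers to magnitude (both are assumptions made for this illustration; `scaleNoOverflow` is a made-up name).

```cpp
#include <cassert>
#include <cmath>

// If X is not infinite and |Scale| <= 1, then |X * Scale| <= |X|, so the
// product cannot round to an infinity.
inline double scaleNoOverflow(double X, double Scale) {
  assert(!std::isinf(X) && "precondition: source is known not-infinity");
  assert(std::fabs(Scale) <= 1.0 && "precondition: |Scale| <= 1");
  return X * Scale;
}
```

Without the bound on the second operand the conclusion fails: multiplying the largest finite double by 2.0 overflows to infinity.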
Reverts llvm/llvm-project#180355
It caused assert failures:
```
llvm/include/llvm/IR/InstrTypes.h:2351:
Value *llvm::CallBase::getOperand(unsigned int) const:
Assertion `i_nocapture < OperandTraits<CallBase>::operands(this) &&
"getOperand() out of range!"' failed.
```
See comment on the PR for a reproducer.
`Constant::isZeroValue` currently behaves the same as
`Constant::isNullValue` for all types except floating-point, where it
additionally returns true for negative zero (`-0.0`). However, in
practice, almost all callers operate on integer/pointer types where the
two are equivalent, and the few FP-relevant callers have no meaningful
dependence on the `-0.0` behavior.
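The floating-point difference can be shown with plain doubles, assuming IEEE-754 binary64. The helper names are hypothetical and only mirror the two notions: all-zero bits (what `isNullValue` accepts for FP) versus numeric zero (what `isZeroValue` accepted).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of a double.
inline std::uint64_t bitsOf(double D) {
  std::uint64_t Bits;
  static_assert(sizeof(Bits) == sizeof(D), "double must be 64 bits");
  std::memcpy(&Bits, &D, sizeof(Bits));
  return Bits;
}

// Zero in the all-zero-bits sense: only +0.0 qualifies.
inline bool isAllZeroBits(double D) { return bitsOf(D) == 0; }

// Zero in the numeric sense: both +0.0 and -0.0 qualify.
inline bool isNumericZero(double D) { return D == 0.0; }
```

The two predicates disagree only on `-0.0`, whose sign bit is set even though it compares equal to `0.0`.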
This PR removes `isZeroValue` to eliminate the confusing API. All
callers are changed to `isNullValue` with no test failures.
`isZeroValue` will be reintroduced in a future change with clearer
semantics: when null pointers may have non-zero bit patterns,
`isZeroValue` will check for bitwise-all-zeros, while `isNullValue` will
check for the semantic null (which may be non-zero).
LLVM converts a sqrt libcall to an intrinsic call if the argument is known
to be within range (greater than or equal to 0.0). In the case from the
linked issue the compiler is not able to deduce the non-negativity on its
own. Extend ValueTracking to understand such loops.
Fixes llvm/llvm-project#174813
Switch to using computeKnownBits instead of computeConstantRange
in computeKnownFPClass's ldexp handling. This is preparation to
move the handling into KnownFPClass. Since KnownFPClass is in Support,
it can make use of KnownBits as the input argument. ConstantRange is in
IR, so it cannot be used from Support.
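To show how known bits can stand in for a range, here is a toy 32-bit model that derives signed bounds from known-zero and known-one masks. `ToyKnownBits` and the helper names are invented for this sketch and are not LLVM's `KnownBits` API.

```cpp
#include <cstdint>

// Toy 32-bit stand-in for a known-bits lattice; not LLVM's KnownBits class.
struct ToyKnownBits {
  std::uint32_t Zero; // bits known to be 0
  std::uint32_t One;  // bits known to be 1
};

constexpr std::uint32_t SignBit = 0x80000000u;

// Reinterprets a 32-bit pattern as a signed value, widened to 64 bits.
constexpr std::int64_t toSigned(std::uint32_t V) {
  return (V & SignBit) ? std::int64_t(V) - (std::int64_t(1) << 32)
                       : std::int64_t(V);
}

// Smallest signed value consistent with K: unknown bits become 0, except an
// unknown sign bit, which becomes 1.
constexpr std::int64_t signedMin(ToyKnownBits K) {
  return toSigned((K.Zero & SignBit) ? K.One : (K.One | SignBit));
}

// Largest signed value consistent with K: unknown bits become 1, except an
// unknown sign bit, which becomes 0.
constexpr std::int64_t signedMax(ToyKnownBits K) {
  return toSigned((K.One & SignBit) ? ~K.Zero : (~K.Zero & ~SignBit));
}
```

For example, if the top 28 bits of an exponent operand are known zero, the bounds are [0, 15], which is the kind of information a range query would otherwise have supplied.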
This functionally reverts fd5cfcc41311c6287e9dc408b8aae499501660e1 and
35ce17b6f6ca5dd321af8e6763554b10824e4ac4.
This was correct and necessary, but is causing performance regressions
since isGuaranteedNotToBeUndef is apparently not smart enough to see
through recurrences. Revert this for the release branch.
Also the test coverage was inadequate for the fma case, so add a new
case whose output differs with and without the check.
The handling for fma was very basic and only handled the
repeated input case. Re-use the fmul and fadd handling for more
accurate sign bit and nan handling.
This adds metadata analogous to the nofpclass attribute, asserting that
a value is not in a given set of floating-point classes.
This allows the same information to be expressed if a function
argument is passed indirectly. This matches the bitmask encoding
of nofpclass.
I also think this should be allowed for stores to symmetrically handle
sret, but leave that for later.
Alternatively we could add a more expressive !fprange metadata,
but that would be much more complex. It's useful to match the attribute,
and more annotations can always be added.
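For readers unfamiliar with the encoding, the sketch below classifies a double into the ten class bits used by `llvm.is.fpclass` and nofpclass and tests it against a forbidden mask. The enum and function names are made up for this example, and the signaling/quiet NaN split assumes the usual IEEE-754 binary64 layout where bit 51 is the quiet bit.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Bit positions follow the llvm.is.fpclass test mask; names are illustrative.
enum FPClassBit : unsigned {
  SNan = 1u << 0,
  QNan = 1u << 1,
  NegInf = 1u << 2,
  NegNormal = 1u << 3,
  NegSubnormal = 1u << 4,
  NegZero = 1u << 5,
  PosZero = 1u << 6,
  PosSubnormal = 1u << 7,
  PosNormal = 1u << 8,
  PosInf = 1u << 9,
};

// Equivalent of "nan inf": 0x1 | 0x2 | 0x4 | 0x200 == 519.
constexpr unsigned NoNanNoInf = SNan | QNan | NegInf | PosInf;

// Returns the single class bit that D belongs to.
inline unsigned classBit(double D) {
  if (std::isnan(D)) {
    std::uint64_t Bits;
    std::memcpy(&Bits, &D, sizeof(Bits));
    return (Bits & (1ull << 51)) ? QNan : SNan; // assumed quiet-bit position
  }
  const bool Neg = std::signbit(D);
  switch (std::fpclassify(D)) {
  case FP_INFINITE:
    return Neg ? NegInf : PosInf;
  case FP_ZERO:
    return Neg ? NegZero : PosZero;
  case FP_SUBNORMAL:
    return Neg ? NegSubnormal : PosSubnormal;
  default:
    return Neg ? NegNormal : PosNormal;
  }
}

// True if D falls in a class that the mask says must not occur.
inline bool violates(double D, unsigned ForbiddenMask) {
  return (classBit(D) & ForbiddenMask) != 0;
}
```

A value annotated with the mask 519 may be any zero, subnormal or normal number, but never a NaN or an infinity.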
Fixes #133560
`dereferenceable(<n>)` with n being potentially zero can come up when
using an operand bundle with a variable size. Currently this implies
that the pointer is non-null, even though `[nullptr, nullptr)` is a
valid range in any programming language I'm aware of. This patch removes
this implication and updates the language reference to reflect that
`dereferenceable` with a zero argument is valid.
For every type other than i1, ssub.sat x, y = 0 implies x == y. But
ssub.sat.i1 0, -1 = 0 (because the result of 1 saturates to 0).
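The i1 exception can be reproduced with a small model of signed saturating subtraction at an arbitrary bit width. This is a sketch for illustration (function names are made up), not the ValueTracking code.

```cpp
#include <cassert>
#include <cstdint>

// Signed saturating subtraction on Bits-wide two's-complement integers,
// computed in 64-bit arithmetic so the intermediate cannot overflow.
inline std::int64_t ssubSat(std::int64_t X, std::int64_t Y, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32 && "toy model supports 1..32 bits");
  const std::int64_t Max = (std::int64_t(1) << (Bits - 1)) - 1;
  const std::int64_t Min = -Max - 1;
  assert(X >= Min && X <= Max && Y >= Min && Y <= Max);
  const std::int64_t Diff = X - Y;
  return Diff > Max ? Max : (Diff < Min ? Min : Diff);
}

// Exhaustively checks "ssub.sat(x, y) == 0 implies x == y" for small widths.
inline bool zeroImpliesEqual(unsigned Bits) {
  assert(Bits >= 1 && Bits <= 8 && "keep the exhaustive search small");
  const std::int64_t Max = (std::int64_t(1) << (Bits - 1)) - 1;
  const std::int64_t Min = -Max - 1;
  for (std::int64_t X = Min; X <= Max; ++X)
    for (std::int64_t Y = Min; Y <= Max; ++Y)
      if (ssubSat(X, Y, Bits) == 0 && X != Y)
        return false;
  return true;
}
```

For a 1-bit type the signed maximum is itself 0, so saturating at the maximum produces 0 for unequal inputs; for every wider type neither saturation bound is 0 and the implication holds.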
The changes to instcombine are not strictly necessary. Instcombine
canonicalizes the ssub.sat.i1 before we arrive at these pattern-matches.
The real fix is in ValueTracking.
Nonetheless we agreed in review it makes sense to add these checks to
instcombine, even though they're currently unreachable:
https://github.com/llvm/llvm-project/pull/173742#issuecomment-3696631396
This was found by a fuzzer I'm working on!