The returned attribute can be used when it is possible to
"losslessly bitcast" between the argument and return type,
including between two vector types.
computeKnownBits() would crash in this case, and isKnownNonZero()
could potentially produce a miscompile.
Fixes https://github.com/llvm/llvm-project/issues/74722.
Operator allows the phi operand to be a ConstantExpr. A ConstantExpr is
a valid operand to a phi, but can never be a recurrence.
Only a BinaryOperator can form a recurrence, so match that instead.
I'm not sure whether it's possible to cause a miscompile due to
the missing check right now, as the affected values mechanism
effectively protects us against this. This becomes a problem for
an upcoming patch though.
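As a minimal sketch of the distinction (the helper and variable names here are illustrative, not the exact LLVM code):

```cpp
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Match the phi's incoming value as a BinaryOperator. dyn_cast<Operator>
// would also accept a ConstantExpr, which is a valid phi operand but can
// never be the step of a recurrence.
static BinaryOperator *matchRecurrenceStep(PHINode *PN, unsigned Idx) {
  return dyn_cast<BinaryOperator>(PN->getIncomingValue(Idx));
}
```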
This adds support for using dominating conditions in computeKnownBits()
when called from InstCombine. The implementation uses a
DomConditionCache, which stores which branches may provide information
that is relevant for a given value.
DomConditionCache is similar to AssumptionCache, but does not try to do
any kind of automatic tracking. Relevant branches have to be explicitly
registered and invalidated values explicitly removed. The necessary
tracking is done inside InstCombine.
The reason why this doesn't just do exactly the same thing as
AssumptionCache is that a lot more transforms touch branches and branch
conditions than assumptions. AssumptionCache is an immutable analysis
and mostly gets away with this because only a handful of places have to
register additional assumptions (mostly as a result of cloning). This is
very much not the case for branches.
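A rough sketch of the intended usage, based on the description above (simplified; the actual InstCombine integration and invalidation logic are omitted):

```cpp
#include "llvm/Analysis/DomConditionCache.h"
using namespace llvm;

void example(DomConditionCache &DC, BranchInst *BI, Value *V) {
  // Unlike AssumptionCache, nothing is tracked automatically: relevant
  // branches must be registered explicitly...
  DC.registerBranch(BI);
  // ...and can then be queried for the values their condition may
  // constrain.
  for (BranchInst *B : DC.conditionsFor(V)) {
    // B's condition can provide known-bits facts about V when the
    // corresponding edge dominates the context instruction.
  }
}
```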
This change regresses compile-time by about 0.2%. It also improves
stage2-O0-g builds by about 0.2%, which indicates that this change results
in additional optimizations inside clang itself.
Fixes https://github.com/llvm/llvm-project/issues/74242.
It's not safe for InstCombine to add disjoint metadata when converting
Add to Or otherwise.
I've added the noundef attribute to preserve existing test behavior.
We have a bunch of places where we have to guard against undef
to avoid multi-use issues, but would be fine with poison. Use a
different function for these to make the intent clear, and to indicate
that the check can be removed once we no longer support undef. I've
replaced some of the obvious cases, but there are probably more.
For now, the implementation is the same as the UndefOrPoison variant;
it just has a more precise name.
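For example, a guard that only needs undef-freeness might now read (sketch; `Op`, `AC`, `I`, and `DT` stand in for whatever the call site has available):

```cpp
// Poison at Op is acceptable for this multi-use transform; only undef
// could make different uses observe different values. For now this is
// equivalent to isGuaranteedNotToBeUndefOrPoison().
if (isGuaranteedNotToBeUndef(Op, &AC, &I, &DT)) {
  // Safe: replacing one use of Op cannot change what other uses see.
}
```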
If both icmps have the same operands and the RHS is constant, we
would currently go into the isImpliedCondMatchingOperands() code
path, instead of the isImpliedCondCommonOperandWithConstants()
path. Both are correct, but the latter can produce more accurate
results if the implication is dependent on the sign.
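A concrete sign-dependent case, worked through with ConstantRange the way the constants path does (illustrative, not the exact implementation): `X sgt 5` implies `X ugt 5`, but only because 5 is non-negative, so a purely predicate-based check cannot conclude it.

```cpp
#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/InstrTypes.h"
using namespace llvm;

bool sgt5ImpliesUgt5() {
  APInt Five(32, 5);
  // Values satisfying "X sgt 5": [6, 0x80000000).
  ConstantRange SGT =
      ConstantRange::makeExactICmpRegion(CmpInst::ICMP_SGT, Five);
  // Values satisfying "X ugt 5": [6, 0), i.e. [6, UINT_MAX] (wrapping).
  ConstantRange UGT =
      ConstantRange::makeExactICmpRegion(CmpInst::ICMP_UGT, Five);
  // The signed region is contained in the unsigned one, so the
  // implication holds.
  return UGT.contains(SGT); // true
}
```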
This reverts commit 96a0d714d58e48c363ee6abbbcdfd7a6ce646ac1.
Avoid assert with dynamic denormal-fp-math: we don't recognize compares
with 0 as an exact class test if we don't know the denormal mode. We could
try to do better here, but it's probably not worth it.
Fixes asserts reported after 1adce7d8e47e2438f99f91607760b825e5e3cc37
Call computeKnownBits() with SimplifyQuery to make sure it gets
all available analyses, even if more are added in the future.
As this code is performance-critical, I'm exporting the variant that
takes a by-ref KnownBits and a SimplifyQuery; the variant returning
KnownBits is measurably slower in this context.
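The exported shape is roughly the following (sketch; `V`, `Ty`, and `SQ` are placeholders for the caller's state):

```cpp
// By-ref variant: fills an existing KnownBits instead of returning a new
// one, avoiding a copy on this hot path. SQ carries DL, AC, DT and CtxI.
KnownBits Known(Ty->getScalarSizeInBits());
computeKnownBits(V, Known, /*Depth=*/0, SQ);
```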
For most practical purposes, the only KnownBits patterns we care about are
those involving a constant comparison RHS and a constant mask. However,
the actual implementation is written in a very general way, and of course
with basically no test coverage of those generalizations.
This patch reduces the implementation to only handle cases with constant
operands. The test changes are all in "make sure we don't crash" tests.
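The shape that remains supported is roughly the following (illustrative match, not the exact implementation; `Cond` is the condition being decomposed):

```cpp
// Only constant-operand forms like `icmp eq/ne (and %x, Mask), C` need
// to be handled: the comparison pins down the bits of %x covered by Mask.
ICmpInst::Predicate Pred;
Value *X;
const APInt *Mask, *C;
if (match(Cond, m_ICmp(Pred, m_And(m_Value(X), m_APInt(Mask)), m_APInt(C)))) {
  // Record the known bits of X implied by the comparison result.
}
```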
The motivation for this change is an upcoming patch to handle dominating
conditions in computeKnownBits(). Handling a non-constant RHS would add
significant additional compile-time overhead in that case, without any
significant impact on optimization quality.
We are already iterating over all assumes in AC, so handle
operand-bundle-based assumes in the same loop, instead of querying
them separately.
To keep the debug counter working, make it work per-bundle rather
than per-value.
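The resulting loop has roughly this shape (sketch; `allAssumes` is a hypothetical stand-in for the existing iteration over AC):

```cpp
// One pass over all assumes in the AssumptionCache, visiting each operand
// bundle directly instead of issuing a separate per-value query.
for (AssumeInst *Assume : allAssumes(AC)) {
  for (unsigned I = 0, E = Assume->getNumOperandBundles(); I != E; ++I) {
    OperandBundleUse Bundle = Assume->getOperandBundleAt(I);
    // Handle bundle-based assumptions (e.g. "align", "nonnull") here;
    // the debug counter is consulted once per bundle, not per value.
  }
}
```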
For all practical purposes, we only care about comparisons with
constant RHS in this code. In that case, an invert will be
canonicalized into the constant and it will be handled by other cases.
Given the complete lack of test coverage, I'm removing this code.
This causes asserts to fire:

```
llvm/lib/Analysis/ValueTracking.cpp:4262:
std::tuple<Value *, FPClassTest, FPClassTest> llvm::fcmpImpliesClass(CmpInst::Predicate, const Function &, Value *, const APFloat *, bool):
Assertion `(RHSClass == fcPosNormal || RHSClass == fcNegNormal || RHSClass == fcPosSubnormal || RHSClass == fcNegSubnormal) && "should have been recognized as an exact class test"' failed.
```
See comments on the PR.
> Previously we could recognize exact class tests performed by
> an fcmp with special values (0s, infs and smallest normal).
> Expand this to recognize the implied classes by a compare with a general
> constant. e.g. fcmp ogt x, 1 implies positive and non-0.
>
> The API should be better merged with fcmpToClassTest but that
> made the diff way bigger, will try to do that in a future
> patch.
This reverts commit dc3faf0ed0e3f1ea9e435a006167d9649f865da1.
Previously we could recognize exact class tests performed by
an fcmp with special values (0s, infs and the smallest normal).
Expand this to recognize the classes implied by a compare with a general
constant, e.g. `fcmp ogt x, 1` implies positive and non-zero.
The API should eventually be merged with fcmpToClassTest, but that
made the diff much bigger; I will try to do that in a future
patch.
zext nneg was recently added to the IR in #67982. This patch teaches
demanded bits and known bits about the semantics of the instruction, and
adds a couple of test cases to illustrate basic functionality.
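A minimal model of what known bits can conclude for the instruction (not the exact LLVM implementation):

```cpp
#include "llvm/Support/KnownBits.h"
using namespace llvm;

// `zext nneg` zero-extends a value that is asserted to be non-negative
// (otherwise the result is poison), so in addition to the new high bits,
// the source sign bit is known zero.
KnownBits zextNNegKnownBits(const KnownBits &Src, unsigned DstWidth) {
  unsigned SrcWidth = Src.getBitWidth();
  KnownBits Known = Src.zext(DstWidth); // new high bits are known zero
  Known.Zero.setBit(SrcWidth - 1);      // nneg: old sign bit is zero too
  return Known;
}
```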
Basic way to recursively analyze `select` in `isKnownNonEqual`: `select
%c, %t, %f` is non-equal to `%x` if `%t` is non-equal to `%x` and `%f`
is non-equal to `%x`.
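As a sketch of the recursion (names and signature simplified; the real logic lives inside isKnownNonEqual and threads depth and query state through):

```cpp
static bool knownNonEqual(const Value *A, const Value *B); // existing logic

// select %c, %t, %f is non-equal to %x if both arms are.
static bool selectNonEqual(const SelectInst *Sel, const Value *X) {
  return knownNonEqual(Sel->getTrueValue(), X) &&
         knownNonEqual(Sel->getFalseValue(), X);
}
```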
Currently, we specify that the ptrmask intrinsic allows the mask to have
any size, which will be zero-extended or truncated to the pointer size.
However, what the semantics of the specified GEP expansion actually imply
is that the mask is only meaningful up to the pointer type *index* size:
any higher bits of the pointer will always be preserved. In other words,
the mask gets 1-extended from the index size to the pointer size. This
is also the behavior we want for CHERI architectures.
This PR makes two changes:
* It spells out the interaction with the pointer type index size more
explicitly.
* It requires that the mask match the pointer type index size. The
intention here is to make handling of this intrinsic more robust, to
avoid accidental mix-ups of pointer size and index size in code
generating this intrinsic. If a zero-extend or truncate of the mask is
desired, it should just be done explicitly in IR. This also cuts down on
the amount of testing we have to do and the cases that transforms need
to check for.
As far as I can tell, we don't actually support pointers with different
index type size at the SDAG level, so I'm just asserting the sizes match
there for now. Out-of-tree targets using different index sizes may need
to adjust that code.
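A toy model of the semantics spelled out above (plain C++, not LLVM code):

```cpp
#include <cassert>
#include <cstdint>

// The mask has index-size width and is conceptually 1-extended to the
// pointer width, so bits above the index size are always preserved.
uint64_t ptrmaskModel(uint64_t PtrBits, uint64_t Mask, unsigned PtrSize,
                      unsigned IndexSize) {
  assert(IndexSize <= PtrSize && PtrSize <= 64);
  uint64_t IndexBits = IndexSize == 64 ? ~0ULL : ((1ULL << IndexSize) - 1);
  uint64_t Extended = (Mask & IndexBits) | ~IndexBits; // 1-extend the mask
  return PtrBits & Extended;
}
```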
This patch adds a new class "WithCache" which stores a pointer to
any type passable to computeKnownBits along with KnownBits
information which is computed on-demand when getKnownBits()
is called. This allows reusing the known bits information when it is
passed as an argument to multiple functions.
It also changes a few functions to accept WithCache arguments so that
known bits information computed in some callees can be propagated to
others from the top-level visitAddSub caller.
This gives a speedup of 0.14%:
https://llvm-compile-time-tracker.com/compare.php?from=499d41cef2e7bbb65804f6a815b9fa8b27efce0f&to=fbea87f1f1e6d5552e2bc309f8e201a3af6d28ec&stat=instructions:u
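Illustrative use (simplified; `LHS` and `SQ` stand in for the caller's value and SimplifyQuery):

```cpp
WithCache<const Value *> LHSCache(LHS);
// The first call computes and caches the known bits...
const KnownBits &K1 = LHSCache.getKnownBits(SQ);
// ...later calls, possibly in other helpers the cache is passed to,
// reuse the cached result instead of recomputing it.
const KnownBits &K2 = LHSCache.getKnownBits(SQ);
```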
This reverts commit b5743d4798b250506965e07ebab806a3c2d767cc.
This causes some minor compile-time impact. Revert for now; it's better
to do the change more gradually.
We can cover more cases by directly checking whether the result is
known non-zero for common patterns that are missing `OrZero`.
This patch adds `isKnownNonZero` checks for `shl`, `lshr`, `and`, and `mul`.
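For `shl`, the idea looks roughly like this (fragment from a switch over opcodes; argument order simplified, not the exact code):

```cpp
case Instruction::Shl:
  // shl of a power of two is a power of two *or zero*; if the caller did
  // not pass OrZero, we can still succeed when the result is known
  // non-zero (e.g. due to nuw or known bits of the operands).
  if (isKnownToBeAPowerOfTwo(I->getOperand(0), /*OrZero=*/true, Depth, Q))
    return OrZero || isKnownNonZero(I, Depth, Q);
  break;
```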
Differential Revision: https://reviews.llvm.org/D157309
1) If the LHS is constant:
- If the low bit of the LHS is set, the lower bound is non-zero.
- The upper bound can be capped at popcount(LHS) high bits.
2) If the RHS is constant:
- The upper bound can be capped at (Width - RHS) high bits.
In the failure case we return null, which callers are already checking
for. We were also returning fcNone, which was unused. It's more
consistent to return fcAllFlags, i.e. any possible value, so that the
result is always directly usable without checking the returned value.
While we pretty much always want to pass DT, AC and CxtI, most
places don't care about TLI. Add an overload where TLI is not
one of the first parameters.