InstSimplify currently folds patterns like `(x | y) uge x` and `(x & y)
ule x` to true. However, it cannot handle combinations of such
situations, such as `(x | y) uge (x & z)` etc.
To support this, recursively collect operands of monotonic instructions
(that preserve either a greater-or-equal or less-or-equal relationship)
and then check whether any of them match.
Fixes https://github.com/llvm/llvm-project/issues/69333.
Since cd16b07 (IR: introduce CmpInst::isEquivalence), there is now an
isEquivalence routine in CmpInst that we can use to determine
equivalence in simplifySelectWithICmpEq. Implement this, extending the
code from integer-equalities to integer and floating-point equivalences.
InstSimplify currently folds alloc1 == alloc2 to false, even if one of
them is a zero-size allocation. A zero-size allocation may have the same
address as another allocation.
This also disables the fold for the case where we're comparing a
zero-size alloc with the middle of another allocation. It's possible
that this case is legal to fold depending on our precise zero-size
allocation semantics, but LangRef currently doesn't specify this either
way, so we shouldn't make assumptions here.
Factor out and unify common code from InstSimplify and InstCombine that
partially guard against cross-lane vector operations into
llvm::isNotCrossLaneOperation in ValueTracking.
Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka
decomposeBitTestICmp() currently returns the result via two out
parameters plus an in-place modification of Pred. This changes it to
return an optional struct instead.
The motivation here is twofold. First, I'd like to extend this code to
handle cases where the comparison is against a value other than zero,
which would mean yet another out parameter. Second, while doing that I
was badly bitten by the in-place modification, so I'd like to get rid of
it.
The final case in Simplify (where Res == Absorber and the predicate is
inverted) is not generally safe when the simplification is a refinement.
In particular, we may simplify assuming a specific value for undef, but
then chose a different one later.
However, it *is* safe to refine poison in this context, unlike in the
equivalent select folds. This is the reason why this fold did not use
AllowRefinement=false in the first place, and using that option would
introduce a lot of test regressions.
This patch takes the middle path of disabling undef refinements in
particular using the getWithoutUndef() SimplifyQuery option. However,
this option doesn't actually work in this case, because the problematic
fold is inside constant folding, and we currently don't propagate this
option all the way from InstSimplify over ConstantFolding to
ConstantFold. Work around this by explicitly checking for undef operands
in simplifyWithOpReplaced().
Finally, make sure that places where AllowRefinement=false also use
Q.getWithoutUndef(). I don't have a specific test case for this (the
original one does not work because we don't simplify selects with
constant condition in this mode in the first place) but this seems like
the correct thing to do to be conservative.
Fixes https://github.com/llvm/llvm-project/issues/98753.
This patch avoids calling `isKnownNeverNaN` in `simplifyAndOrOfFCmps`
since `fcmp ord/uno X, NNAN` will be canonicalized into `fcmp ord/uno X,
0.0` in InstCombine.
In InstSimplify we already fold `fcmp ord/uno` to a constant when both
operands are known to be non-NaN. This change slightly generalizes this
to also handle the case where either of the operands is known to always
be NaN.
Proof: https://alive2.llvm.org/ce/z/AhCmJN
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
This patch adds folds for the cases where both operands are the same or
where it can be established that the first operand is less than, equal
to, or greater than the second operand.
Reapplying without changes. The flang+openmp buildbot failure
should be addressed by https://github.com/llvm/llvm-project/pull/94541.
-----
This is a followup to https://github.com/llvm/llvm-project/pull/93823
and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are
now left to the DataLayout-aware constant folder, which will fold
everything to a single i8 GEP.
We didn't have any test coverage for this fold in LLVM, but some Clang
tests change.
This is a followup to https://github.com/llvm/llvm-project/pull/93823
and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are
now left to the DataLayout-aware constant folder, which will fold
everything to a single i8 GEP.
We didn't have any test coverage for this fold in LLVM, but some Clang
tests change.
This preserves the flags if a constexpr GEP is created (at least
as long as they don't get dropped later -- the test cases uses a
constexpr index to avoid that).
foldIdentityShuffles requires two sets of canceling shuffles. If there
are any intervening instructions, they are feeding in the result of the
first set of shuffles. To eliminate the two sets of shuffles, you'd have
to rewrite the head of the intervening instructions to feed in the
operand of the first set of shuffles. Since modifying the IR in any way
is disallowed by an analysis, strip this bad TODO.
This patch is moving out following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice
from the experimental namespace.
All these intrinsics exist in LLVM for more than a year now, and are
widely used, so should not be considered as experimental.
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.
This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.
As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.
There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
Rename has/dropPoisonGeneratingFlagsOrMetadata to
has/dropPoisonGeneratingAnnotations and make it also handle
nonnull, align and range return attributes on calls, similar
to the existing handling for !nonnull, !align and !range metadata.
Change all the cstval_pred_ty based PatternMatch helpers (things like
m_AllOnes and m_Zero) to only allow poison elements inside vector
splats, not undef elements.
Historically, we used to represent non-demanded elements in vectors
using undef. Nowadays, we use poison instead. As such, I believe that
support for undef in vector splats is no longer useful.
At the same time, while poison splat elements are pretty much always
safe to ignore, this is not generally the case for undef elements. We
have existing miscompiles in our tests due to this (see the
masked-merge-*.ll tests changed here) and it's easy to miss such cases
in the future, now that we write tests using poison instead of undef
elements.
I think overall, keeping support for undef elements no longer makes
sense, and we should drop it. Once this is done consistently, I think we
may also consider allowing poison in m_APInt by default, as doing that
change is much less risky than doing the same with undef.
This change involves a substantial amount of test changes. For most
tests, I've just replaced undef with poison, as I don't think there is
value in retaining both. For some tests (where the distinction between
undef and poison is important), I've duplicated tests.
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
Teaching ConstantFoldLoadFromUniformValue that types that are padded in
memory can't be considered as uniform.
Using the big hammer to prevent optimizations when loading from a
constant for which DataLayout::typeSizeEqualsStoreSize would return
false.
Main problem solved would be something like this:
store i17 -1, ptr %p, align 4
%v = load i8, ptr %p, align 1
If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't
really tell where the padding goes. And even if we assume that the 15
most significant bits are padding, then they should be considered as
undefined (even if LLVM backend typically would pad with zeroes).
Anyway, for a big-endian target the load would read those most
significant bits, which aren't guaranteed to be one's. So it would be
wrong to constant fold the load as returning -1.
If LLVM IR had been more explicit about the placement of padding, then
we could allow the constant fold of the load in the example, but only
for little-endian.
Fixes: https://github.com/llvm/llvm-project/issues/81793
Fold gc.relocate of undef and null to undef and null respectively.
Similar transform is currently done by instcombine, but there is no
reason to not include it here as well.