Infer the 'exact' flag on an 'lshr' or 'ashr' instruction when the shift
amount is computed via a 'cttz' intrinsic on the same operand.
Proof: https://alive2.llvm.org/ce/z/CQR2PGFixes#131444.
This reapplies #132522.
Previously casts of scalable m_ImmConstant splats weren't being folded
by ConstantFoldCastOperand, triggering the "Constant-fold of ImmConstant
should not fail" assertion.
There are no changes to the code in this PR, instead we just needed
#133207 to land first.
A test has been added for the assertion in
llvm/test/Transforms/InstSimplify/vec-icmp-of-cast.ll
@icmp_ult_sext_scalable_splat_is_true.
<hr/>
#118806 fixed an infinite loop in FoldShiftByConstant that could occur
when the shift amount was a ConstantExpr.
However this meant that FoldShiftByConstant no longer kicked in for
scalable vectors because scalable splats are represented by
ConstantExprs.
This fixes it by allowing scalable splats of non-ConstantExprs in
m_ImmConstant, which also fixes a few other test cases where scalable
splats were being missed.
But I'm also hoping that UseConstantIntForScalableSplat will eventually
remove the need for this.
I noticed this when trying to reverse a combine on RISC-V in #132245,
and saw that the resulting vector and scalar forms were different.
#118806 fixed an infinite loop in FoldShiftByConstant that could occur
when the shift amount was a ConstantExpr.
However this meant that FoldShiftByConstant no longer kicked in for
scalable vectors because scalable splats are represented by
ConstantExprs.
This fixes it by allowing scalable splats of non-ConstantExprs in
m_ImmConstant, which also fixes a few other test cases where scalable
splats were being missed.
But I'm also hoping that UseConstantIntForScalableSplat will eventually
remove the need for this.
I noticed this when trying to reverse a combine on RISC-V in #132245,
and saw that the resulting vector and scalar forms were different.
---------
Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
The non-freeze poison argument to select can be one of the following: global,
constant, and noundef arguments.
Alive2 test validation: https://alive2.llvm.org/ce/z/jbtCS6
- **[InstSimplify] Refactor `simplifyWithOpsReplaced` to allow multiple
replacements; NFC**
- **[InstSimplify] Use multi-op replacement when simplify `select`**
In the case of `select X | Y == 0 :...` or `select X & Y == -1 : ...`
we can do more simplifications by trying to replace both `X` and `Y`
with the respective constant at once.
Handles some cases for https://github.com/llvm/llvm-project/pull/121672
more generically.
In the simplifySelectWithEquivalence fold, simplify both operands before
comparing them, instead of comparing one simplified operand with a
non-simplified operand. This is slightly more powerful.
Instead of only trying to constant fold the select arms, try to simplify
them. This subsumes https://github.com/llvm/llvm-project/pull/115969
which implements this for extractvalue only.
This is still fairly limited in that we will usually only call
FoldOpIntoSelect in the first place if we have a constant operand. This
can be relaxed in the future if worthwhile.
The idea behind this canonicalization is that it allows us to handle less
patterns, because we know that some will be canonicalized away. This is
indeed very useful to e.g. know that constants are always on the right.
However, this is only useful if the canonicalization is actually
reliable. This is the case for constants, but not for arguments: Moving
these to the right makes it look like the "more complex" expression is
guaranteed to be on the left, but this is not actually the case in
practice. It fails as soon as you replace the argument with another
instruction.
The end result is that it looks like things correctly work in tests,
while they actually don't. We use the "thwart complexity-based
canonicalization" trick to handle this in tests, but it's often a
challenge for new contributors to get this right, and based on the
regressions this PR originally exposed, we clearly don't get this right
in many cases.
For this reason, I think that it's better to remove this complexity
canonicalization. It will make it much easier to write tests for
commuted cases and make sure that they are handled.
Consider the following case:
```
%cmp = icmp eq ptr %p, null
%load = load i32, ptr %p, align 4
%sel = select i1 %cmp, i32 %load, i32 0
```
`foldSelectValueEquivalence` converts `load i32, ptr %p, align 4` into
`load i32, ptr null, align 4`, which causes immediate UB. `%load` is
speculatable, but it doesn't hold after operand substitution.
This patch introduces a new helper
`isSafeToSpeculativelyExecuteWithVariableReplaced`. It ignores operand
info in these instructions since their operands will be replaced later.
Fixes#99436.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
It is documented that immarg is only valid on intrinsic declarations,
although the verifier also tolerates it on intrinsic calls.
This patch updates tests that are not specifically testing the
behavior of the IR parser or verifier.
Simplify the arms of a select based on the KnownBits implied by its condition.
For now this only handles the case where the select arm folds to a constant,
but this can be generalized to handle other patterns by using
SimplifyDemandedBits instead (in that case we would also have to limit to
non-undef conditions).
This is implemented by adding a new member to SimplifyQuery that can be used
to inject an additional condition. The affected values are pre-computed and
we don't call computeKnownBits() if the select arms don't contain affected
values. This reduces the cost in some pathological cases.
We don't need the `noundef` check if the new simplification is a
constant.
This cleans up regressions from folding multiuse:
`(icmp eq/ne (sub/xor x, y), 0)` -> `(icmp eq/ne x, y)`.
Closes#88298
As the comment already indicates, only replacement with undef
is problematic, as it introduces an additional use of undef.
Use the correct ValueTracking helper.
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
Since all optimizations that use range metadata now also handle range attribute, this patch replaces writes of
range metadata for call instructions to range attributes.
We're replacing the select with the false value here, but it may be more
poisonous if m_Not contains poison elements. Fix this by introducing a
m_NotForbidPoison matcher and using it here.
Fixes https://github.com/llvm/llvm-project/issues/89500.
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.
This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.
As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.
There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
Change all the cstval_pred_ty based PatternMatch helpers (things like
m_AllOnes and m_Zero) to only allow poison elements inside vector
splats, not undef elements.
Historically, we used to represent non-demanded elements in vectors
using undef. Nowadays, we use poison instead. As such, I believe that
support for undef in vector splats is no longer useful.
At the same time, while poison splat elements are pretty much always
safe to ignore, this is not generally the case for undef elements. We
have existing miscompiles in our tests due to this (see the
masked-merge-*.ll tests changed here) and it's easy to miss such cases
in the future, now that we write tests using poison instead of undef
elements.
I think overall, keeping support for undef elements no longer makes
sense, and we should drop it. Once this is done consistently, I think we
may also consider allowing poison in m_APInt by default, as doing that
change is much less risky than doing the same with undef.
This change involves a substantial amount of test changes. For most
tests, I've just replaced undef with poison, as I don't think there is
value in retaining both. For some tests (where the distinction between
undef and poison is important), I've duplicated tests.
`and/or/xor` operations can each be changed to sum of logical
operations including operators other than themselves.
`x&y -> (x|y) ^ (x^y)`
`x|y -> (x&y) | (x^y)`
`x^y -> (x|y) ^ (x&y)`
if left of condition of `SelectInst` is `and/or/xor` logical
operation and right is equal to `0, -1`, or a `constant`, and
if `TrueVal` consist of `and/or/xor` logical operation then we
can optimize this case.
This patch implements this combination.
Proof: https://alive2.llvm.org/ce/z/WW8iRR
Fixes https://github.com/llvm/llvm-project/issues/71792.
When replacing with a non-constant, it's possible that the result of the
simplification is actually more complicated than the original, and may
result in an infinite combine loop.
Mitigate the issue by requiring that either the replacement or
simplification result is constant, which should ensure that it's
simpler. While this check is crude, it does not appear to cause
optimization regressions in real-world code in practice.
Fixes https://github.com/llvm/llvm-project/issues/83127.
This patch adds support for canonicalization of icmp with a scalable
splat. Some optimizations assume that `icmp pred X, APInt C` is in
canonical form.
Fixes https://github.com/llvm/llvm-project/issues/83931.
This is mostly NFC but some output does change due to consistently
inserting into poison rather than undef and using i64 as the index
type for inserts.
`InstCombine` uses `Worklist` to manage change history. `setOperand`,
which was previously used to change the `Select` Instruction, does not,
so it is `run` twice, which causes an `LLVM ERROR`.
This problem is resolved by changing `setOperand` to `replaceOperand` as
the change history will be registered in the Worklist.
Fixes#77553.
The disjoint flag was recently added to IR in #72583
We already set it when we turn an add into an or. This patch sets it on Ors that weren't converted from an Add.
When new phi nodes are inserted at the start of the block, the
order of these does not get canonicalized, as we pick the first
phi node to canonicalize towards (and the other phi nodes may have
already been visited). This results in phi nodes not being
deduplicated and thus a fix-point verification failure.
Fix this by remembering the predecessors of the first phi node
we encounter -- this will usually result in the same order we
picked previously. I also considered using predecessors() order
instead, but that causes substantial test fallout. Additionally
the predecessors() order tends to be the reverse of the "natural"
order.
Fixes https://github.com/llvm/llvm-project/issues/46688.