Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation for
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e., just lookup, no creation).
Extend decomposeBitTestICmp() to handle cases where the resulting
comparison is of the form `icmp (X & Mask) pred C` with non-zero
`C`. Add a flag to allow code to opt in to this behavior and use it in
the "log op of icmp" fold infrastructure.
This addresses regressions from #97289.
Proofs: https://alive2.llvm.org/ce/z/hUhdbU
decomposeBitTestICmp() currently returns the result via two out
parameters plus an in-place modification of Pred. This changes it to
return an optional struct instead.
The motivation here is twofold. First, I'd like to extend this code to
handle cases where the comparison is against a value other than zero,
which would mean yet another out parameter. Second, while doing that I
was badly bitten by the in-place modification, so I'd like to get rid of
it.
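The difference in API shape can be sketched as follows. This is a minimal, standalone illustration and not the LLVM code: `BitTest`, `Pred`, `SignedCmp`, `decomposeSketch` and `evaluate` are hypothetical names, and plain 8-bit integers stand in for `Value`/`APInt`. The point is only that a failed decomposition is an empty optional and the caller's predicate is never modified in place.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-ins; the real code works on llvm::Value and APInt.
enum class Pred { EQ, NE };
enum class SignedCmp { LT, GE, GT };

struct BitTest {
  uint8_t Mask; // mask applied to X
  Pred P;       // predicate comparing (X & Mask) against zero
};

// Decompose "X < 0" / "X >= 0" on i8 into a sign-bit test.
// Anything else is reported as std::nullopt instead of via out parameters.
std::optional<BitTest> decomposeSketch(SignedCmp Cmp, int8_t RHS) {
  if (RHS != 0)
    return std::nullopt;
  switch (Cmp) {
  case SignedCmp::LT:
    return BitTest{0x80, Pred::NE}; // X < 0  <=> (X & 0x80) != 0
  case SignedCmp::GE:
    return BitTest{0x80, Pred::EQ}; // X >= 0 <=> (X & 0x80) == 0
  default:
    return std::nullopt;
  }
}

// Evaluate the decomposed form for a concrete X.
bool evaluate(const BitTest &T, int8_t X) {
  bool IsZero = (static_cast<uint8_t>(X) & T.Mask) == 0;
  return T.P == Pred::EQ ? IsZero : !IsZero;
}

// Exhaustively compare the decomposed form against the original comparison.
void checkDecomposeSketch() {
  std::optional<BitTest> LT = decomposeSketch(SignedCmp::LT, 0);
  std::optional<BitTest> GE = decomposeSketch(SignedCmp::GE, 0);
  assert(LT && GE);
  for (int X = -128; X <= 127; ++X) {
    assert(evaluate(*LT, static_cast<int8_t>(X)) == (X < 0));
    assert(evaluate(*GE, static_cast<int8_t>(X)) == (X >= 0));
  }
}
```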
Added folds:
- `(add (sub X, Y), (sub Z, X))` -> `(sub Z, Y)`
- `(sub (add X, Y), (add X, Z))` -> `(sub Y, Z)`
The fold is typically handled in the `Reassociate` pass, but it fails
if the inner `sub`/`add` are multi-use. Less importantly, Reassociate
doesn't propagate flags correctly.
This patch adds the fold explicitly to InstCombine.
Proofs: https://alive2.llvm.org/ce/z/p6JyRP
Closes #105866
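As an informal companion to the proofs, the two identities can also be brute-forced over wrapping 8-bit arithmetic. This is only a sketch with hypothetical helper names; it checks the plain arithmetic and says nothing about flag propagation.

```cpp
#include <cassert>
#include <cstdint>

// (add (sub X, Y), (sub Z, X)) and its folded form (sub Z, Y), on wrapping i8.
uint8_t addOfSubsBefore(uint8_t X, uint8_t Y, uint8_t Z) {
  return uint8_t(uint8_t(X - Y) + uint8_t(Z - X));
}
uint8_t addOfSubsAfter(uint8_t Y, uint8_t Z) { return uint8_t(Z - Y); }

// (sub (add X, Y), (add X, Z)) and its folded form (sub Y, Z), on wrapping i8.
uint8_t subOfAddsBefore(uint8_t X, uint8_t Y, uint8_t Z) {
  return uint8_t(uint8_t(X + Y) - uint8_t(X + Z));
}
uint8_t subOfAddsAfter(uint8_t Y, uint8_t Z) { return uint8_t(Y - Z); }

// Exhaustive check over all 2^24 (X, Y, Z) combinations.
void checkAddSubFolds() {
  for (int X = 0; X < 256; ++X)
    for (int Y = 0; Y < 256; ++Y)
      for (int Z = 0; Z < 256; ++Z) {
        assert(addOfSubsBefore(X, Y, Z) == addOfSubsAfter(Y, Z));
        assert(subOfAddsBefore(X, Y, Z) == subOfAddsAfter(Y, Z));
      }
}
```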
getMaskedTypeForICmpPair() tries to model non-and operands as x & -1.
However, this can end up confusing the matching logic, by picking the -1
operand as the "common" operand, resulting in a successful, but useless,
match. This is what causes commutation failures for some of the
optimizations driven by this function.
Fix this by treating a match against -1 as a non-match.
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands
in other places. This only changes places where either the fold is
guaranteed to succeed, or the code doesn't use the resulting compare if
we fail to fold.
Addresses issue #88716.
Some function parameter names in the affected header files did not match
the parameter names in the definitions, or were listed in a different
order.
---------
Signed-off-by: Troy-Butler <squintik@outlook.com>
Since `DivRemPairPass` runs after `ReassociatePass` in the optimization
pipeline, I decided to do this simplification in `InstCombine`.
Alive2: https://alive2.llvm.org/ce/z/Jgsiqf
Fixes #76128.
Now that we don't accept undef splat in PatternMatch, we can remove some
uses of replaceUndefsWith(). I believe in all these cases only poison
splats are possible now, in which case no replacement is necessary.
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.
This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.
As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.
There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
Change all the cstval_pred_ty based PatternMatch helpers (things like
m_AllOnes and m_Zero) to only allow poison elements inside vector
splats, not undef elements.
Historically, we used to represent non-demanded elements in vectors
using undef. Nowadays, we use poison instead. As such, I believe that
support for undef in vector splats is no longer useful.
At the same time, while poison splat elements are pretty much always
safe to ignore, this is not generally the case for undef elements. We
have existing miscompiles in our tests due to this (see the
masked-merge-*.ll tests changed here) and it's easy to miss such cases
in the future, now that we write tests using poison instead of undef
elements.
I think overall, keeping support for undef elements no longer makes
sense, and we should drop it. Once this is done consistently, I think we
may also consider allowing poison in m_APInt by default, as doing that
change is much less risky than doing the same with undef.
This change involves a substantial amount of test changes. For most
tests, I've just replaced undef with poison, as I don't think there is
value in retaining both. For some tests (where the distinction between
undef and poison is important), I've duplicated tests.
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
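The effect of the parameter order on call sites can be sketched in isolation. The types below are hypothetical stand-ins, not the LLVM classes; the sketch only shows why a trailing, defaulted `Depth` combined with an implicit `DataLayout` to `SimplifyQuery` conversion lets existing `(Value, DataLayout)` callers compile unchanged.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only.
struct DataLayoutSketch {};

struct SimplifyQuerySketch {
  const DataLayoutSketch &DL;
  // Deliberately non-explicit, so a DataLayoutSketch converts implicitly.
  SimplifyQuerySketch(const DataLayoutSketch &DL) : DL(DL) {}
};

// Depth comes last and is defaulted, so most callers can omit it.
bool isKnownNonZeroSketch(int V, const SimplifyQuerySketch &Q,
                          unsigned Depth = 0) {
  (void)Q;
  (void)Depth;
  return V != 0; // placeholder analysis
}

void demoCallSites() {
  DataLayoutSketch DL;
  // Old-style call: value plus data layout, no depth.
  assert(isKnownNonZeroSketch(5, DL));
  // Recursive-style call that passes an explicit depth.
  assert(!isKnownNonZeroSketch(0, SimplifyQuerySketch(DL), /*Depth=*/1));
}
```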
This patch extends [D36234](https://reviews.llvm.org/D36234) to handle
`zext nneg` instructions.
I found this while adding support for cast instructions in
`getFreelyInvertedImpl`.
The matchFunnelShift function was doing pattern matching and creating
the fshl/fshr instruction if needed. Moved the pattern matching code to
function convertOrOfShiftsToFunnelShift. It can be reused for other
optimizations.
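For reference, the or-of-shifts shape that corresponds to a funnel shift can be checked on 8-bit values. This is a sketch of the arithmetic only, with hypothetical helper names; it does not reproduce the matching logic in convertOrOfShiftsToFunnelShift.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of an 8-bit fshl: concatenate Hi:Lo, shift left by
// Amt modulo 8, and keep the high byte of the 16-bit result.
uint8_t fshl8(uint8_t Hi, uint8_t Lo, unsigned Amt) {
  uint32_t Concat = (uint32_t(Hi) << 8) | Lo;
  return uint8_t((Concat << (Amt % 8)) >> 8);
}

// The or-of-shifts form. Only meaningful for 0 < Amt < 8 in this sketch,
// since a shift by the full bit width is not representable on i8.
uint8_t orOfShifts8(uint8_t X, uint8_t Y, unsigned Amt) {
  return uint8_t((X << Amt) | (Y >> (8 - Amt)));
}

// Exhaustive comparison of both forms for all inputs and valid amounts.
void checkFunnelShiftSketch() {
  for (int X = 0; X < 256; ++X)
    for (int Y = 0; Y < 256; ++Y)
      for (unsigned Amt = 1; Amt < 8; ++Amt)
        assert(orOfShifts8(X, Y, Amt) == fshl8(X, Y, Amt));
}
```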
Slightly generalize simplifyAndOrWithOpReplaced() by allowing it to
perform simplifications (without creating new instructions) in multi-use
cases. This way we can remove existing patterns without worrying about
multi-use edge cases.
I've opted to change the general way the implementation works to be more
similar to the standard simplifyWithOpReplaced(). We perform the operand
replacement generically, and then try to simplify the result or create a
new instruction if we're allowed to do so.
This patch canonicalizes the fcmp range check idiom into `fabs + fcmp`
since the canonicalized form is better than the original form for the
backends.
Godbolt: https://godbolt.org/z/x3eqPb1fz
```
and (fcmp olt/ole/ult/ule x, C), (fcmp ogt/oge/ugt/uge x, -C) --> fabs(x) olt/ole/ult/ule C
or (fcmp ogt/oge/ugt/uge x, C), (fcmp olt/ole/ult/ule x, -C) --> fabs(x) ogt/oge/ugt/uge C
```
Alive2: https://alive2.llvm.org/ce/z/MRtoYq
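A quick sanity check of the ordered `olt`/`ogt` variants on sample doubles, including signed zeros, infinities and NaN, is sketched below. It assumes a positive `C` and default (non fast-math) floating-point semantics, uses hypothetical helper names, and is no substitute for the Alive2 proof.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// and (fcmp olt x, C), (fcmp ogt x, -C)  vs.  fcmp olt fabs(x), C
bool insideBefore(double X, double C) { return X < C && X > -C; }
bool insideAfter(double X, double C) { return std::fabs(X) < C; }

// or (fcmp ogt x, C), (fcmp olt x, -C)  vs.  fcmp ogt fabs(x), C
bool outsideBefore(double X, double C) { return X > C || X < -C; }
bool outsideAfter(double X, double C) { return std::fabs(X) > C; }

void checkFabsRangeSketch() {
  const double Inf = std::numeric_limits<double>::infinity();
  const double NaN = std::numeric_limits<double>::quiet_NaN();
  const double Samples[] = {-Inf, -2.0, -1.0, -0.5, -0.0, 0.0,
                            0.5,  1.0,  2.0,  Inf,  NaN};
  const double Bounds[] = {0.5, 1.0, 2.0}; // positive C only (assumption)
  for (double C : Bounds)
    for (double X : Samples) {
      assert(insideBefore(X, C) == insideAfter(X, C));
      assert(outsideBefore(X, C) == outsideAfter(X, C));
    }
}
```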
`(ctpop (not x))` <-> `(sub nuw nsw BitWidth(x), (ctpop x))`. The
`sub` expression can sometimes be constant folded depending on the use
case of `(ctpop (not x))`.
This patch adds folds for the following cases:
`(add/sub/disjoint_or C, (ctpop (not x)))`
-> `(add/sub/disjoint_or C', (ctpop x))`
`(cmp pred C, (ctpop (not x)))`
-> `(cmp swapped_pred C', (ctpop x))`
Where `C'` depends on how we constant fold `C` with `BitWidth(x)` for
the given opcode.
Proofs: https://alive2.llvm.org/ce/z/qUgfF3
Closes #77859
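The underlying identity and two of the derived folds can be illustrated on 8-bit values. This is a sketch with hypothetical helper names; the `add` case and an unsigned-less-than compare are shown, and the compare case assumes `C <= 8` so that `8 - C` does not wrap.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

unsigned ctpop8(uint8_t X) { return unsigned(std::bitset<8>(X).count()); }

// add C, (ctpop (not x))  vs.  sub (C + 8), (ctpop x)
unsigned addBefore(unsigned C, uint8_t X) { return C + ctpop8(uint8_t(~X)); }
unsigned addAfter(unsigned C, uint8_t X) { return (C + 8) - ctpop8(X); }

// icmp ult C, (ctpop (not x))  vs.  icmp ugt (8 - C), (ctpop x); needs C <= 8.
bool cmpBefore(unsigned C, uint8_t X) { return C < ctpop8(uint8_t(~X)); }
bool cmpAfter(unsigned C, uint8_t X) { return (8 - C) > ctpop8(X); }

void checkCtpopNotSketch() {
  for (int X = 0; X < 256; ++X) {
    // Base identity: ctpop(~x) == BitWidth - ctpop(x).
    assert(ctpop8(uint8_t(~X)) == 8 - ctpop8(uint8_t(X)));
    for (unsigned C = 0; C <= 8; ++C) {
      assert(addBefore(C, uint8_t(X)) == addAfter(C, uint8_t(X)));
      assert(cmpBefore(C, uint8_t(X)) == cmpAfter(C, uint8_t(X)));
    }
  }
}
```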
This patch tries to canonicalize the pattern `Overflow | icmp pred Res,
C2` into `Overflow | icmp pred X, C2 +/- C1`, where `Overflow` and `Res`
are return values of `xxx.with.overflow X, C1`.
Alive2: https://alive2.llvm.org/ce/z/PhR_3S
Fixes #75360.
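One instance of this pattern, using an unsigned add with overflow and an unsigned greater-than compare on 8-bit values, can be brute-forced as below. The choice of intrinsic and predicate, the `C2 >= C1` precondition, and all helper names are assumptions made for this sketch; the patch itself may cover other combinations under other conditions.

```cpp
#include <cassert>
#include <cstdint>

struct AddOverflow8 {
  uint8_t Res;
  bool Overflow;
};

// Model of uadd.with.overflow on i8.
AddOverflow8 uaddWithOverflow8(uint8_t X, uint8_t C1) {
  unsigned Wide = unsigned(X) + C1;
  return {uint8_t(Wide), Wide > 0xFF};
}

// Overflow | icmp ugt Res, C2
bool before(uint8_t X, uint8_t C1, uint8_t C2) {
  AddOverflow8 R = uaddWithOverflow8(X, C1);
  return R.Overflow || R.Res > C2;
}

// Overflow | icmp ugt X, C2 - C1   (sketch assumes C2 >= C1)
bool after(uint8_t X, uint8_t C1, uint8_t C2) {
  return uaddWithOverflow8(X, C1).Overflow || X > uint8_t(C2 - C1);
}

void checkOverflowCmpSketch() {
  for (int X = 0; X < 256; ++X)
    for (int C1 = 0; C1 < 256; ++C1)
      for (int C2 = C1; C2 < 256; ++C2)
        assert(before(X, C1, C2) == after(X, C1, C2));
}
```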
When I originally added this fold, it did not actually fix my
motivating case, where the add was represented as an or. Now that
we have the disjoint flag this can finally be cleanly supported.
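The reason an `or` can stand in for an `add` here is that the two agree whenever the operands share no set bits, which is the property the disjoint flag records. A small exhaustive check on 8-bit values, with a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>

// True when A and B have no set bits in common.
bool isDisjoint(uint8_t A, uint8_t B) { return (A & B) == 0; }

// For every disjoint pair, or and (wrapping) add produce the same value.
void checkDisjointOrIsAdd() {
  for (int A = 0; A < 256; ++A)
    for (int B = 0; B < 256; ++B)
      if (isDisjoint(uint8_t(A), uint8_t(B)))
        assert(uint8_t(A | B) == uint8_t(A + B));
}
```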