Canonicalize GEP of GEP by swapping GEP with some suffix constant indices to the back (and GEP with all constant indices to the back of that), this allows more constant index GEP merging to happen. Exceptions are: If swapping violates use-def relations, or anti-optimizes LICM
For constant indexed GEP of GEP, if they cannot be merged directly, they will be casted to i8* and merged.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D125845
Relative to the previous attempt, this is rebased over the
InstSimplify fix in ac74e7a7806480a000c9a3502405c3dedd8810de,
which addresses the miscompile reported in PR58401.
-----
foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.
This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.
This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
Differential Revision: https://reviews.llvm.org/D134954
Relative to the previous attempt, this adjusts simplification to
use the correct context instruction: We need to use the terminator
of the incoming block, not the original instruction.
-----
foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.
This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.
This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
Differential Revision: https://reviews.llvm.org/D134954
The infinite loop seen on buildbots should be fixed by
11897708c0229c92802e747564e7c34b722f045f (assuming there are not
multiple infinite combine loops...)
-----
foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.
This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.
This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
Differential Revision: https://reviews.llvm.org/D134954
Reapply with a fix for the case where an operand simplified back
to the original phi: We need to map this case to the new phi node.
-----
foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.
This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.
This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.
This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.
This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
The LLVM performance tips suggest that allocas should be placed at the
beginning of the entry block. So far, llvm doesn’t provide any helper to
find that position.
Add BasicBlock::getFirstNonPHIOrDbgOrAlloca and IRBuilder::SetInsertPointPastAllocas(Function*)
that get an insert position after the (static) allocas at the start of a
function and use it in ShadowStackGCLowering.
Differential Revision: https://reviews.llvm.org/D132554
This transform came up as a potential DAGCombine in D133282,
so I wanted to see how it escaped in IR too.
We do general folds in InstCombiner::SimplifySelectsFeedingBinaryOp()
by checking if either arm of a select simplifies when the trailing
binop is threaded into the select.
So as long as one side simplifies, it's a good fold to combine a
negate and add into 1 subtract.
This is an example with a zero arm in the select:
https://alive2.llvm.org/ce/z/Hgu_Tj
And this models the tests with a cancelling 'not' op:
https://alive2.llvm.org/ce/z/BuzVV_
Differential Revision: https://reviews.llvm.org/D133369
When we do extractvalue (any_mul_with_overflow X, -1) --> (-X and icmp),
which left partly failed to match vector constant with poison element.
This patch try to fix it.
Alive2: https://alive2.llvm.org/ce/z/2rGp_3
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D132996
The stronger one-use checks prevented transforms like this:
(x * y) + x --> x * (y + 1)
(x * y) - x --> x * (y - 1)
https://alive2.llvm.org/ce/z/eMhvQa
This is one of the IR transforms suggested in issue #57255.
This should be better in IR because it removes a use of a
variable operand (we already fold the case with a constant
multiply operand).
The backend should be able to re-distribute the multiply if
that's better for the target.
Differential Revision: https://reviews.llvm.org/D132412
The code would use first non-phi instruction as an insertion point, however
this could lead to freeze getting inserted between phi and landingpad
causing a verifier assert.
Differential Revision: https://reviews.llvm.org/D132105
Replace the value-accepting isReallocLikeFn() overload with a
getReallocatedOperand() function, which returns which operand is
the one being reallocated. Currently, this is always the first one,
but once allockind(realloc) is respected, the reallocated operand
will be determined by the allocptr parameter attribute.
Use getFreedOperand() instead of isFreeCall() to remove the
implicit assumption that any pointer operand to a free function
is the operand being freed. This won't actually matter until we
handle allockind(free).
We currently assume in a number of places that free-like functions
free their first argument. This is true for all hardcoded free-like
functions, but with the new attribute-based design, the freed
argument is supposed to be indicated by the allocptr attribute.
To make sure we handle this correctly once allockind(free) is
respected, add a getFreedOperand() helper which returns the freed
argument, rather than just indicating whether the call frees *some*
argument.
This migrates most but not all users of isFreeCall() to the new
API. The remaining users are a bit more tricky.
When folding a binop into a select, we need to ensure that one
of the select arms actually does constant fold, otherwise we'll
create two binop instructions and perform the reverse transform.
Ensure this by performing an explicit constant folding attempt,
and failing the transform if neither side simplifies.
A simple alternative here would have been to limit the fold to
ImmConstants, but given the current representation of scalable
vector splats, this wouldn't be ideal.
Avoid calling ConstantExpr::get() for associative/commutative
binops, call ConstantFoldBinaryOpOperands() instead. We only
want to perform the reassociation of the constants actually fold.
When merging GEP of GEP with constant indices, if the second GEP's offset is not divisible by the first GEP's element size, convert both type to i8* and merge.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D125934
We really want to push freezes through recurrence phis, so that we
freeze only the start value, rather than the IV value on every
iteration. foldOpIntoPhi() already handles this for the case where
the transfer function doesn't produce poison, e.g.
%iv.next = add %iv, 1. However, this does not work if nowrap flags
are present, e.g. the very common %iv.next = add nuw %iv, 1 case.
This patch adds a fold that pushes freeze instructions to the start
value by checking whether all backedge values will be non-poison
after poison generating flags have been dropped. This allows pushing
freezes out of loops in most cases. I suspect that this also
obsoletes the CanonicalizeFreezeInLoops pass, and we can probably
drop it.
Fixes https://github.com/llvm/llvm-project/issues/56048.
Differential Revision: https://reviews.llvm.org/D127960
When pushing an operation across a phi node, we should avoid doing
so across a loop backedge. This is generally non-profitable, because
it does not reduce the number of times the operation is executed,
and could lead to an infinite combine loop.
The code was already guarding against this, but using an
insufficiently strong condition, which did not cover the case where
the operation was originally outside the loop (in which case the
transform moves the operation from outside the loop into the loop,
which is particularly undesirable).
Differential Revision: https://reviews.llvm.org/D127499
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName". This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.
This is the alternative to the less invasive clang-format only patch: D126783
Reviewed By: spatel, rengolin
Differential Revision: https://reviews.llvm.org/D126889
When reassociating GEPs, we can only keep inbounds if both original
GEPs were inbounds, and their offsets have the same sign. For the
sake of simplicity, I only handle the case where both offsets are
non-negative here.
It would probably be fine to just not preserve inbounds at all here,
but as I don't see a compile-time impact for adding the
isKnownNonNegative() calls I went with this more conservative
approach.
Fixes https://github.com/llvm/llvm-project/issues/44206.
Differential Revision: https://reviews.llvm.org/D126687
Even if the total offset is inbounds, we might represent it by first
performing a large negative offset and then a small positive one.
With inbounds semantics as currently specified, each offset must
be inbounds individually, not just the overall offset of the GEP.
Fix this by checking that the sign of all offsets is the same.
Fixes https://github.com/llvm/llvm-project/issues/55722.
If only one of the GEPs is inbounds, then after swapping, there is
no guarantee that one of them will be inbounds as well
(see e.g. https://alive2.llvm.org/ce/z/agaCnp).
This is only a partial fix, because even if both are inbounds, the
result is not necessarily inbounds (if the offsets have different
signs).
As the long explanatory comment attests, performing the modification
in place is pretty tricky. Drop this unnecessary complexity and
always create new instructions.
This should be NFC-ish, but can probably cause difference due to
worklist order.
Use IRBuilder so that the newly created freeze instructions
automatically gets inserted back into the IC worklist.
The changed worklist processing order leads to some cosmetic
differences in tests.
Fixes https://github.com/llvm/llvm-project/issues/55619.
We commonly want to create either an inbounds or non-inbounds GEP
based on a boolean value, e.g. when preserving inbounds from
existing GEPs. Directly accept such a boolean in the API, rather
than requiring a ternary between CreateGEP and CreateInBoundsGEP.
This change is not entirely NFC, because we now preserve an
inbounds flag in a constant expression edge-case in InstCombine.
If there is a freeze %x, we currently replace all other uses of %x
with freeze %x -- as long as they are dominated by the freeze
instruction. This patch extends this behavior to cases where we
did not originally dominate the use by moving the freeze
instruction directly after the definition of the frozen value.
The motivation can be seen in test @combine_and_after_freezing_uses:
Canonicalizing everything to freeze %x allows folds that are based
on value identity (i.e. same operand occurring in two places) to
trigger. This also covers the case from D125248.
Differential Revision: https://reviews.llvm.org/D125321
Currently, two GEPs will only be combined if the result element
type of one is the same as the source element type of the other.
However, this means we may miss folding opportunities where the
second GEP could be rewritten using a different element type. This
is especially relevant for opaque pointers, where constant GEPs
often use i8 element type.
Address this by converting GEP indices to offsets, adding them,
and then converting them back to indices. The first (inner) GEP
is allowed to have variable indices as well, in which case only
the constant suffix is converted into an offset.
This should address the regression reported in
https://reviews.llvm.org/D123300#3467615.
Differential Revision: https://reviews.llvm.org/D124459
We can always replace the undef elements in a vector constant
with regular constants to get rid of the freeze:
https://alive2.llvm.org/ce/z/nfRb4F
The select diffs show that we might do better by adjusting the
logic for a frozen select condition. We may also want to refine
the vector constant replacement to consider forming a splat.
Differential Revision: https://reviews.llvm.org/D123962
The description was ambiguous about the behavior
when boths select arms are constant or both arms
are not constant. I don't think there's any
evidence to support either way, but this matches
the code with a more specified description.
We can extend this to deal with vector constants
with undef/poison elements. Currently, those don't
get folded anywhere.
It actually implements support for seeing through loads, using alias analysis to
refine the result.
This is rather limited, but I didn't want to rely on more than available
analysis at that point (to be gentle with compilation time), and it does seem to
catch common scenario, as showcased by the included tests.
Differential Revision: https://reviews.llvm.org/D122431