Removing debug-intrinsics requires that we always insert with an
iterator, not with an instruction position. To enforce that, we need to
eliminate the `Instruction *` taking functions. It's safe to leave the
insert-at-end-of-block functions as the intention is clear for debug
info purposes (i.e., insert after both instructions and debug-info at
the end of the block).
This patch demonstrates how that needs to happen. At a variety of
call-sites to the `CreateNeg` constructor we need to consider:
* Has this instruction been selected because of the operation it
performs? In that case, just call `getIterator` and pass an iterator in.
* Has this instruction been selected because of its position? If so, we
need to keep the iterator identifying that position (see the 3rd hunk
changing Reassociate.cpp, although it's coincidentally not debug-info
significant).
This also demonstrates what we'll try to do with the constructor
methods going forward: have one fully explicit set of parameters
including an iterator, and another with default arguments where the
block-to-insert-into argument defaults to nullptr / no-position,
creating an instruction that hasn't been inserted yet.
This reverts commit a034e65e972175a2465deacb8c78bc7efc99bd23.
Some protobuf users reported that this patch caused a significant
compile-time regression because `TailDuplicator` works poorly with a
specific pattern.
We will reland it once the codegen issue is fixed.
We have a bunch of places where we have to guard against undef
to avoid multi-use issues, but would be fine with poison. Use a
different function for these to make it clear, and to indicate that
this check can be removed once we no longer support undef. I've
replaced some of the obvious cases, but there's probably more.
For now, the implementation is the same as UndefOrPoison; it just
has a more precise name.
CVP currently tries to fold load/store pointer operands to constants
using LVI. If there is a dominating condition of the form `icmp eq ptr
%p, @g`, then `%p` will be replaced with `@g`.
LVI is geared towards range-based optimizations, and is *very*
inefficient at handling simple pointer equality conditions. We have
other passes that can handle this optimization in a more efficient way,
such as IPSCCP and GVN.
Removing this optimization gives a geomean 0.4-1.2% compile-time
improvement depending on configuration. At the same time, there
is no impact on codegen.
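For reference, a hand-written sketch of the kind of fold being removed
(the names @g, @f and the control flow are made up for illustration):

@g = global i32 0

define i32 @f(ptr %p) {
entry:
  %cmp = icmp eq ptr %p, @g
  br i1 %cmp, label %if, label %exit

if:                                   ; LVI knows %p == @g on this path, so CVP
  %v = load i32, ptr %p               ; would rewrite this to: load i32, ptr @g
  ret i32 %v

exit:
  ret i32 0
}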
We may be able to replace the select with one of its select arms
at some but not all uses.
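A hand-written example of the non-phi case (the helper @use and all
names are made up):

define void @f(i1 %c, i32 %a, i32 %b) {
entry:
  %sel = select i1 %c, i32 %a, i32 %b
  call void @use(i32 %sel)            ; nothing is known about %c here
  br i1 %c, label %then, label %exit

then:                                 ; %c is known true on this path, so this
  call void @use(i32 %sel)            ; use of %sel can be replaced with %a
  br label %exit

exit:
  ret void
}

declare void @use(i32)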
For uses in phi nodes this was already handled in the getValueOnEdge()
fold (which we can't drop entirely because it also handles an additional
case).
Fixes https://github.com/llvm/llvm-project/issues/63756.
In this case we don't need to emit the comparison and select.
This is papering over a weakness in CVP in that newly added
instructions don't get revisited. If they were revisited, the
icmp would be folded at that point.
However, even without that it makes sense to handle this explicitly,
because it avoids the need to insert freeze, which may prevent
further analysis of the operation by LVI.
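As a rough reconstruction of the scenario (this assumes the usual
compare+select expansion of urem; the names and the masks pinning down
the ranges are mine):

define i8 @f(i8 %x, i8 %y) {
  %lhs = and i8 %x, 15                ; %lhs is in [0, 15]
  %rhs = or i8 %y, 16                 ; %rhs is in [16, 255]
  %rem = urem i8 %lhs, %rhs           ; LHS u< RHS is already known, so the result
  ret i8 %rem                         ; is just %lhs; no freeze, icmp or select needed
}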
Proofs: https://alive2.llvm.org/ce/z/quyBxp
Fixes https://github.com/llvm/llvm-project/issues/63330.
DFAJumpThreading
JumpThreading
LibCallsShrink
LoopVectorize
SLPVectorizer
DeadStoreElimination
AggressiveDCE
CorrelatedValuePropagation
IndVarSimplify
These passes are part of the optimization pipeline, whose legacy version is deprecated and being removed.
As a side effect, this switches them to use getConstantRange() rather
than getPredicateAt(). getPredicateAt() is not supposed to be more
powerful than getConstantRange() for non-equality comparisons (as
long as block values are used).
When an instruction is only used in a select or phi operand, we might
be able to make use of additional information from the select/branch
condition. For example in
%sub = call i16 @llvm.usub.sat.i16(i16 %x, i16 10)
%cmp = icmp uge i16 %x, 10
%sel = select i1 %cmp, i16 %sub, i16 42
the usub.sat is only used in a select where %x uge 10 is known to
hold, so we can fold it based on that knowledge.
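After the fold, the sequence becomes (illustrative result; whether
flags such as nuw are attached may differ):
%sub = sub i16 %x, 10
%cmp = icmp uge i16 %x, 10
%sel = select i1 %cmp, i16 %sub, i16 42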
This addresses the regression reported at
https://reviews.llvm.org/D140798#4039748, but also provides a
solution to a recurring problem we've had, where we fail to make
use of range information after a branch+phi has been converted
into a select. Our current solution to this is to hope that IPSCCP
can perform the fold before that happens, but handling this in LVI
is a somewhat more general solution.
Currently we only make use of this for the willNotOverflow() fold,
but I plan to adjust other folds to use the new API as well.
Differential Revision: https://reviews.llvm.org/D141482
For `srem x, y`, if abs(constant range of x) is less than abs(constant
range of y), we can simplify it as:
`srem x, y => x` if y is guaranteed to be positive.
`srem x, y => -x` if y is guaranteed to be negative.
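A hand-written example of the positive-divisor case (the masking just
establishes the ranges; not taken from the patch's tests):

define i8 @f(i8 %x, i8 %y) {
  %lhs = srem i8 %x, 5                ; %lhs is in [-4, 4]
  %t = and i8 %y, 7
  %rhs = or i8 %t, 8                  ; %rhs is in [8, 15], always positive
  %rem = srem i8 %lhs, %rhs           ; abs(%lhs) < %rhs, so %rem == %lhs
  ret i8 %rem
}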
Differential Revision: https://reviews.llvm.org/D140405
As per the post-commit feedback, that was not the correct precondition
to avoid it here. I think we should generally start changing our
mentality about `freeze`; the fact that we have been conditioned to be
afraid of it (or of anything in LLVM in general) is the key problem here.
This kind of thing happens really frequently in LLVM's very own
shuffle combining methods, and it is even considered bad practice
to use `%` there instead of using this expansion directly.
Though many of the cases there have variable divisors,
so this won't help everything.
Simple case: https://alive2.llvm.org/ce/z/PjvYf-
There's an alternative expansion via `umin`:
https://alive2.llvm.org/ce/z/hWCVPb
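Roughly, the two expansions look like this (a sketch with made-up
operand names; the precondition on %x and %y is the one from the
proofs linked above):

; first expansion:
%cmp = icmp ult i8 %x, %y
%sub = sub i8 %x, %y
%rem = select i1 %cmp, i8 %x, i8 %sub

; `umin` expansion:
%sub2 = sub i8 %x, %y
%rem2 = call i8 @llvm.umin.i8(i8 %sub2, i8 %x)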
BUT while we can transform the first expansion
into the `umin` one (e.g. for SCEV):
https://alive2.llvm.org/ce/z/iNxKmJ
... we can't go in the opposite direction.
Also, the non-`umin` expansion seems somewhat more codegen-friendly:
https://godbolt.org/z/qzjx5bqWK
https://godbolt.org/z/a7bj1axbx
There's a second variant of the precondition:
https://alive2.llvm.org/ce/z/zE6cbM
but there the numerator must be non-undef / must be frozen.
CVP currently only tries to simplify comparisons if there is a
constant operand. However, even if both are non-constant, we may
be able to determine the result of the comparison based on range
information.
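For example (hand-written; the ranges come from the dominating
condition, and all names are made up):

define i1 @f(i8 %a, i8 %b) {
entry:
  %a.ok = icmp ult i8 %a, 10
  %b.ok = icmp ugt i8 %b, 20
  %both = and i1 %a.ok, %b.ok
  br i1 %both, label %if, label %exit

if:                                   ; %a is in [0, 10) and %b is in (20, 255],
  %cmp = icmp ult i8 %a, %b           ; so this compare is always true
  ret i1 %cmp

exit:
  ret i1 false
}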
IPSCCP is already capable of doing this, but because it runs very
early, it may miss some cases.
Differential Revision: https://reviews.llvm.org/D137253
This patch fixes an issue in which CorrelatedValuePropagation::processSRem
would create new instructions to represent the SRem instruction, but would not
correctly copy any existing debug location metadata to the new instructions.
Differential Revision: https://reviews.llvm.org/D132218
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName". This patch touches a lot of files but gets the cleanup
of InstructionSimplify done in one commit.
This is the alternative to the less invasive clang-format only patch: D126783
Reviewed By: spatel, rengolin
Differential Revision: https://reviews.llvm.org/D126889
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.
The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.
Differential Revision: https://reviews.llvm.org/D125557
Previously we took the old name and always appended a numeric suffix.
Since we're doing a 1:1 replacement, it's clearer to keep the original
name exactly.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D125281