This addresses an optimization regression in Rust we have observed after
https://github.com/llvm/llvm-project/pull/82458. We now only perform
pointer replacement if they have the same underlying object. However,
getUnderlyingObject() by default only looks through linear chains, not
selects/phis. In particular, this means that we miss cases involving
involving pointer induction variables.
This patch fixes this by introducing a new helper
getUnderlyingObjectAggressive() which basically does what
getUnderlyingObjects() does, just specialized to the case where we must
arrive at a single underlying object in the end, and with a limit on the
number of inspected values.
Doing this more expensive underlying object check has no measurable
compile-time impact on CTMark.
Consider the following case:
```
%cmp = icmp eq ptr %p, null
%load = load i32, ptr %p, align 4
%sel = select i1 %cmp, i32 %load, i32 0
```
`foldSelectValueEquivalence` converts `load i32, ptr %p, align 4` into
`load i32, ptr null, align 4`, which causes immediate UB. `%load` is
speculatable, but it doesn't hold after operand substitution.
This patch introduces a new helper
`isSafeToSpeculativelyExecuteWithVariableReplaced`. It ignores operand
info in these instructions since their operands will be replaced later.
Fixes#99436.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
Add simple support for looking through a zext when doing
ComputeKnownSignBits for shl. This is valid for the case when
all extended bits are shifted out, because then the number of sign
bits can be found by analysing the zext operand.
The solution here is simple as it only handle a single zext (not
passing remaining left shift amount during recursion). It could be
possible to generalize this in the future by for example passing an
'OffsetFromMSB' parameter to ComputeNumSignBitsImpl, telling it to
calculate number of sign bits starting at some offset from the most
significant bit.
`llvm.vector.reverse` preserves each of the elements and thus elements
common to them.
Alive2 doesn't support the intrin yet, but the logic seems pretty
self-evident.
Closes#99013
The patch enables detection of minnum/maxnum patterns for float point
instruction, represented as select/cmp. Also, enables better cost
estimation for integer min/max patterns since the compiler starts
to estimate the scalars separately.
Reviewers: nikic, RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/98570
Add KnownBits computations to ValueTracking and X86 DAG lowering.
These instructions add/subtract adjacent vector elements in their operands. Example: phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]. This means that, in this example, we can compute the KnownBits of the operation by computing the KnownBits of [X1, X2] + [X1, X2] and [Y1, Y2] + [Y1, Y2] and intersecting the results. This approach also generalizes to all x86 vector types.
There are also the operations phadd.sw and phsub.sw, which perform saturating addition/subtraction. Use sadd_sat and ssub_sat to compute the KnownBits of these operations.
Also adjust the existing test case pr53247.ll because it can be transformed to a constant using the new KnownBits computation.
Fixes#82516.
The patch enables detection of minnum/maxnum patterns for float point
instruction, represented as select/cmp. Also, enables better cost
estimation for integer min/max patterns since the compiler starts
to estimate the scalars separately.
Reviewers: nikic, RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/98570
The logic in llvm::getVectorConstantRange() can be a bit
inconvenient to use in some cases because of the need to handle
the scalar case separately. Generalize it to handle all constants,
and move it to live directly on Constant.
When using stripPointerCasts() and getUnderlyingObject(), don't
strip through a bitcast from ptr to <1 x ptr>, which is not a
no-op pointer cast. Calling code is generally not prepared to
handle that situation, resulting in incorrect alias analysis
results for example.
Fixes https://github.com/llvm/llvm-project/issues/97600.
Add a common helper used for computeConstantRange() and LVI. The
implementation is a mix of both, with the efficient handling for
ConstantDataVector taken from computeConstantRange(), and the
general handling (including non-splat poison) from LVI.
Previously we only handled the `L0 == R0` case if both `L1` and `R1`
where constant.
We can get more out of the analysis using general constant ranges
instead.
For example, `X u> Y` implies `X != 0`.
In general, any strict comparison on `X` implies that `X` is not equal
to the boundary value for the sign and constant ranges with/without
sign bits can be useful in deducing implications.
Closes#85557
Extract an adjustKnownBitsForSelectArm() helper for the
ValueTracking logic and make use of it in SimplifyDemandedBits().
This fixes a consistency violation under instcombine-verify-known-bits.
Simplify the arms of a select based on the KnownBits implied by its condition.
For now this only handles the case where the select arm folds to a constant,
but this can be generalized to handle other patterns by using
SimplifyDemandedBits instead (in that case we would also have to limit to
non-undef conditions).
This is implemented by adding a new member to SimplifyQuery that can be used
to inject an additional condition. The affected values are pre-computed and
we don't call computeKnownBits() if the select arms don't contain affected
values. This reduces the cost in some pathological cases.
This PR is intended to address the limited SLPVectorizer support of tan
raised in the comments of this PR:
https://github.com/llvm/llvm-project/pull/94559.
Right now emitting the tan intrinsisic allows you to vectorize tan, but
emitting the libfunc does not. to address this the libcall needs to be
mapped to the intrinsic. and the libcall and function name need to be
marked approriately so they can be optimized or defined as a call
lowering.
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
gep nuw can be null if and only if both the base pointer and offset
are null. Unlike the inbounds case this does not depend on whether
the null pointer is valid.
Proofs: https://alive2.llvm.org/ce/z/PLoqK5
I am using `isKnownInversion` in the following pr
https://github.com/llvm/llvm-project/pull/94915
it is useful to have the method in a shared class so I can reuse it. I am not sure if `ValueTracking` is the correct place but it looks like most of the methods with the pattern `isKnownX` belong there.
This defines a new kind of IR Constant that represents a ptrauth signed
pointer, as used in AArch64 PAuth.
It allows representing most kinds of signed pointer constants used thus
far in the llvm ptrauth implementations, notably those used in the
Darwin and ELF ABIs being implemented for c/c++. These signed pointer
constants are then lowered to ELF/MachO relocations.
These can be simply thought of as a constant `llvm.ptrauth.sign`, with
the interesting addition of discriminator computation: the `ptrauth`
constant can also represent a combined blend, when both address and
integer discriminator operands are used. Both operands are otherwise
optional, with default values 0/null.
The ops supported are: `add`, `sub`, `xor`, `or`, `umax`, `uadd.sat`
Proofs: https://alive2.llvm.org/ce/z/8ZMSRg
The `add` case actually comes up in SPECInt, the rest are here mostly
for completeness.
Closes#88579
According to IEEE Std 754-2019, `sqrt` returns nan when the input is
negative (except for -0). In this case, we cannot make assumptions about
sign bit of the result.
Fixes https://github.com/llvm/llvm-project/issues/92217
`(icmp ule/ult (add nuw X, Y), C)` implies both `(icmp ule/ult X, C)` and
`(icmp ule/ult Y, C)`. We can use this to deduce leading zeros in `X`/`Y`.
`(icmp uge/ugt (sub nuw X, Y), C)` implies `(icmp uge/uge X, C)` . We
can use this to deduce leading ones in `X`.
Proofs: https://alive2.llvm.org/ce/z/sc5k22Closes#87180
This patch relands https://github.com/llvm/llvm-project/pull/86409.
I mistakenly thought that `Known.makeNegative()` clears the sign bit of
`Known.Zero`. This patch fixes the assertion failure by explicitly
clearing the sign bit.
There is a missed optimization in
``` llvm
define i8 @known_power_of_two_rust_next_power_of_two(i8 %x, i8 %y) {
%2 = add i8 %x, -1
%3 = tail call i8 @llvm.ctlz.i8(i8 %2, i1 true)
%4 = lshr i8 -1, %3
%5 = add i8 %4, 1
%6 = icmp ugt i8 %x, 1
%p = select i1 %6, i8 %5, i8 1
%r = urem i8 %y, %p
ret i8 %r
}
```
which is extracted from the Rust code
``` rust
fn func(x: usize, y: usize) -> usize {
let z = x.next_power_of_two();
y % z
}
```
Here `%p` (a.k.a `z`) is semantically a power-of-two, so `y urem p` can
be optimized to `y & (p - 1)`. (Alive2 proof:
https://alive2.llvm.org/ce/z/H3zooY)
---
It could be generalized to recognizing `LShr(UINT_MAX, Y) + 1` as a
power-of-two, which is what this PR does.
Alive2 proof: https://alive2.llvm.org/ce/z/zUPTbc