1453 Commits

Author SHA1 Message Date
Alexey Bataev
8ff233f4f1 [SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats.
The patch enables detection of minnum/maxnum patterns for float point
instruction, represented as select/cmp. Also, enables better cost
estimation for integer min/max patterns since the compiler starts
to estimate the scalars separately.

Reviewers: nikic, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/98570
2024-07-16 09:42:08 -07:00
mskamp
b22fa9093b
[ValueTracking][X86] Compute KnownBits for phadd/phsub (#92429)
Add KnownBits computations to ValueTracking and X86 DAG lowering.
    
These instructions add/subtract adjacent vector elements in their operands. Example: phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]. This means that, in this example, we can compute the KnownBits of the operation by computing the KnownBits of [X1, X2] + [X1, X2] and [Y1, Y2] + [Y1, Y2] and intersecting the results. This approach also generalizes to all x86 vector types.
    
There are also the operations phadd.sw and phsub.sw, which perform saturating addition/subtraction. Use sadd_sat and ssub_sat to compute the KnownBits of these operations.
    
Also adjust the existing test case pr53247.ll because it can be transformed to a constant using the new KnownBits computation.
    
Fixes #82516.
2024-07-16 15:50:21 +01:00
Alexey Bataev
c3540d0b6b Revert "[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats."
This reverts commit c7aac38c29f564bc48f7cfb71d3b3b8b482c873b to fix
crashes reavealed by the buildbot in https://lab.llvm.org/buildbot/#/builders/168/builds/1104.
2024-07-16 05:59:59 -07:00
Alexey Bataev
c7aac38c29
[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats.
The patch enables detection of minnum/maxnum patterns for float point
instruction, represented as select/cmp. Also, enables better cost
estimation for integer min/max patterns since the compiler starts
to estimate the scalars separately.

Reviewers: nikic, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/98570
2024-07-16 08:14:27 -04:00
Nikita Popov
d177a94fbd [IR] Add Constant::toConstantRange() (NFC)
The logic in llvm::getVectorConstantRange() can be a bit
inconvenient to use in some cases because of the need to handle
the scalar case separately. Generalize it to handle all constants,
and move it to live directly on Constant.
2024-07-05 16:51:49 +02:00
Simon Pilgrim
5c204b1d26 [ValueTracking][X86] computeKnownBitsFromOperator - add PMULH/PMULHU intrinsics mulhs/mulhu known bits handling.
These map directly to the KnownBits implementations.
2024-07-04 11:08:06 +01:00
Nikita Popov
ebab105670 [IR] Don't strip through pointer to vector of pointer bitcasts
When using stripPointerCasts() and getUnderlyingObject(), don't
strip through a bitcast from ptr to <1 x ptr>, which is not a
no-op pointer cast. Calling code is generally not prepared to
handle that situation, resulting in incorrect alias analysis
results for example.

Fixes https://github.com/llvm/llvm-project/issues/97600.
2024-07-04 09:47:59 +02:00
Nikita Popov
2dbb454791 [ValueTracking][LVI] Consolidate vector constant range calculation
Add a common helper used for computeConstantRange() and LVI. The
implementation is a mix of both, with the efficient handling for
ConstantDataVector taken from computeConstantRange(), and the
general handling (including non-splat poison) from LVI.
2024-07-03 15:19:26 +02:00
Noah Goldstein
7c96469ea8 [ValueTracking] Extend LHS/RHS with matching operand to work without constants.
Previously we only handled the `L0 == R0` case if both `L1` and `R1`
where constant.

We can get more out of the analysis using general constant ranges
instead.

For example, `X u> Y` implies `X != 0`.

In general, any strict comparison on `X` implies that `X` is not equal
to the boundary value for the sign and constant ranges with/without
sign bits can be useful in deducing implications.

Closes #85557
2024-07-03 20:18:51 +08:00
Nikita Popov
b58ae6bd27 [InstCombine] Sync KnownBits logic for select arms
Extract an adjustKnownBitsForSelectArm() helper for the
ValueTracking logic and make use of it in SimplifyDemandedBits().

This fixes a consistency violation under instcombine-verify-known-bits.
2024-07-01 15:52:20 +02:00
Nikita Popov
77eb056830
[InstCombine] Simplify select using KnownBits of condition (#95923)
Simplify the arms of a select based on the KnownBits implied by its condition.
For now this only handles the case where the select arm folds to a constant,
but this can be generalized to handle other patterns by using
SimplifyDemandedBits instead (in that case we would also have to limit to
non-undef conditions).

This is implemented by adding a new member to SimplifyQuery that can be used
to inject an additional condition. The affected values are pre-computed and
we don't call computeKnownBits() if the select arms don't contain affected
values. This reduces the cost in some pathological cases.
2024-07-01 09:26:01 +02:00
Matt Arsenault
1d27348e53 ValueTracking: Simplify intrinsic ID asserts 2024-06-30 07:58:51 +02:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Farzon Lotfi
918313d17d
[SLPVectorizer] Support SLPVectorizer cases of tan across all backends (#95517)
This PR is intended to address the limited SLPVectorizer support of tan
raised in the comments of this PR:
https://github.com/llvm/llvm-project/pull/94559.

Right now emitting the tan intrinsisic allows you to vectorize tan, but
emitting the libfunc does not. to address this the libcall needs to be
mapped to the intrinsic. and the libcall and function name need to be
marked approriately so they can be optimized or defined as a call
lowering.
2024-06-27 15:15:13 -04:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Craig Topper
326ba38a99
[ValueTracking][RISCV] Use ConstantRange::getUnsignedMax instead of getUpper to simplify some code. (#96816)
This avoids the need to subtract 1 and explain why.
2024-06-26 17:30:19 -07:00
Poseydon42
ffec31566c
[InstSimplify] Provide information about the range of possible values that ucmp/scmp can return (#96410)
This makes it possible to fold dumb comparisons like `ucmp(x, y) == 7`.
2024-06-24 12:01:46 +08:00
Nikita Popov
6012de2b4e [ValueTracking] Support gep nuw in isKnownNonZero()
gep nuw can be null if and only if both the base pointer and offset
are null. Unlike the inbounds case this does not depend on whether
the null pointer is valid.

Proofs: https://alive2.llvm.org/ce/z/PLoqK5
2024-06-20 12:41:21 +02:00
Zain Jaffal
22ff7c5dc9
[ValueTracking][NFC] move isKnownInversion to ValueTracking (#95321)
I am using `isKnownInversion` in the following pr 
https://github.com/llvm/llvm-project/pull/94915

it is useful to have the method in a shared class so I can reuse it. I am not sure if `ValueTracking` is the correct place but it looks like most of the methods with the pattern `isKnownX` belong there.
2024-06-13 07:14:08 +01:00
c8ef
b25b1db819
[KnownBits] Remove hasConflict() assertions (#94568)
Allow KnownBits to represent "always poison" values via conflict.

close: #94436
2024-06-07 17:01:22 +02:00
Nikita Popov
16e2ec82ac [ValueTracking] Make undef element check more precise
If we're only checking for undef, then also only look for undef
elements in the vector (rather than undef and poison).
2024-06-06 09:39:33 +02:00
Ahmed Bougacha
0edc97f119
[IR][AArch64][PAC] Add "ptrauth(...)" Constant to represent signed pointers. (#85738)
This defines a new kind of IR Constant that represents a ptrauth signed
pointer, as used in AArch64 PAuth.

It allows representing most kinds of signed pointer constants used thus
far in the llvm ptrauth implementations, notably those used in the
Darwin and ELF ABIs being implemented for c/c++.  These signed pointer
constants are then lowered to ELF/MachO relocations.

These can be simply thought of as a constant `llvm.ptrauth.sign`, with
the interesting addition of discriminator computation: the `ptrauth`
constant can also represent a combined blend, when both address and
integer discriminator operands are used.  Both operands are otherwise
optional, with default values 0/null.
2024-05-28 16:39:09 -07:00
Noah Goldstein
2232843160 [ValueTracking] Recognize X op (X != 0) as non-zero
The ops supported are: `add`, `sub`, `xor`, `or`, `umax`, `uadd.sat`

Proofs: https://alive2.llvm.org/ce/z/8ZMSRg

The `add` case actually comes up in SPECInt, the rest are here mostly
for completeness.

Closes #88579
2024-05-20 15:25:40 -05:00
Yingwei Zheng
04ae6e600a
[ValueTracking] Fix incorrect inferrence about the signbit of sqrt (#92510)
According to IEEE Std 754-2019, `sqrt` returns nan when the input is
negative (except for -0). In this case, we cannot make assumptions about
sign bit of the result.

Fixes https://github.com/llvm/llvm-project/issues/92217
2024-05-20 21:52:38 +08:00
Matt Arsenault
0cd2bf3521
ValueTracking: Correct undef handling for constant FP vectors (#92557)
Treat undef as unknown, and poison as ignorable.
2024-05-19 20:56:21 +02:00
Noah Goldstein
05347f8c2f [ValueTracking] Compute knownbits from (icmp upred (add/sub nuw X, Y), C)
`(icmp ule/ult (add nuw X, Y), C)` implies both `(icmp ule/ult X, C)` and
`(icmp ule/ult Y, C)`. We can use this to deduce leading zeros in `X`/`Y`.

`(icmp uge/ugt (sub nuw X, Y), C)` implies `(icmp uge/uge X, C)` . We
can use this to deduce leading ones in `X`.

Proofs: https://alive2.llvm.org/ce/z/sc5k22

Closes #87180
2024-05-16 13:03:32 -05:00
Yingwei Zheng
46bc54f4e6
[ValueTracking] Fix assertion failure when computeKnownFPClass returns fcNone (#92355)
Fixes
https://github.com/llvm/llvm-project/pull/92084#issuecomment-2114083188.
2024-05-16 19:14:30 +08:00
Yingwei Zheng
3aae916ff7
Reland "[ValueTracking] Compute knownbits from known fp classes" (#92084)
This patch relands https://github.com/llvm/llvm-project/pull/86409.

I mistakenly thought that `Known.makeNegative()` clears the sign bit of
`Known.Zero`. This patch fixes the assertion failure by explicitly
clearing the sign bit.
2024-05-14 18:10:28 +08:00
Martin Storsjö
2e165a2c4b Revert "[ValueTracking] Compute knownbits from known fp classes (#86409)"
This reverts commit d03a1a6e5838c7c2c0836d71507dfdf7840ade49.

This change caused failed assertions, see
https://github.com/llvm/llvm-project/pull/86409#issuecomment-2109469845
for details.
2024-05-14 10:37:23 +03:00
Yingwei Zheng
d03a1a6e58
[ValueTracking] Compute knownbits from known fp classes (#86409)
This patch calculates knownbits from fp instructions/dominating fcmp
conditions. It will enable more optimizations with signbit idioms.
2024-05-14 01:58:45 +08:00
Monad
fd0ffb7438
[ValueTracking] Recognize LShr(UINT_MAX, Y) + 1 as a power-of-two (#91171)
There is a missed optimization in
``` llvm
define i8 @known_power_of_two_rust_next_power_of_two(i8 %x, i8 %y) {
  %2 = add i8 %x, -1
  %3 = tail call i8 @llvm.ctlz.i8(i8 %2, i1 true)
  %4 = lshr i8 -1, %3
  %5 = add i8 %4, 1
  %6 = icmp ugt i8 %x, 1
  %p = select i1 %6, i8 %5, i8 1

  %r = urem i8 %y, %p
  ret i8 %r
}
```
which is extracted from the Rust code
``` rust
fn func(x: usize, y: usize) -> usize {
    let z = x.next_power_of_two();
    y % z
}
```
Here `%p` (a.k.a `z`) is semantically a power-of-two, so `y urem p` can
be optimized to `y & (p - 1)`. (Alive2 proof:
https://alive2.llvm.org/ce/z/H3zooY)

---

It could be generalized to recognizing `LShr(UINT_MAX, Y) + 1` as a
power-of-two, which is what this PR does.
Alive2 proof: https://alive2.llvm.org/ce/z/zUPTbc
2024-05-07 10:28:36 +09:00
Franklin Zhang
6b948705a0
[AggressiveInstCombine] Inline strcmp/strncmp (#89371)
Inline calls to strcmp(s1, s2) and strncmp(s1, s2, N), where N and
exactly one of s1 and s2 are constant.

For example:

```c
int res = strcmp(s, "ab");
```

is converted to

```c
int res = (int)s[0] - (int)'a';
if (res != 0)
  goto END;
res = (int)s[1] - (int)'b';
if (res != 0)
  goto END;
res = (int)s[2] - (int)'\0';
END:
```

Ported from a similar gcc feature [Inline strcmp with small constant
strings](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809).
2024-05-03 13:24:38 +09:00
annamthomas
78270cb81b
[UndefOrPoison] [CompileTime] Avoid IDom walk unless required. NFC (#90092)
If the value is not boolean and we are checking for `Undef` or
`UndefOrPoison`, we can avoid the potentially expensive IDom walk.
    
This should improve compile time for isGuaranteedNotToBeUndefOrPoison
and isGuaranteedNotToBeUndef.
2024-05-01 10:38:22 -04:00
Nikita Popov
3c553fc9e0
[InstCombine] Infer nuw on mul nsw with non-negative operands (#90170)
If a mul nsw has non-negative operands, it's also nuw.

Proof: https://alive2.llvm.org/ce/z/2Dz9Uu

Fixes https://github.com/llvm/llvm-project/issues/90020.
2024-04-29 09:53:09 +09:00
Noah Goldstein
b933c8447b [ValueTracking] Add support for trunc nuw/nsw in isKnowNonZero
With `nsw`/`nuw`, the `trunc` is non-zero if its operand is non-zero.

Proofs: https://alive2.llvm.org/ce/z/iujmk6

Closes #89643
2024-04-24 02:53:11 -05:00
Nikita Popov
a1b1c4a6d1
[InstCombine] Fix miscompile in negation of select (#89698)
Swapping the operands of a select is not valid if one hand is more
poisonous that the other, because the negation zero contains poison
elements.

Fix this by adding an extra parameter to isKnownNegation() to forbid
poison elements.

I've implemented this using manual checks to avoid needing four variants
for the NeedsNSW/AllowPoison combinations. Maybe there is a better way
to do this...

Fixes https://github.com/llvm/llvm-project/issues/89669.
2024-04-24 10:56:26 +09:00
Craig Topper
48324f0f7b
[ValueTracking] Combine variable declaration with its only assignment. NFC (#89526) 2024-04-21 09:57:01 -07:00
Matthias Braun
6bbccd2516
GlobalsModRef, ValueTracking: Look through threadlocal.address intrinsic (#88418)
This improves handling of `threadlocal.address` intrinsic in analyses:

The thread-id cannot change within a function with the exception of
suspend points of pre-split coroutines. This changes
`llvm::getUnderlyingObject` to look through `threadlocal.address` in
these cases.

`GlobalsAAResult::AnalyzeUsesOfPointer` checks whether an address can be
traced to simple loads/stores or escapes to other places. Starting the
analysis from a thread-local `GlobalValue` the `threadlocal.address`
intrinsic is safe to skip here.

This improves issue #87437
2024-04-19 10:01:42 -07:00
Nikita Popov
1baa385065
[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.

This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.

As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.

There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
2024-04-18 15:44:12 +09:00
Andreas Jonson
ff3523f67b
[IR] Drop poison-generating return attributes when necessary (#89138)
Rename has/dropPoisonGeneratingFlagsOrMetadata to
has/dropPoisonGeneratingAnnotations and make it also handle
nonnull, align and range return attributes on calls, similar
to the existing handling for !nonnull, !align and !range metadata.
2024-04-18 15:27:36 +09:00
Noah Goldstein
9eeae44211 [ValueTracking] Implement computeKnownFPClass for llvm.vector.reduce.{fmin,fmax,fmaximum,fminimum}
Closes #88408
2024-04-16 16:10:00 -05:00
Harald van Dijk
60de56c743
[ValueTracking] Restore isKnownNonZero parameter order. (#88873)
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
2024-04-16 15:21:09 +01:00
Matt Arsenault
f78b3466ca
ValueTracking: Treat poison more aggressively in computeKnownFPClass (#87990)
Assume no valid values, and the sign bit is 0.
2024-04-15 12:51:29 +02:00
Nikita Popov
52a1998f15 [ValueTracking] Don't accept undef in isKnownNonZero()
As the undef can be replaced with a zero value, this is not legal
in the general case. We can only allow poison values. This matches
what the other ValueTracking helpers like computeKnownBits() do.
2024-04-15 13:54:09 +09:00
Noah Goldstein
1e16a35fbc [ValueTracking] Implement isKnownNonZero for llvm.vector.reduce.or
Closes #88320
2024-04-14 22:49:06 -05:00
Noah Goldstein
6c71707872 [ValueTracking] Implement computeKnownBits for llvm.vector.reduce.xor 2024-04-14 22:49:06 -05:00
Noah Goldstein
6063e3c408 [ValueTracking] Implement computeKnownBits for llvm.vector.reduce.{or,and} 2024-04-14 22:49:05 -05:00
Yingwei Zheng
e0a628715a
[ValueTracking] Convert isKnownNonZero to use SimplifyQuery (#85863)
This patch converts `isKnownNonZero` to use SimplifyQuery. Then we can
use the context information from `DomCondCache`.

Fixes https://github.com/llvm/llvm-project/issues/85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj
2024-04-12 23:47:20 +08:00
AtariDreams
5d6b00929b
[NFC] Replace m_Sub(m_Zero(), X) with m_Neg(X) (#88461) 2024-04-12 18:24:03 +09:00
Noah Goldstein
b8659600c3 [ValueTracking] compute knownbits from (icmp upred X (and/or X, Y)); NFC
`(icmp uge/ugt (and X, Y), C)` implies both `(icmp uge/ugt X, C)` and
`(icmp uge/ugt Y, C)`. We can use this to deduce leading ones in `X`.

`(icmp ule/ult (or X, Y), C)` implies both `(icmp ule/ult X, C)` and
`(icmp ule/ult Y, C)`. We can use this to deduce leading zeros in `X`.

Closes #86059
2024-04-11 12:27:10 -05:00