1587 Commits

Author SHA1 Message Date
Ramkumar Ramachandra
bdf241cab3
ValueTracking: handle more ops in isNotCrossLaneOperation (#112183)
Reuse llvm::isTriviallyVectorizable in llvm::isNotCrossLaneOperation, in
order to get it to handle more intrinsics.

Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/XSV_GT
2024-10-14 14:08:12 +01:00
Ramkumar Ramachandra
c5f82f7893
ValueTracking: introduce llvm::isNotCrossLaneOperation (#112011)
Factor out and unify common code from InstSimplify and InstCombine that
partially guard against cross-lane vector operations into
llvm::isNotCrossLaneOperation in ValueTracking.

Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka
2024-10-14 11:37:30 +01:00
Ramkumar Ramachandra
78089d5845
ValueTracking: refactor recurrence-matching (NFC) (#109659) 2024-10-04 17:45:29 +01:00
Florian Hahn
dce5bf8efc
[ValueTracking] AllowEphemerals for alignment assumptions. (#108632)
Allow AllowEphemerals in isValidAssumeForContext, as the CxtI might
be the producer of the pointer in the bundle. At the moment, align
assumptions aren't optimized away.

This allows using the assumption in the computeKnownBits call in
getConstantMultipleImpl.

We could extend the computeKnownBits API to allow callers to specify if
ephemerals are allowed, if the info from computeKnownBitsFromContext is
used to remove alignment assumptions.

PR: https://github.com/llvm/llvm-project/pull/108632
2024-10-03 16:02:34 +01:00
Nikita Popov
eb85285727
[ValueTracking] mul nuw nsw with factor sgt 1 is non-negative (#110803)
Proof: https://alive2.llvm.org/ce/z/bC0eJf
2024-10-02 15:16:56 +02:00
Ramkumar Ramachandra
d432e22b2f
ValueTracking: strip stray break in recur-match (#109794)
There is a stray break statement in the recurrence-handling code in
computeKnownBitsFromOperator, that seems to be unintended. Strip this
statement so that we have the opportunity to go through the rest of
phi-handling code, and refine KnownBits further.
2024-10-02 11:18:48 +01:00
Nikita Popov
dd599e92a6
[ValueTracking] Support assume in entry block without DT (#109264)
isValidAssumeForContext() handles a couple of trivial cases even if no
dominator tree is available. This adds one more for the case where there
is an assume in the entry block, and a use in some other block. The
entry block always dominates all blocks.

As having context instruction but not having DT is fairly rare, there is
not much impact. Only test change is in assume-builder.ll, where less
redundant assumes are generated. I've found having this special case is
useful for an upcoming change though.
2024-09-19 14:24:55 +02:00
Yingwei Zheng
2ca75df1d1
[ValueTracking] Infer is-power-of-2 from dominating conditions (#107994)
Addresses downstream rustc issue:
https://github.com/rust-lang/rust/issues/129795
2024-09-13 08:54:29 +08:00
Yingwei Zheng
ffcff4af59
[ValueTracking] Infer is-power-of-2 from assumptions. (#107745)
This patch tries to infer is-power-of-2 from assumptions. I don't see
that this kind of assumption exists in my dataset.
Related issue: https://github.com/rust-lang/rust/issues/129795

Close https://github.com/llvm/llvm-project/issues/58996.
2024-09-10 10:38:21 +08:00
Nikita Popov
9707b98e57 [ConstantRange] Perform increment on APInt (NFC)
This handles the edge case where BitWidth is 1 and doing the
increment gets a value that's not valid in that width, while we
just want wrap-around.

Split out of https://github.com/llvm/llvm-project/pull/80309.
2024-09-05 16:11:00 +02:00
Simon Pilgrim
6c8746b6e3
[Analysis] getIntrinsicForCallSite - add vectorization support for acos/asin/atan and cosh/sinh/tanh libcalls (#106844)
Followup to #106584 - ensure acos/asin/atan and cosh/sinh/tanh libcalls correctly map to the llvm intrinsic equivalents
2024-09-03 10:05:56 +01:00
Alex MacLean
369d8148e0
[ValueTracking] use KnownBits to compute fpclass from bitcast (#97762)
When we encounter a bitcast from an integer type we can use the
information from `KnownBits` to glean some information about the
fpclass:
- If the sign bit is known, we can transfer this information over. 
- If the float is IEEE format and enough of the bits are known, we may
  be able to prove or rule out some fpclasses such as NaN, Zero, or Inf.
2024-08-30 07:34:49 -07:00
Noah Goldstein
42ce62800d [ValueTracking] Handle incompatible types instead of asserting in isKnownNonEqual; NFC
Downstream hit this assert, since it doesn't really make any
difference, just change code to return false.
2024-08-19 15:48:45 -07:00
Kazu Hirata
217f5804ca
[Analysis] Use a range-based for loop (NFC) (#104445) 2024-08-15 17:59:23 -07:00
Nikita Popov
afa0f53f96 [ValueTracking] Fix f16 fptosi range for large integers
We were missing the signed flag on the negative value, so the
range was incorrectly interpreted for integers larger than 64-bit.

Split out from https://github.com/llvm/llvm-project/pull/80309.
2024-08-15 18:18:19 +02:00
Simon Pilgrim
11ba72e651
[KnownBits] Add KnownBits::add and KnownBits::sub helper wrappers. (#99468) 2024-08-12 10:21:28 +01:00
zhongyunde 00443407
2bd568fecc [ValueTracking] Infer relationship for the select with SLT 2024-08-06 10:30:04 +08:00
zhongyunde 00443407
3023713014 [ValueTracking] Infer relationship for the select with ICmp
x -nsw y < -C is false when x > y and C >= 0
Alive2 proof for sgt, sge : https://alive2.llvm.org/ce/z/tupvfi
Note: It only really makes sense in the context of signed comparison for
      "X - Y must be positive if X >= Y and no overflow".

Fixes https://github.com/llvm/llvm-project/issues/54735
2024-08-06 10:30:03 +08:00
Kazu Hirata
7df9da7d78
[llvm] Construct SmallVector with ArrayRef (NFC) (#101872) 2024-08-04 08:54:23 -07:00
Vitaly Buka
945dd9a740
[NFC][Load] Find better place for mustSuppressSpeculation (#100794)
And extract `suppressSpeculativeLoadForSanitizers`.

For #100639.
2024-07-29 10:29:02 -07:00
Yingwei Zheng
62e9f40949
[PatternMatch] Use m_SpecificCmp matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
  stockfish/movegen.ll 2541620819 2538599412 -0.12%
  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
  abc/luckySwap.c.ll 581173720 580581935 -0.10%
  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
2024-07-29 10:04:06 +08:00
Ramkumar Ramachandra
2754c083cb
LAA: mark LoopInfo pointer const (NFC) (#100373) 2024-07-24 16:52:11 +01:00
Yingwei Zheng
0c03b4ce10
[InstCombine] Infer sub nuw from dominating conditions (#100164)
Alive2: https://alive2.llvm.org/ce/z/g3xxnM
2024-07-24 23:39:30 +08:00
Yingwei Zheng
59eae919c9
[ValueTracking] Don't use CondContext in dataflow analysis of phi nodes (#100316)
See the following case:
```
define i16 @pr100298() {
entry:
  br label %for.inc

for.inc:
  %indvar = phi i32 [ -15, %entry ], [ %mask, %for.inc ]
  %add = add nsw i32 %indvar, 9
  %mask = and i32 %add, 65535
  %cmp1 = icmp ugt i32 %mask, 5
  br i1 %cmp1, label %for.inc, label %for.end

for.end:
  %conv = trunc i32 %add to i16
  %cmp2 = icmp ugt i32 %mask, 3
  %shl = shl nuw i16 %conv, 14
  %res = select i1 %cmp2, i16 %conv, i16 %shl
  ret i16 %res
}
```

When computing knownbits of `%shl` with `%cmp2=false`, we cannot use
this condition in the analysis of `%mask (%for.inc -> %for.inc)`.
 
Fixes https://github.com/llvm/llvm-project/issues/100298.
2024-07-24 20:06:36 +08:00
Nikita Popov
32cd18975d
[GVN] Look through select/phi when determining underlying object (#99509)
This addresses an optimization regression in Rust we have observed after
https://github.com/llvm/llvm-project/pull/82458. We now only perform
pointer replacement if they have the same underlying object. However,
getUnderlyingObject() by default only looks through linear chains, not
selects/phis. In particular, this means that we miss cases involving
involving pointer induction variables.

This patch fixes this by introducing a new helper
getUnderlyingObjectAggressive() which basically does what
getUnderlyingObjects() does, just specialized to the case where we must
arrive at a single underlying object in the end, and with a limit on the
number of inspected values.

Doing this more expensive underlying object check has no measurable
compile-time impact on CTMark.
2024-07-22 16:22:01 +02:00
AtariDreams
56ad7cc012
[IR] Remove non-canonical matchings (#96763) 2024-07-22 09:47:37 +02:00
Yingwei Zheng
248fcab2fc
[InstCombine] Do not use operand info in replaceInInstruction (#99492)
Consider the following case:
```
%cmp = icmp eq ptr %p, null
%load = load i32, ptr %p, align 4
%sel = select i1 %cmp, i32 %load, i32 0
```
`foldSelectValueEquivalence` converts `load i32, ptr %p, align 4` into
`load i32, ptr null, align 4`, which causes immediate UB. `%load` is
speculatable, but it doesn't hold after operand substitution.

This patch introduces a new helper
`isSafeToSpeculativelyExecuteWithVariableReplaced`. It ignores operand
info in these instructions since their operands will be replaced later.

Fixes #99436.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-07-22 11:59:54 +08:00
Bjorn Pettersson
098bd842a7 [ValueTracking] Let ComputeKnownSignBits handle (shl (zext X), C) (#97693)
Add simple support for looking through a zext when doing
ComputeKnownSignBits for shl. This is valid for the case when
all extended bits are shifted out, because then the number of sign
bits can be found by analysing the zext operand.

The solution here is simple as it only handle a single zext (not
passing remaining left shift amount during recursion). It could be
possible to generalize this in the future by for example passing an
'OffsetFromMSB' parameter to ComputeNumSignBitsImpl, telling it to
calculate number of sign bits starting at some offset from the most
significant bit.
2024-07-19 12:44:47 +02:00
Noah Goldstein
c6144cb0de [ValueTracking] Remove unnecessary m_ElementWiseBitCast from isKnownNonZeroFromOperator; NFC 2024-07-18 17:02:13 +08:00
Noah Goldstein
0589762e4e [ValueTracking] Consistently propagate DemandedElts is computeKnownFPClass
Closes #99080
2024-07-18 16:38:00 +08:00
Noah Goldstein
e8eeda8e4d [ValueTracking] Consistently propagate DemandedElts is ComputeNumSignBits 2024-07-18 16:38:00 +08:00
Noah Goldstein
72ff0499bb [ValueTracking] Consistently propagate DemandedElts is isKnownNonZero 2024-07-18 16:38:00 +08:00
Noah Goldstein
6ef970b65f [ValueTracking] Consistently propagate DemandedElts is computeKnownBits 2024-07-18 16:38:00 +08:00
Noah Goldstein
769952d72f [ValueTracking] Implement Known{Bits,NonZero,FPClass} for llvm.vector.reverse
`llvm.vector.reverse` preserves each of the elements and thus elements
common to them.

Alive2 doesn't support the intrin yet, but the logic seems pretty
self-evident.

Closes #99013
2024-07-17 02:48:09 +08:00
Alexey Bataev
8ff233f4f1 [SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats.
The patch enables detection of minnum/maxnum patterns for float point
instruction, represented as select/cmp. Also, enables better cost
estimation for integer min/max patterns since the compiler starts
to estimate the scalars separately.

Reviewers: nikic, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/98570
2024-07-16 09:42:08 -07:00
mskamp
b22fa9093b
[ValueTracking][X86] Compute KnownBits for phadd/phsub (#92429)
Add KnownBits computations to ValueTracking and X86 DAG lowering.
    
These instructions add/subtract adjacent vector elements in their operands. Example: phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]. This means that, in this example, we can compute the KnownBits of the operation by computing the KnownBits of [X1, X2] + [X1, X2] and [Y1, Y2] + [Y1, Y2] and intersecting the results. This approach also generalizes to all x86 vector types.
    
There are also the operations phadd.sw and phsub.sw, which perform saturating addition/subtraction. Use sadd_sat and ssub_sat to compute the KnownBits of these operations.
    
Also adjust the existing test case pr53247.ll because it can be transformed to a constant using the new KnownBits computation.
    
Fixes #82516.
2024-07-16 15:50:21 +01:00
Alexey Bataev
c3540d0b6b Revert "[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats."
This reverts commit c7aac38c29f564bc48f7cfb71d3b3b8b482c873b to fix
crashes reavealed by the buildbot in https://lab.llvm.org/buildbot/#/builders/168/builds/1104.
2024-07-16 05:59:59 -07:00
Alexey Bataev
c7aac38c29
[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on floats.
The patch enables detection of minnum/maxnum patterns for float point
instruction, represented as select/cmp. Also, enables better cost
estimation for integer min/max patterns since the compiler starts
to estimate the scalars separately.

Reviewers: nikic, RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/98570
2024-07-16 08:14:27 -04:00
Nikita Popov
d177a94fbd [IR] Add Constant::toConstantRange() (NFC)
The logic in llvm::getVectorConstantRange() can be a bit
inconvenient to use in some cases because of the need to handle
the scalar case separately. Generalize it to handle all constants,
and move it to live directly on Constant.
2024-07-05 16:51:49 +02:00
Simon Pilgrim
5c204b1d26 [ValueTracking][X86] computeKnownBitsFromOperator - add PMULH/PMULHU intrinsics mulhs/mulhu known bits handling.
These map directly to the KnownBits implementations.
2024-07-04 11:08:06 +01:00
Nikita Popov
ebab105670 [IR] Don't strip through pointer to vector of pointer bitcasts
When using stripPointerCasts() and getUnderlyingObject(), don't
strip through a bitcast from ptr to <1 x ptr>, which is not a
no-op pointer cast. Calling code is generally not prepared to
handle that situation, resulting in incorrect alias analysis
results for example.

Fixes https://github.com/llvm/llvm-project/issues/97600.
2024-07-04 09:47:59 +02:00
Nikita Popov
2dbb454791 [ValueTracking][LVI] Consolidate vector constant range calculation
Add a common helper used for computeConstantRange() and LVI. The
implementation is a mix of both, with the efficient handling for
ConstantDataVector taken from computeConstantRange(), and the
general handling (including non-splat poison) from LVI.
2024-07-03 15:19:26 +02:00
Noah Goldstein
7c96469ea8 [ValueTracking] Extend LHS/RHS with matching operand to work without constants.
Previously we only handled the `L0 == R0` case if both `L1` and `R1`
where constant.

We can get more out of the analysis using general constant ranges
instead.

For example, `X u> Y` implies `X != 0`.

In general, any strict comparison on `X` implies that `X` is not equal
to the boundary value for the sign and constant ranges with/without
sign bits can be useful in deducing implications.

Closes #85557
2024-07-03 20:18:51 +08:00
Nikita Popov
b58ae6bd27 [InstCombine] Sync KnownBits logic for select arms
Extract an adjustKnownBitsForSelectArm() helper for the
ValueTracking logic and make use of it in SimplifyDemandedBits().

This fixes a consistency violation under instcombine-verify-known-bits.
2024-07-01 15:52:20 +02:00
Nikita Popov
77eb056830
[InstCombine] Simplify select using KnownBits of condition (#95923)
Simplify the arms of a select based on the KnownBits implied by its condition.
For now this only handles the case where the select arm folds to a constant,
but this can be generalized to handle other patterns by using
SimplifyDemandedBits instead (in that case we would also have to limit to
non-undef conditions).

This is implemented by adding a new member to SimplifyQuery that can be used
to inject an additional condition. The affected values are pre-computed and
we don't call computeKnownBits() if the select arms don't contain affected
values. This reduces the cost in some pathological cases.
2024-07-01 09:26:01 +02:00
Matt Arsenault
1d27348e53 ValueTracking: Simplify intrinsic ID asserts 2024-06-30 07:58:51 +02:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Farzon Lotfi
918313d17d
[SLPVectorizer] Support SLPVectorizer cases of tan across all backends (#95517)
This PR is intended to address the limited SLPVectorizer support of tan
raised in the comments of this PR:
https://github.com/llvm/llvm-project/pull/94559.

Right now emitting the tan intrinsisic allows you to vectorize tan, but
emitting the libfunc does not. to address this the libcall needs to be
mapped to the intrinsic. and the libcall and function name need to be
marked approriately so they can be optimized or defined as a call
lowering.
2024-06-27 15:15:13 -04:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Craig Topper
326ba38a99
[ValueTracking][RISCV] Use ConstantRange::getUnsignedMax instead of getUpper to simplify some code. (#96816)
This avoids the need to subtract 1 and explain why.
2024-06-26 17:30:19 -07:00