1516 Commits

Author SHA1 Message Date
Ramkumar Ramachandra
f1632d25db
IR: introduce ICmpInst::isImpliedByMatchingCmp (#122597)
Create an abstraction over isImplied{True,False}ByMatchingCmp to
faithfully communicate the result of both functions, cleaning up code in
callsites. While at it, fix a bug in the implied-false version of the
function, which was inadvertently dropping samesign information.
2025-01-13 16:20:00 +00:00
Ramkumar Ramachandra
66badf224a
VT: teach a special-case optz about samesign (#122590)
There is a narrow special-case in isImpliedCondICmps that can benefit
from being taught about samesign. Since it costs us nothing to implement
it, teach it about samesign, for completeness. This patch marks the
completion of the effort to teach ValueTracking about samesign.
2025-01-12 15:19:29 +00:00
goldsteinn
17ef436e3d
[ValueTracking] Take into account whether zero is poison when computing CR for ct{t,l}z (#122548) 2025-01-11 15:11:11 -06:00
Ramkumar Ramachandra
f38c40bff3
VT: teach isImpliedCondMatchingOperands about samesign (#122474)
Move isImplied{True,False}ByMatchingCmp from CmpInst to ICmpInst, so
that it can operate on CmpPredicate instead of CmpInst::Predicate, and
teach it about samesign. There are two callers of this function, and we
choose to migrate the one in ValueTracking, namely
isImpliedCondMatchingOperands to CmpPredicate, hence teaching it about
samesign, with visible test impact.
2025-01-11 09:08:57 +00:00
Alex MacLean
59ced72bc2
[ValueTracking] Add rotate idiom to haveNoCommonBitsSet special cases (#122165)
An occasional idiom for rotation is "(A << B) + (A >> (BitWidth - B))".
Currently this is not well handled on targets with native
funnel-shift/rotate support. Add a special case to haveNoCommonBitsSet
to ensure that the addition is converted to a disjoint or in InstCombine
so that during instruction selection the idiom can be converted to an
efficient rotation implementation.

Proof: https://alive2.llvm.org/ce/z/WdCZsN
2025-01-10 09:17:44 -08:00
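To see why the addition in this idiom can be treated as a disjoint or, here is a minimal C++ sketch for a fixed 32-bit width. The helper names are hypothetical and are not LLVM APIs; the sketch only shows that the two shifted halves never share a set bit, so `+` and `|` give the same result.

```cpp
#include <cassert>
#include <cstdint>

// Rotate-left spelled with '+', mirroring "(A << B) + (A >> (BitWidth - B))".
// Hypothetical helper for illustration; BitWidth is fixed at 32 here.
inline uint32_t rotlWithAdd(uint32_t A, unsigned B) {
  assert(B > 0 && B < 32 && "both shift amounts must stay below the bit width");
  return (A << B) + (A >> (32 - B));
}

// The same rotate spelled with '|'.
inline uint32_t rotlWithOr(uint32_t A, unsigned B) {
  assert(B > 0 && B < 32 && "both shift amounts must stay below the bit width");
  return (A << B) | (A >> (32 - B));
}

// A << B has its low B bits clear, and A >> (32 - B) can only have its low
// B bits set, so the two halves never overlap.
inline bool halvesAreDisjoint(uint32_t A, unsigned B) {
  assert(B > 0 && B < 32 && "both shift amounts must stay below the bit width");
  return ((A << B) & (A >> (32 - B))) == 0;
}
```

Because the halves are disjoint, no carry can occur in the addition, which is the property the commit relies on.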
Ramkumar Ramachandra
cfee344dda
VT: teach implied-cond-cr about samesign (#122447)
Teach isImpliedCondCommonOperandWithCR about samesign, noting that the
only case we need to handle is when exactly one of the icmps has
samesign.
2025-01-10 14:26:49 +00:00
Ramkumar Ramachandra
b53e79422a
VT: teach isImpliedCondOperands about samesign (#120263)
isImpliedCondICmps() and its callers in ValueTracking can greatly
benefit from being taught about samesign. As a first step, teach one
caller, namely isImpliedCondOperands(). Very minimal changes are
required for this, as CmpPredicate::getMatching() does most of the work.
2025-01-10 12:07:56 +00:00
Yingwei Zheng
03e7862962
[ValueTracking] Move getFlippedStrictnessPredicateAndConstant into ValueTracking. NFC. (#122064)
Needed by https://github.com/llvm/llvm-project/pull/121958.
2025-01-08 20:02:49 +08:00
adam-bzowski
088d636136
[ValueTracking] Fix a bug for signed min-max clamping (#121206)
Correctly handle the case where the clamp is over the full range.
This fixes an issue introduced in #120576.
2024-12-28 18:21:47 +01:00
adam-bzowski
6d7cf5206f
[ValueTracking] Improve KnownBits for signed min-max clamping (#120576)
A signed min-max clamp is the sequence of smin and smax intrinsics,
which constrain a signed value into the range: smin <= value <= smax.
The patch improves the calculation of KnownBits for a value subjected to
the signed clamping.
2024-12-25 22:39:56 +08:00
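A small sketch of why a signed clamp pins down bits, assuming 32-bit values clamped to the range [0, 255]. The helper names are hypothetical and this is not how LLVM computes KnownBits; it only illustrates the kind of fact the analysis can derive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// smax(smin(X, Hi), Lo): constrains X into Lo <= result <= Hi.
// Hypothetical helper, not an LLVM API.
inline int32_t clampSigned(int32_t X, int32_t Lo, int32_t Hi) {
  assert(Lo <= Hi && "clamp bounds must be ordered");
  return std::max(std::min(X, Hi), Lo);
}

// Every value in [0, 255] has bits 8..31 equal to zero.
constexpr uint32_t HighBitsMask = ~uint32_t{0xFF};

// True when the clamped value has no bit set above bit 7.
inline bool highBitsAreZeroAfterClamp(int32_t X) {
  uint32_t Clamped = static_cast<uint32_t>(clampSigned(X, 0, 255));
  return (Clamped & HighBitsMask) == 0;
}
```

Whatever the input, the top 24 bits of the clamped result are known to be zero in this setup.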
tianleliu
d7fe2cf8a2
[InstCombine] Widen Sel width after Cmp to generate Max/Min intrinsics. (#118932)
When the Sel and the Cmp feeding it use different integer types, rewrite

From: (K and N mean width, K < N; a and b are src operands.)
bN = Ext(bK)
cond = Cmp(aN, bN)
aK = Trunc aN
retK = Sel(cond, aK, bK)
To:
bN = Ext(bK)
cond = Cmp(aN, bN)
retN = Sel(cond, aN, bN)
retK = Trunc retN

Though Sel's operand width becomes larger, making the type width
in Sel the same as in Cmp enables combining into max/min
intrinsics, and also gives better performance for SIMD
instructions.
References of correctness: https://alive2.llvm.org/ce/z/Y4Kegm
                           https://alive2.llvm.org/ce/z/qFtjtR
Reference of generated code comparison:
                           https://gcc.godbolt.org/z/o97svGvYM
                           https://gcc.godbolt.org/z/59Ynj91ov
2024-12-18 09:02:11 +08:00
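The From/To rewrite above can be checked on plain integers. This is a sketch under stated assumptions: K = 16, N = 32, Ext is a zero-extension, and Cmp is an unsigned less-than (so the widened select is an unsigned min). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// "From" form: compare at width N, select at width K.
inline uint16_t selectNarrow(uint32_t aN, uint16_t bK) {
  uint32_t bN = bK;                         // bN = Ext(bK)
  bool cond = aN < bN;                      // cond = Cmp(aN, bN)
  uint16_t aK = static_cast<uint16_t>(aN);  // aK = Trunc aN
  return cond ? aK : bK;                    // retK = Sel(cond, aK, bK)
}

// "To" form: compare and select at width N, then truncate.
inline uint16_t selectWide(uint32_t aN, uint16_t bK) {
  uint32_t bN = bK;                         // bN = Ext(bK)
  bool cond = aN < bN;                      // cond = Cmp(aN, bN)
  uint32_t retN = cond ? aN : bN;           // retN = Sel(cond, aN, bN), a umin
  return static_cast<uint16_t>(retN);       // retK = Trunc retN
}

// Both forms agree because Trunc(Ext(bK)) == bK.
inline void checkWidenedSelect(uint32_t aN, uint16_t bK) {
  assert(selectNarrow(aN, bK) == selectWide(aN, bK));
}
```

In the wide form the select and the compare share a type, which is the shape that max/min matching expects.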
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Yingwei Zheng
a67bd94fda
[ValueTracking] Add missing operand checks in computeKnownFPClassFromCond (#119579)
After https://github.com/llvm/llvm-project/pull/118257, we may call
`computeKnownFPClassFromCond` with unrelated conditions. Then
miscompilations may occur due to a lack of operand checks.

This bug was introduced by
d2404ea6ce
and https://github.com/llvm/llvm-project/pull/80740. However, the
miscompilation couldn't have happened before
https://github.com/llvm/llvm-project/pull/118257, because we only added
related conditions to `DomConditionCache/AssumptionCache`.

Fix the miscompilation reported in
https://github.com/llvm/llvm-project/pull/118257#issuecomment-2536182166.
2024-12-12 10:30:37 +08:00
Ramkumar Ramachandra
51a895aded
IR: introduce struct with CmpInst::Predicate and samesign (#116867)
Introduce llvm::CmpPredicate, an abstraction over a floating-point
predicate, and a pack of an integer predicate with samesign information,
in order to ease extending large portions of the codebase that take a
CmpInst::Predicate to respect the samesign flag.

We have chosen to demonstrate the utility of this new abstraction by
migrating parts of ValueTracking, InstructionSimplify, and InstCombine
from CmpInst::Predicate to llvm::CmpPredicate. There should be no
functional changes, as we don't perform any extra optimizations with
samesign in this patch, or use CmpPredicate::getMatching.

The design approach taken by this patch allows for unaudited callers of
APIs that take a llvm::CmpPredicate to silently drop the samesign
information; it does not pose a correctness issue, and allows us to
migrate the codebase piece-wise.
2024-12-03 13:31:04 +00:00
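As background for the samesign commits in this log: the samesign flag on an icmp asserts that both operands have the same sign. Under that assumption a signed predicate and its unsigned counterpart give the same answer, which is what makes the flag useful. The sketch below checks this exhaustively for 8-bit values; the helper names are hypothetical and unrelated to the CmpPredicate API.

```cpp
#include <cassert>
#include <cstdint>

// True when both operands are negative or both are non-negative.
inline bool sameSign(int8_t A, int8_t B) { return (A < 0) == (B < 0); }

// Signed less-than (icmp slt).
inline bool signedLess(int8_t A, int8_t B) { return A < B; }

// Unsigned less-than on the same bit patterns (icmp ult).
inline bool unsignedLess(int8_t A, int8_t B) {
  return static_cast<uint8_t>(A) < static_cast<uint8_t>(B);
}

// Walks every 8-bit pair and reports whether slt and ult agree whenever
// the operands have the same sign.
inline bool signedAndUnsignedAgreeOnSameSign() {
  for (int A = -128; A <= 127; ++A)
    for (int B = -128; B <= 127; ++B) {
      int8_t SA = static_cast<int8_t>(A), SB = static_cast<int8_t>(B);
      if (sameSign(SA, SB) && signedLess(SA, SB) != unsignedLess(SA, SB))
        return false;
    }
  return true;
}
```

When the signs differ the two predicates can disagree, for example on the pair (-1, 1).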
Yingwei Zheng
16ec534989
[ValueTracking] Handle and/or of conditions in computeKnownFPClassFromContext (#118257)
Fix a typo introduced by
https://github.com/llvm/llvm-project/pull/83161.
This patch also supports decomposition of and/or expressions in
`computeKnownFPClassFromContext`.
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=688bb432c4b618de69a1d0e7807077a22f15762a&to=07493fc354b686f0aca79d6f817091a757bd7cd5&stat=instructions:u
2024-12-02 21:00:55 +08:00
Veera
979a0356d4
[InstCombine] Fold X Pred C2 ? X BOp C1 : C2 BOp C1 to min/max(X, C2) BOp C1 (#116888)
Fixes #82414.

General Proof: https://alive2.llvm.org/ce/z/ERjNs4 
Proof for Tests: https://alive2.llvm.org/ce/z/K-934G

This PR transforms `select` instructions of the form `select (Cmp X C1)
(BOp X C2) C3` to `BOp (min/max X C1) C2` iff `C3 == BOp C1 C2`.

This helps in eliminating a noop loop in
https://github.com/rust-lang/rust/issues/123845 but does not improve
optimizations.
2024-12-02 09:33:45 +01:00
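A concrete instance of this fold, as a sketch: take Cmp to be unsigned less-than, BOp to be wrapping addition, C1 = 10, C2 = 5 and C3 = 15 (so C3 == BOp C1 C2 holds). The constants and function names are chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// select (X <u 10), (X + 5), 15
inline uint32_t selectForm(uint32_t X) { return X < 10u ? X + 5u : 15u; }

// (umin X, 10) + 5
inline uint32_t minForm(uint32_t X) { return std::min<uint32_t>(X, 10u) + 5u; }

// Below 10 both forms compute X + 5; at or above 10 both yield 10 + 5.
inline void checkFold(uint32_t X) { assert(selectForm(X) == minForm(X)); }
```

The fold is only valid because the false arm of the select equals the binary operation applied to the comparison constant.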
Tex Riddell
818d715989
[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs (#113637)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Return true for atan2 from isTriviallyVectorizable
- Add atan2 to VecFuncs.def for massv and accelerate libraries.
- Add atan2 to hasOptimizedCodeGen
- Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp
llvm::getIntrinsicForCallSite and update vectorization tests
- Add atan2 name check to isLoweredToCall in
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
- Note: there's no test coverage for these names in isLoweredToCall, except that Transforms/TailCallElim/inf-recursion.ll is impacted by the "fabs" case

Thanks to @jroelofs for the atan2 accelerate veclib and associated test
additions, plus the hasOptimizedCodeGen addition.

Part of: Implement the atan2 HLSL Function #70096.
2024-11-08 16:07:38 -08:00
Matt Arsenault
d74b1f029d
ValueTracking: Do not return nullptr from getUnderlyingObject (#115258)
Fixup for 29a5c054e6d56a912ed5ba3f84e8ca631872db8b. The failure case
should return the last value found.
2024-11-07 07:35:33 -08:00
Ramkumar Ramachandra
fef6613e9f
ValueTracking: simplify udiv/urem recurrences (#108973)
A urem recurrence has the property that the result can never exceed the
start value. A udiv recurrence has the property that the result can
never exceed either the start value or the numerator, whichever is
greater. Implement a simplification based on these properties.
2024-11-07 11:41:35 +00:00
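The bounds described here can be observed on a simple loop. This sketch assumes the recurrence value is the left-hand operand (the dividend or numerator) at every step, which is only one of the shapes the commit may handle; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// x_{n+1} = x_n urem d_n: the value never exceeds the start value.
inline uint32_t uremRecurrence(uint32_t Start,
                               const std::vector<uint32_t> &Divisors) {
  uint32_t X = Start;
  for (uint32_t D : Divisors) {
    assert(D != 0 && "division by zero is undefined");
    X %= D;
    assert(X <= Start && "urem recurrence stays at or below the start value");
  }
  return X;
}

// x_{n+1} = x_n udiv d_n: with the recurrence as the numerator, the value
// also never exceeds the start value.
inline uint32_t udivRecurrence(uint32_t Start,
                               const std::vector<uint32_t> &Divisors) {
  uint32_t X = Start;
  for (uint32_t D : Divisors) {
    assert(D != 0 && "division by zero is undefined");
    X /= D;
    assert(X <= Start && "udiv recurrence stays at or below the start value");
  }
  return X;
}
```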
Nikita Popov
2d7f34f2a5
[ValueTracking] Don't special case depth for phi of select (#114996)
As discussed on
https://github.com/llvm/llvm-project/pull/114689#pullrequestreview-2411822612
and following, there is no principled reason why the phi of select case
should have a different recursion limit than the general case. There may
still be fan-out, and there may still be indirect recursion. Revert that
part of #113707.
2024-11-07 10:14:28 +01:00
Matt Arsenault
29a5c054e6
ValueTracking: Allow getUnderlyingObject to look at vectors (#114311)
We can identify some easy vector of pointer cases, such as
a getelementptr with a scalar base.
2024-11-06 17:14:44 -08:00
Kazu Hirata
236fda550d
[Analysis] Remove unused includes (NFC) (#114936)
Identified with misc-include-cleaner.
2024-11-05 19:11:34 -08:00
Yingwei Zheng
f1e1055c84
[ValueTracking] Compute known bits from recursive select/phi (#113707)
This patch is inspired by
https://github.com/llvm/llvm-project/pull/113686. I found that it
removes a lot of unnecessary "and X, 1" in some applications that
represent boolean values with int.
2024-11-02 15:45:46 +08:00
David Green
0f919444ad
[ValueTracking] Handle recursive phis in knownFPClass (#114008)
As a follow-on to #113686, this breaks the recursion between phi nodes
that have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be
calculated from the classes of p1 and p2.
2024-11-01 13:38:29 +00:00
David Green
9735c05186
[ValueTracking] Compute KnownFP state from recursive select/phi. (#113686)
Given a recursive phi with select:
 %p = phi [ 0, entry ], [ %sel, loop]
 %sel = select %c, %other, %p

The fp state can be calculated using the knowledge that the select/phi
pair can only be the initial state (0 here) or from %other. This adds a
short-cut into computeKnownFPClass for PHI to detect that the select is
recursive back to the phi, and if so use the state from the other
operand.

This helps to address a regression from #83200.
2024-10-31 07:50:44 +00:00
Rohit Aggarwal
dfb60bb919
Adding more vector calls for -fveclib=AMDLIBM (#109662)
AMD has its own implementation of vector calls.
New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos.
Please refer to https://github.com/amd/aocl-libm-ose

---------

Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
2024-10-29 10:09:55 +00:00
Yingwei Zheng
aad3a1630e
[ValueTracking] Respect samesign flag in isKnownInversion (#112390)
In https://github.com/llvm/llvm-project/pull/93591 we introduced
`isKnownInversion` and assumed that `X` being poison implies `Y` is poison
because they share common operands. But after the introduction of `samesign`,
this assumption no longer holds if `X` is an icmp with the `samesign` flag.

Alive2 link: https://alive2.llvm.org/ce/z/rj3EwQ (Please run it locally
with this patch and https://github.com/AliveToolkit/alive2/pull/1098).

This approach is the most conservative way in my mind to address this
problem. If `X` has `samesign` flag, it will check if `Y` also has this
flag and make sure constant RHS operands have the same sign.

Fixes https://github.com/llvm/llvm-project/issues/112350.
2024-10-17 00:27:21 +08:00
Alexey Bader
583fa4f5b7
[InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088)
Today, InstCombine can fold fcmp+select patterns to minnum/maxnum
intrinsics when the nnan and nsz flags are set. The ordering of the
operands in both the fcmp and select instructions is important for the
folding to occur.

maxnum patterns:
1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge}
2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult}

The second pattern is supposed to make the order of the operands in the
select instruction irrelevant. However, the pattern matching code uses
the CmpInst::getInversePredicate method to invert the comparison
predicate. This method doesn't take into account the fast-math flags,
which can lead to missing the folding opportunity.

The patch extends the pattern matching code to handle unordered fcmp
instructions. This allows the folding to occur even when the select
instruction has the operands in the inverse order.

New maxnum patterns:
1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge}
2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt}

The same changes are applied to the minnum intrinsic.
2024-10-15 22:05:16 +04:00
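The maxnum patterns listed above can be mirrored in C++ comparisons, as a rough sketch. Here `a > b` plays the role of ogt, `a <= b` the role of ole, and the negated forms `!(a <= b)` and `!(a > b)` play the roles of ugt and ule. The check assumes no NaN inputs, modelling the nnan flag; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Existing pattern 1: (a ogt b) ? a : b
inline double selectOgt(double a, double b) { return (a > b) ? a : b; }

// Existing pattern 2: (a ule b) ? b : a, where ule is "not ogt"
inline double selectUleSwapped(double a, double b) { return !(a > b) ? b : a; }

// New pattern 1: (a ugt b) ? a : b, where ugt is "not ole"
inline double selectUgt(double a, double b) { return !(a <= b) ? a : b; }

// New pattern 2: (a ole b) ? b : a
inline double selectOleSwapped(double a, double b) { return (a <= b) ? b : a; }

// With NaNs excluded, every form yields the larger operand.
inline bool allFormsMatchFmax(double a, double b) {
  assert(!std::isnan(a) && !std::isnan(b) && "models the nnan flag");
  double Expected = std::fmax(a, b);
  return selectOgt(a, b) == Expected && selectUleSwapped(a, b) == Expected &&
         selectUgt(a, b) == Expected && selectOleSwapped(a, b) == Expected;
}
```

Ordered and unordered predicates only differ on NaN operands, which is why the nnan flag makes the extra patterns safe to fold.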
Ramkumar Ramachandra
1c6c850937
InstCombine: extend select-equiv to support vectors (#111966)
foldSelectEquivalence currently doesn't support GVN-like replacements on
vector types. Put in the checks for potentially lane-crossing
operations, and lift the limitation.
2024-10-15 11:10:45 +01:00
Ramkumar Ramachandra
bdf241cab3
ValueTracking: handle more ops in isNotCrossLaneOperation (#112183)
Reuse llvm::isTriviallyVectorizable in llvm::isNotCrossLaneOperation, in
order to get it to handle more intrinsics.

Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/XSV_GT
2024-10-14 14:08:12 +01:00
Ramkumar Ramachandra
c5f82f7893
ValueTracking: introduce llvm::isNotCrossLaneOperation (#112011)
Factor out and unify common code from InstSimplify and InstCombine that
partially guard against cross-lane vector operations into
llvm::isNotCrossLaneOperation in ValueTracking.

Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka
2024-10-14 11:37:30 +01:00
Ramkumar Ramachandra
78089d5845
ValueTracking: refactor recurrence-matching (NFC) (#109659) 2024-10-04 17:45:29 +01:00
Florian Hahn
dce5bf8efc
[ValueTracking] AllowEphemerals for alignment assumptions. (#108632)
Allow AllowEphemerals in isValidAssumeForContext, as the CxtI might
be the producer of the pointer in the bundle. At the moment, align
assumptions aren't optimized away.

This allows using the assumption in the computeKnownBits call in
getConstantMultipleImpl.

We could extend the computeKnownBits API to allow callers to specify if
ephemerals are allowed, if the info from computeKnownBitsFromContext is
used to remove alignment assumptions.

PR: https://github.com/llvm/llvm-project/pull/108632
2024-10-03 16:02:34 +01:00
Nikita Popov
eb85285727
[ValueTracking] mul nuw nsw with factor sgt 1 is non-negative (#110803)
Proof: https://alive2.llvm.org/ce/z/bC0eJf
2024-10-02 15:16:56 +02:00
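The claim in this title can be checked by brute force at a small width. The sketch below models 8-bit `mul nuw nsw X, C` with a constant factor C > 1 and confirms that whenever neither the unsigned nor the signed product wraps, the result is non-negative. The function name is hypothetical; the linked Alive2 proof is the authoritative argument.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustive 8-bit check: for every X and every factor C in [2, 127], if the
// product fits both as unsigned (nuw) and as signed (nsw), it is >= 0.
inline bool mulNuwNswWithFactorAboveOneIsNonNegative() {
  for (int X = -128; X <= 127; ++X)
    for (int C = 2; C <= 127; ++C) {
      int SignedProduct = X * C;
      unsigned UnsignedProduct =
          static_cast<unsigned>(static_cast<uint8_t>(X)) *
          static_cast<unsigned>(C);
      bool NoSignedWrap = SignedProduct >= -128 && SignedProduct <= 127;
      bool NoUnsignedWrap = UnsignedProduct <= 255;
      if (NoSignedWrap && NoUnsignedWrap && SignedProduct < 0)
        return false;
    }
  return true;
}
```

The intuition: a negative X is at least 128 when read as unsigned, so multiplying it by 2 or more always wraps unsigned, which nuw rules out.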
Ramkumar Ramachandra
d432e22b2f
ValueTracking: strip stray break in recur-match (#109794)
There is a stray break statement in the recurrence-handling code in
computeKnownBitsFromOperator, that seems to be unintended. Strip this
statement so that we have the opportunity to go through the rest of
phi-handling code, and refine KnownBits further.
2024-10-02 11:18:48 +01:00
Nikita Popov
dd599e92a6
[ValueTracking] Support assume in entry block without DT (#109264)
isValidAssumeForContext() handles a couple of trivial cases even if no
dominator tree is available. This adds one more for the case where there
is an assume in the entry block, and a use in some other block. The
entry block always dominates all blocks.

As having context instruction but not having DT is fairly rare, there is
not much impact. Only test change is in assume-builder.ll, where fewer
redundant assumes are generated. I've found having this special case is
useful for an upcoming change though.
2024-09-19 14:24:55 +02:00
Yingwei Zheng
2ca75df1d1
[ValueTracking] Infer is-power-of-2 from dominating conditions (#107994)
Addresses downstream rustc issue:
https://github.com/rust-lang/rust/issues/129795
2024-09-13 08:54:29 +08:00
Yingwei Zheng
ffcff4af59
[ValueTracking] Infer is-power-of-2 from assumptions. (#107745)
This patch tries to infer is-power-of-2 from assumptions. I don't see
that this kind of assumption exists in my dataset.
Related issue: https://github.com/rust-lang/rust/issues/129795

Close https://github.com/llvm/llvm-project/issues/58996.
2024-09-10 10:38:21 +08:00
Nikita Popov
9707b98e57 [ConstantRange] Perform increment on APInt (NFC)
This handles the edge case where BitWidth is 1 and doing the
increment gets a value that's not valid in that width, while we
just want wrap-around.

Split out of https://github.com/llvm/llvm-project/pull/80309.
2024-09-05 16:11:00 +02:00
Simon Pilgrim
6c8746b6e3
[Analysis] getIntrinsicForCallSite - add vectorization support for acos/asin/atan and cosh/sinh/tanh libcalls (#106844)
Followup to #106584 - ensure acos/asin/atan and cosh/sinh/tanh libcalls correctly map to the llvm intrinsic equivalents
2024-09-03 10:05:56 +01:00
Alex MacLean
369d8148e0
[ValueTracking] use KnownBits to compute fpclass from bitcast (#97762)
When we encounter a bitcast from an integer type we can use the
information from `KnownBits` to glean some information about the
fpclass:
- If the sign bit is known, we can transfer this information over. 
- If the float is IEEE format and enough of the bits are known, we may
  be able to prove or rule out some fpclasses such as NaN, Zero, or Inf.
2024-08-30 07:34:49 -07:00
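A minimal sketch of the two bullet points above for a 32-bit IEEE float. `KnownBits32` is a hypothetical stand-in with a known-zero and a known-one mask; it is not LLVM's `KnownBits` class, and the helper functions are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: a bit set in Zero is known to be 0, a bit set in
// One is known to be 1.
struct KnownBits32 {
  uint32_t Zero;
  uint32_t One;
};

constexpr uint32_t SignMask = 0x80000000u;
constexpr uint32_t ExponentMask = 0x7F800000u;

// A bit cannot be known zero and known one at the same time.
inline bool isConsistent(KnownBits32 K) { return (K.Zero & K.One) == 0; }

// The sign bit carries over directly from the integer to the float.
inline bool signKnownNegative(KnownBits32 K) {
  assert(isConsistent(K));
  return (K.One & SignMask) != 0;
}

// NaN and Inf both need an all-ones exponent, so one exponent bit known to
// be zero rules out both classes.
inline bool knownNeverNaNOrInf(KnownBits32 K) {
  assert(isConsistent(K));
  return (K.Zero & ExponentMask) != 0;
}

// +0.0 and -0.0 have every non-sign bit clear, so one such bit known to be
// one rules out Zero.
inline bool knownNeverZero(KnownBits32 K) {
  assert(isConsistent(K));
  return (K.One & ~SignMask) != 0;
}
```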
Noah Goldstein
42ce62800d [ValueTracking] Handle incompatible types instead of asserting in isKnownNonEqual; NFC
Downstream hit this assert. Since it doesn't really make any
difference, just change the code to return false.
2024-08-19 15:48:45 -07:00
Kazu Hirata
217f5804ca
[Analysis] Use a range-based for loop (NFC) (#104445) 2024-08-15 17:59:23 -07:00
Nikita Popov
afa0f53f96 [ValueTracking] Fix f16 fptosi range for large integers
We were missing the signed flag on the negative value, so the
range was incorrectly interpreted for integers larger than 64-bit.

Split out from https://github.com/llvm/llvm-project/pull/80309.
2024-08-15 18:18:19 +02:00
Simon Pilgrim
11ba72e651
[KnownBits] Add KnownBits::add and KnownBits::sub helper wrappers. (#99468) 2024-08-12 10:21:28 +01:00
zhongyunde 00443407
2bd568fecc [ValueTracking] Infer relationship for the select with SLT 2024-08-06 10:30:04 +08:00
zhongyunde 00443407
3023713014 [ValueTracking] Infer relationship for the select with ICmp
x -nsw y < -C is false when x > y and C >= 0
Alive2 proof for sgt, sge : https://alive2.llvm.org/ce/z/tupvfi
Note: It only really makes sense in the context of signed comparison for
      "X - Y must be positive if X >= Y and no overflow".

Fixes https://github.com/llvm/llvm-project/issues/54735
2024-08-06 10:30:03 +08:00
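The rule "x -nsw y < -C is false when x > y and C >= 0" can be sanity-checked at 8 bits. The sketch below enumerates every pair with x > y whose difference does not overflow the signed 8-bit range (modelling nsw) and every C >= 0; the function name is hypothetical, and the linked Alive2 proof covers the general case.

```cpp
#include <cassert>

// Exhaustive 8-bit check that (X - Y) < -C never holds when X > Y, C >= 0,
// and the subtraction does not wrap as a signed 8-bit operation.
inline bool subNswLessThanNegCIsAlwaysFalse() {
  for (int X = -128; X <= 127; ++X)
    for (int Y = -128; Y < X; ++Y) {  // enforces X > Y
      int Diff = X - Y;
      if (Diff > 127)
        continue;                     // would violate nsw at 8 bits
      assert(Diff > 0 && "X > Y without overflow gives a positive difference");
      for (int C = 0; C <= 127; ++C)
        if (Diff < -C)
          return false;
    }
  return true;
}
```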
Kazu Hirata
7df9da7d78
[llvm] Construct SmallVector with ArrayRef (NFC) (#101872) 2024-08-04 08:54:23 -07:00
Vitaly Buka
945dd9a740
[NFC][Load] Find better place for mustSuppressSpeculation (#100794)
And extract `suppressSpeculativeLoadForSanitizers`.

For #100639.
2024-07-29 10:29:02 -07:00
Yingwei Zheng
62e9f40949
[PatternMatch] Use m_SpecificCmp matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
  stockfish/movegen.ll 2541620819 2538599412 -0.12%
  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
  abc/luckySwap.c.ll 581173720 580581935 -0.10%
  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
2024-07-29 10:04:06 +08:00