1027 Commits

Author SHA1 Message Date
Yingwei Zheng
caa2258250
[LLVM] Remove nuw neg (#86295)
This patch removes APIs that creating NUW neg. It is a trivial case
because `sub nuw 0, X` always gets simplified into zero.
I believe there is no optimization opportunities in the real-world
applications that we can take advantage of the nuw flag.

Motivated by
https://github.com/llvm/llvm-project/pull/84792#discussion_r1524891134.

Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u
2024-03-26 20:56:16 +08:00
Noah Goldstein
b3ee127e7d [InstCombine] integrate N{U,S}WAddLike into existing folds
Just went a quick replacement of `N{U,S}WAdd` with the `Like` variant
that old matches `or disjoint`

Closes #86082
2024-03-21 13:03:38 -05:00
Stephen Tozer
ffd08c7759
[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:

- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.

Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:

```
  DPValue -> DbgVariableRecord
  DPVal -> DbgVarRec
  DPV -> DVR
```

Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
2024-03-19 20:07:07 +00:00
Yingwei Zheng
0b59af4d86
[InstCombine] Clear sign-bit of the constant magnitude in copysign (#85787)
Alive2: https://alive2.llvm.org/ce/z/vFykcZ
Address the comment
https://github.com/llvm/llvm-project/pull/85772#discussion_r1530179048.

Unfortunately, non-splat vector constants are not supported because we
haven't implemented constant folding of fabs with vector operands.
2024-03-20 03:28:19 +08:00
Artem Tyurin
141145232f
[IRBuilder] Fold binary intrinsics (#80743)
Fixes https://github.com/llvm/llvm-project/issues/61240.
2024-03-15 09:58:25 +01:00
Yingwei Zheng
83d178843f
[InstCombine] Set zero_is_poison for ctlz/cttz if they are only used as shift amounts (#85035)
Alive2: https://alive2.llvm.org/ce/z/r-67t9

It would improve the codegen if the target doesn't provide a defined
value for ctlz/cttz with zero.
2024-03-13 21:52:40 +08:00
elhewaty
3f302eaca4
[InstCombine] Fold usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0 (#82280)
- Fixes: https://github.com/llvm/llvm-project/issues/82177
- Alive2: https://alive2.llvm.org/ce/z/Q7mMC3
2024-03-11 15:10:40 +01:00
Jeremy Morse
2fe81edef6 [NFC][RemoveDIs] Insert instruction using iterators in Transforms/
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
 * Almost all call-sites just call getIterator on an instruction
 * Several make use of an existing iterator (scenarios where the code is
   actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

Noteworthy changes:
 * FindInsertedValue now takes an optional iterator rather than an
   instruction pointer, as we need to always insert with iterators,
 * I've added a few iterator-taking versions of some value-tracking and
   DomTree methods -- they just unwrap the iterator. These are purely
   convenience methods to avoid extra syntax in some passes.
 * A few calls to getNextNode become std::next instead (to keep in the
   theme of using iterators for positions),
 * SeparateConstOffsetFromGEP has it's insertion-position field changed.
   Noteworthy because it's not a purely localised spelling change.

All this should be NFC.
2024-03-05 15:12:22 +00:00
Yingwei Zheng
a1a590ef12
[InstCombine] Fix miscompilation in PR83947 (#83993)
762f762504/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (L394-L407)

Comment from @topperc:
> This transforms assumes the mask is a non-zero splat. We only know its
a splat and not provably all 0s. The mask is a constexpr that includes
the address of the global variable. We can't resolve the constant
expression to an exact value.

Fixes #83947.
2024-03-05 22:34:04 +08:00
AtariDreams
081882eb9c
[InstCombine] Remove m_OneUse requirement for max, but not min (#81505)
If it is ever determined that min doesn't need one-use, then we can
remove the one-use requirement entirely.
2024-03-05 02:45:45 +08:00
Yingwei Zheng
641d160ad2
[InstCombine] Fold umax(smax)/smin(umin) with non-negative constants (#82929)
This patch extends `reassociateMinMaxWithConstants` to fold the
following patterns:
```
umax (smax X, nneg C0), nneg C1 --> smax X, (umax C0, C1)
smin (umin X, nneg C0), nneg C1 --> umin X, (smin/umin C0, C1)
```
Alive2: https://alive2.llvm.org/ce/z/wfEj-e

Address the comment
https://github.com/llvm/llvm-project/pull/82472#pullrequestreview-1896922897.
2024-02-26 03:26:55 +08:00
elhewaty
69d4890f80
[InstCombine] Fold abs(a * abs(b)) --> abs(a * b) (#78110)
Proof: https://alive2.llvm.org/ce/z/hfbEra
Fixes: https://github.com/llvm/llvm-project/issues/73211
2024-02-19 11:54:51 +01:00
ostannard
44706bd4f0
[InstCombine] Don't add fcmp instructions to strictfp functions (#81498)
The strictfp attribute has the requirement that "LLVM will not introduce
any new floating-point instructions that may trap". The llvm.is.fpclass
intrinsic is documented as "The function never raises floating-point
exceptions", and the fcmp instruction may raise one, so we can't
transform the former into the latter in functions with the strictfp
attribute.
2024-02-13 09:13:22 +00:00
Florian Hahn
c609846155
[TBAA] Extract logic to use TBAA tag for field of !tbaa.struct (NFC). (#81284) 2024-02-12 11:19:02 +00:00
AtariDreams
3746294451
[Transforms] Add more cos combinations to SimplifyLibCalls and InstCombine (#79699)
Add cos(fabs(x)) -> cos(x) and cos(copysign(x, y)) -> cos(x).
2024-02-06 00:30:18 +05:30
Yingwei Zheng
930996e9e4
[ValueTracking][NFC] Pass SimplifyQuery to computeKnownFPClass family (#80657)
This patch refactors the interface of the `computeKnownFPClass` family
to pass `SimplifyQuery` directly.
The motivation of this patch is to compute known fpclass with
`DomConditionCache`, which was introduced by
https://github.com/llvm/llvm-project/pull/73662. With
`DomConditionCache`, we can do more optimization with context-sensitive
information.

Example (extracted from
[fmt/format.h](e17bc67547/include/fmt/format.h (L3555-L3566))):
```
define float @test(float %x, i1 %cond) {
  %i32 = bitcast float %x to i32
  %cmp = icmp slt i32 %i32, 0
  br i1 %cmp, label %if.then1, label %if.else

if.then1:
  %fneg = fneg float %x
  br label %if.end

if.else:
  br i1 %cond, label %if.then2, label %if.end

if.then2:
  br label %if.end

if.end:
  %value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
  %ret = call float @llvm.fabs.f32(float %value)
  ret float %ret
}
```
We can prove the signbit of `%value` is always zero. Then the fabs can
be eliminated.
2024-02-06 02:30:12 +08:00
Yingwei Zheng
50e80e06d1
[ValueTracking] Merge cannotBeOrderedLessThanZeroImpl into computeKnownFPClass (#76360)
This patch merges the logic of `cannotBeOrderedLessThanZeroImpl` into
`computeKnownFPClass` to improve the signbit inference.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-01-31 18:26:50 +08:00
Stephen Tozer
632f44e5ed
[RemoveDIs][DebugInfo] Handle DPVAssign in most transforms (#78986)
This patch trivially updates various opt passes to handle DPVAssigns. In
all cases, this means some combination of generifying existing code to
handle DPValues and DbgAssignIntrinsics, iterating over DPValues where
previously we did not, or duplicating code for DbgAssignIntrinsics to
the equivalent DPValue function (in inlining and salvageDebugInfo).
2024-01-23 16:16:59 +00:00
Yingwei Zheng
3d795bdd4d
[InstCombine] Handle a bitreverse idiom which ends with a bswap (#77677)
This patch handles the following `bitreverse` idiom, which is found in
8bd6445acc/absl/crc/internal/crc.cc (L75-L80):

```
uint32_t ReverseBits(uint32_t bits) {
  bits = (bits & 0xaaaaaaaau) >> 1 | (bits & 0x55555555u) << 1;
  bits = (bits & 0xccccccccu) >> 2 | (bits & 0x33333333u) << 2;
  bits = (bits & 0xf0f0f0f0u) >> 4 | (bits & 0x0f0f0f0fu) << 4;
  return absl::gbswap_32(bits);
}
```

Alive2: https://alive2.llvm.org/ce/z/ZYXNmj
2024-01-11 15:15:12 +08:00
Gabriel Baraldi
a87fa7f0ca
[InstCombine] Dont throw away noalias/alias scope metadata when inlining memcpys (#74805)
This was found in julia when we changed some operations from explicit
loads + stores to memcpys. While applying it to both the src and the
dest seems weird, thats what we do for normal TBAA.
2024-01-04 17:04:31 +01:00
Yingwei Zheng
0ce193708c
[InstCombine] Refactor folding of commutative binops over select/phi/minmax (#76692)
This patch cleans up the duplicate code for folding commutative binops
over `select/phi/minmax`.

Related commits:
+ select support:
88cc35b27e
+ phi support:
8674a023bc
+ minmax support:
624973806c
2024-01-04 15:11:28 +08:00
hstk30-hw
4b2f1184fc
Skip tranformConstExprCastCall for naked function (#76496)
Fix this issue https://github.com/llvm/llvm-project/issues/72843 .

For naked function, assembly might be using an argument, or otherwise
rely on the frame layout, so don't transformConstExprCastCall
2024-01-01 22:52:13 +08:00
Yingwei Zheng
345d7b1618
[InstCombine] Fold minmax intrinsic using KnownBits information (#76242)
This patch tries to fold minmax intrinsic by using
`computeConstantRangeIncludingKnownBits`.
Fixes regression in
[_karatsuba_rec:cpython/Modules/_decimal/libmpdec/mpdecimal.c](c31943af16/Modules/_decimal/libmpdec/mpdecimal.c (L5460-L5462)),
which was introduced by #71396.
See also
https://github.com/dtcxzyw/llvm-opt-benchmark/issues/16#issuecomment-1865875756.

Alive2 for splat vectors with undef: https://alive2.llvm.org/ce/z/J8hKWd
2023-12-23 04:41:32 +08:00
Chia
8674a023bc
[InstCombine] fold (Binop phi(a, b) phi(b, a)) -> (Binop a, b) while Binop is commutative. (#75765)
Alive2 proof: https://alive2.llvm.org/ce/z/2P8gq-
This patch closes #73905
2023-12-21 22:47:21 +08:00
Nikita Popov
465ecf872e [InstCombine] Rename UndefElts -> PoisonElts (NFC)
In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
2023-12-18 12:36:19 +01:00
Benjamin Kramer
60aeea21fd [InstCombine] Fix uninitialized variable usage
m_Specific can only be used if the previous check suceeded. Found by
msan.
2023-12-13 16:31:19 +01:00
Sizov Nikita
88cc35b27e
[InstCombine] Fold binop (select cond, a, b), (select cond, b, a) to binop a, b (#74953)
```
CommutativeBinOp(select(V, A, B), select(V, B, A) --> CommutativeBinOp(A, B)
CommutativeIntrinsicCall(select(V, A, B), select(V, B, A), ...) --> CommutativeIntrinsicCall(A, B, ...)
```

https://alive2.llvm.org/ce/z/8CDUZ4

Closes #73904
2023-12-13 14:09:27 +08:00
Sizov Nikita
827f8a7ef6
Add opt with ctlz and shifts of power of 2 constants (#74175)
This patch does the following simplifications:
```
cttz(shl(C, X), 1) -> add(cttz(C, 1), X)
cttz(lshr exact(C, X), 1) -> sub(cttz(C, 1), X)
ctlz(lshr(C, X), 1) --> add(ctlz(C, 1), X)
ctlz(shl nuw (C, X), 1) --> sub(ctlz(C, 1), X)
```
Alive2: https://alive2.llvm.org/ce/z/9KHlKc
Closes #41333
2023-12-08 15:06:23 +08:00
Jeremy Morse
2425e2940e
[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149)
Part of the "RemoveDIs" project to remove debug intrinsics requires
passing block-positions around in iterators rather than as instruction
pointers, allowing some debug-info to reside in BasicBlock::iterator.
This means getInsertionPointAfterDef has to return an iterator, and as
it can return no-instruction that means returning an optional iterator.

This patch changes the signature for getInsertionPtAfterDef and then
patches up the various places that use it to handle the different type.
This would overall be an NFC patch, however in
InstCombinerImpl::freezeOtherUses I've started skipping any debug
intrinsics at the returned insert-position. This should not have any
_meaningful_ effect on the compiler output: at worst it means variable
assignments that are skipped will now cover the freeze instruction and
anything inserted before it, which should be inconsequential.

Sadly: this makes the function signature ugly. This is probably the
ugliest piece of fallout for the "RemoveDIs" work, but it serves the
overall purpose of improving compile times and not allowing `-g` to
affect compiler output, so should be worthwhile in the end.
2023-11-30 12:19:57 +00:00
Noah Goldstein
b7c0f79926 [InstCombine] Replace isFreeToInvert + CreateNot with getFreelyInverted
This is nearly an NFC, the only change is potentially to order that
values are created/names.

Otherwise it is a slight speed boost/simplification to avoid having to
go through the `getFreelyInverted` recursive logic twice to simplify
the extra `not` op.
2023-11-20 17:59:27 -06:00
Tom Stellard
2750a22745
Passes: Consolidate EnableKnowledgeRetention declarations into a header file (#71695) 2023-11-13 11:03:49 -08:00
Noah Goldstein
cc8341872d [InstCombine] Preserve return attributes when merging llvm.ptrmask
If we have assosiated attributes i.e `([ret_attrs] (ptrmask (ptrmask
p0, m0), m1))` we should preserve `[ret_attrs]` when combining the two
`llvm.ptrmask`s.

Differential Revision: https://reviews.llvm.org/D156638
2023-11-01 23:50:36 -05:00
Noah Goldstein
51abbf98d1 [InstCombine] Deduce align and nonnull return attributes for llvm.ptrmask
We can deduce the former based on the mask / incoming pointer
alignment.  We can set the latter based if know the result in non-zero
(this is essentially just caching our analysis result).

Differential Revision: https://reviews.llvm.org/D156636
2023-11-01 23:50:35 -05:00
Noah Goldstein
edb9e9a5fb [InstCombine] Implement SimplifyDemandedBits for llvm.ptrmask
Logic basically copies 'and' but we can't return a constant if the
result == rhs (mask) so that case is skipped.
2023-11-01 23:50:35 -05:00
Nikita Popov
0b5e0fb62d [InstCombine] Avoid some uses of ConstantExpr::getIntegerCast() (NFC)
Use IRBuilder or ConstantFolding instead.
2023-11-01 11:41:50 +01:00
Nikita Popov
eb86de63d9
[IR] Require that ptrmask mask matches pointer index size (#69343)
Currently, we specify that the ptrmask intrinsic allows the mask to have
any size, which will be zero-extended or truncated to the pointer size.

However, what semantics of the specified GEP expansion actually imply is
that the mask is only meaningful up to the pointer type *index* size --
any higher bits of the pointer will always be preserved. In other words,
the mask gets 1-extended from the index size to the pointer size. This
is also the behavior we want for CHERI architectures.

This PR makes two changes:
* It spells out the interaction with the pointer type index size more
explicitly.
* It requires that the mask matches the pointer type index size. The
intention here is to make handling of this intrinsic more robust, to
avoid accidental mix-ups of pointer size and index size in code
generating this intrinsic. If a zero-extend or truncate of the mask is
desired, it should just be done explicitly in IR. This also cuts down on
the amount of testing we have to do, and things transforms needs to
check for.

As far as I can tell, we don't actually support pointers with different
index type size at the SDAG level, so I'm just asserting the sizes match
there for now. Out-of-tree targets using different index sizes may need
to adjust that code.
2023-10-24 09:54:29 +02:00
Kerry McLaughlin
b0cc47c959
[InstCombine] Remove scalable vector extracts to and from the same type (#69702)
visitCallInst already looks for fixed width vector extracts where number of
elements in the source and destination types are equal. This patch modifies
the function to also identify scalable extracts which can be removed.
2023-10-23 11:21:49 +01:00
Nikita Popov
d4300154b6 Revert "[ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)"
This reverts commit b5743d4798b250506965e07ebab806a3c2d767cc.

This causes some minor compile-time impact. Revert for now, better
to do the change more gradually.
2023-10-16 14:04:09 +02:00
Nikita Popov
b5743d4798 [ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)
Remove the old overloads that accept KnownBits by reference, in
favor of those that return it by value.
2023-10-16 13:00:31 +02:00
Nikita Popov
6cd5eb1f54 [InstCombine] Avoid some uses of ConstantExpr::getZExt() (NFC)
Add helpers getLosslessUnsignedTrunc/getLosslessSignedTrunc for
this common pattern.
2023-09-28 17:02:33 +02:00
Matt Arsenault
07acfe3a4d
ADT: Replace FPClassTest fabs with inverse_fabs and unknown_sign (#66390) 2023-09-14 19:46:53 +03:00
Jeremy Morse
d529943a27 [NFC][RemoveDIs] Prefer iterators over inst-pointers in InstCombine
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. These call-sites where I've changed the
spelling are those that necessary to build a stage2clang to produce an
identical binary in the coming no-debug-intrinsics mode.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152543
2023-09-11 15:04:51 +01:00
Qi Hu
1a65cd3fcf [InstCombine] Optimize implementations of min/max for bool
umin.i1 -> and : https://alive2.llvm.org/ce/z/6FNH6k
smin.i1 -> or : https://alive2.llvm.org/ce/z/h96S6o
umax.i1 -> or : https://alive2.llvm.org/ce/z/XHdeVk
smax.i1 -> and : https://alive2.llvm.org/ce/z/fkxKJx
umin.v4i1 -> and : https://alive2.llvm.org/ce/z/yV4VgP
smin.v4i1 -> or : https://alive2.llvm.org/ce/z/e9TF68
umax.v4i1 -> or : https://alive2.llvm.org/ce/z/tfNyfK
smax.v4i1 -> and : https://alive2.llvm.org/ce/z/0__Af2

Reviewed By: goldstein.w.n, bryanpkc

Differential Revision: https://reviews.llvm.org/D158915
2023-09-07 10:28:54 -04:00
Kazu Hirata
83e6931827 [llvm] Use llvm::is_contained (NFC) 2023-09-02 09:32:46 -07:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Matt Arsenault
5ae881ff0a InstCombine: Fold out scale-if-denormal pattern
Fold select (fcmp oeq x, 0), (fmul x, y), x => x

This cleans up a pattern left behind by denormal range checks under
denormals are zero.

The pattern starts out as something like:
  x = x < smallest_normal ? x * K : x;

The comparison folds to an == 0 when the denormal mode treats input
denormals as zero. This makes library denormal checks free after
linked into DAZ enabled code.

alive2 is mostly happy with this, but there are some issues. First,
there are many reported failures in some of the negative tests that
happen to trigger some preexisting canonicalize introducing
combine. Second, alive2 is incorrectly asserting that denormals must
be flushed with the DAZ modes. It's allowed to drop a canonicalize.

https://reviews.llvm.org/D157030
2023-09-01 07:47:12 -04:00
Matt Arsenault
2b582440c1 InstCombine: Fold is.fpclass(x, fcInf) to fabs+fcmp
This is a better canonical form. fcmp and fabs are more widely
understood and fabs can fold for free into some sources.

Addresses todo from D146170

https://reviews.llvm.org/D159084
2023-08-29 17:58:15 -04:00
Zhongyunde
4225f54bf5 [InstCombine] Fold abs of known sign operand when source is sub
abs(x-y) --> x-y where x >= y, done on D122013
abs(x-y) --> y-x where x <= y

proofs: https://alive2.llvm.org/ce/z/KkeEsd

Reviewed By: goldstein.w.n, nikic
Differential Revision: https://reviews.llvm.org/D156499
2023-08-07 11:55:11 +08:00
Matt Arsenault
d74c89fdb4 InstCombine: Drop some typed pointer bitcasts 2023-07-31 08:05:58 -04:00
Matt Arsenault
d388222be2 InstCombine: Drop some typed pointer bitcast handling 2023-07-31 08:05:12 -04:00