397 Commits

Author SHA1 Message Date
Stephen Tozer
86405ed101
[DebugInfo][Reassociate] Preserve DebugLocs when reassociating subs (#114226)
In NegateValue in Reassociate, we return the negation of an existing
value in order to break a subtract into an negate + add, potentially
creating a new instruction to perform the negation, but we neglect to
propagate the DebugLoc of the sub being replaced to the negate
instruction if one is created. This patch adds that propagation.

Found using https://github.com/llvm/llvm-project/pull/107279.
2024-11-08 18:35:03 +00:00
Kazu Hirata
2f55e55101
[Transforms] Use range-based for loops (NFC) (#98725) 2024-07-14 13:44:50 -07:00
Kazu Hirata
37d3f44a58
[Transforms] Use range-based for loops (NFC) (#98465) 2024-07-12 00:05:28 -07:00
Noah Goldstein
6e379de3b1 [Reassociate] Preserve nuw and nsw on mul chains
Basically the same rules as `add` but we also need to ensure all
operands a non-zero.

Proofs: https://alive2.llvm.org/ce/z/jzsYht

Closes #97040
2024-07-01 22:22:36 +08:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Nikita Popov
35eef9f97f [Reassociate] Use poison instead of undef for dummy operands (NFCI)
These will be replaced later.
2024-06-25 12:44:11 +02:00
Nikita Popov
5aaf2ab085 [Reassociate] Avoid use of ConstantExpr::getShl()
Use the constant folding API instead.
2024-06-18 16:59:51 +02:00
Shan Huang
470d59d656
[DebugInfo][Reassociate] Fix missing debug location drop (#95355)
Fix #95343 .
2024-06-17 09:06:20 +08:00
Kazu Hirata
7c6d0d26b1
[llvm] Use llvm::unique (NFC) (#95628) 2024-06-14 22:49:36 -07:00
Yingwei Zheng
645fb04a33
[Reassociate] Use uint64_t for repeat count (#94232)
This patch relands #91469 and uses `uint64_t` for repeat count to avoid
a miscompilation caused by overflow
https://github.com/llvm/llvm-project/pull/91469#discussion_r1623925158.
2024-06-08 22:28:56 +08:00
Yingwei Zheng
22b63b97ff
Revert "[Reassociate] Drop weight reduction to fix issue 91417 (#91469)" (#94210)
Reverts
3bcccb6af6
and
9a282724a2
because #91469 causes a miscompilation
https://github.com/llvm/llvm-project/pull/91469#discussion_r1623925158.
2024-06-03 21:40:06 +08:00
Yingwei Zheng
3bcccb6af6
[Reassociate] Drop weight reduction to fix issue 91417 (#91469)
See the following case: https://alive2.llvm.org/ce/z/A-fBki
```
define i3 @src(i3 %0) {
  %2 = mul i3 %0, %0
  %3 = mul i3 %2, %0
  %4 = mul i3 %3, %0
  %5 = mul nsw i3 %4, %0
  ret i3 %5
}

define i3 @tgt(i3 %0) {
  %2 = mul i3 %0, %0
  %5 = mul nsw i3 %2, %0
  ret i3 %5
}
```


d7aeefebd6
introduced weight reduction during weights combination of the same
operand. As the weight of `%0` changes from 5 to 3, the nsw flag in `%5`
should be dropped.

However, the nsw flag isn't cleared by `RewriteExprTree` since `%5 = mul
nsw i3 %0, %4` is not included in the range of `[ExpressionChangedStart,
ExpressionChangedEnd)`.
```
Calculated Rank[] = 3
Combine negations for:   %2 = mul i3 %0, %0
Calculated Rank[] = 4
Combine negations for:   %3 = mul i3 %0, %2
Calculated Rank[] = 5
Combine negations for:   %4 = mul i3 %0, %3
Calculated Rank[] = 6
Combine negations for:   %5 = mul nsw i3 %0, %4
LINEARIZE:   %5 = mul nsw i3 %0, %4
OPERAND: i3 %0 (1)
ADD USES LEAF: i3 %0 (1)
OPERAND:   %4 = mul i3 %0, %3 (1)
DIRECT ADD:   %4 = mul i3 %0, %3 (1)
OPERAND: i3 %0 (1)
OPERAND:   %3 = mul i3 %0, %2 (1)
DIRECT ADD:   %3 = mul i3 %0, %2 (1)
OPERAND: i3 %0 (1)
OPERAND:   %2 = mul i3 %0, %0 (1)
DIRECT ADD:   %2 = mul i3 %0, %0 (1)
OPERAND: i3 %0 (1)
OPERAND: i3 %0 (1)
RAIn:   mul i3  [ %0, #3] [ %0, #3] [ %0, #3] 
RAOut:  mul i3  [ %0, #3] [ %0, #3] [ %0, #3] 
RAOut after CSE reorder:        mul i3  [ %0, #3] [ %0, #3] [ %0, #3] 
RA:   %5 = mul nsw i3 %0, %4
TO:   %5 = mul nsw i3 %4, %0
RA:   %4 = mul i3 %0, %3
TO:   %4 = mul i3 %0, %0
```

The best way to fix this is to inform `RewriteExprTree` to clear flags
of the whole expr tree when weight reduction happens.

But I find that weight reduction based on Carmichael number never
happens in practice.
See the coverage result
https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/home/dtcxzyw/llvm-project/llvm/lib/Transforms/Scalar/Reassociate.cpp.html#L323

I think it would be better to drop `IncorporateWeight`.

Fixes #91417
2024-05-29 18:09:23 +08:00
Akshay Deodhar
73e22ff3d7
[Reassociate] Preserve NSW flags after expr tree rewriting (#93105)
We can guarantee NSW on all operands in a reassociated add expression
tree when:

- All adds in an add operator tree are NSW, AND either 
  - All add operands are guaranteed to be nonnegative, OR 
  - All adds are also NUW

- Alive2: 
- Nonnegative Operands
	- 3 operands: https://alive2.llvm.org/ce/z/G4XW6Q
	- 4 operands: https://alive2.llvm.org/ce/z/FWcZ6D
 - NUW NSW adds: https://alive2.llvm.org/ce/z/vRUxeC

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-05-28 11:05:38 -07:00
Jeremy Morse
2fe81edef6 [NFC][RemoveDIs] Insert instruction using iterators in Transforms/
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
 * Almost all call-sites just call getIterator on an instruction
 * Several make use of an existing iterator (scenarios where the code is
   actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

Noteworthy changes:
 * FindInsertedValue now takes an optional iterator rather than an
   instruction pointer, as we need to always insert with iterators,
 * I've added a few iterator-taking versions of some value-tracking and
   DomTree methods -- they just unwrap the iterator. These are purely
   convenience methods to avoid extra syntax in some passes.
 * A few calls to getNextNode become std::next instead (to keep in the
   theme of using iterators for positions),
 * SeparateConstOffsetFromGEP has it's insertion-position field changed.
   Noteworthy because it's not a purely localised spelling change.

All this should be NFC.
2024-03-05 15:12:22 +00:00
Jeremy Morse
7e88d51760
[NFC][RemoveDIs] Have CreateNeg only accept iterators (#82999)
Removing debug-intrinsics requires that we always insert with an
iterator, not with an instruction position. To enforce that, we need to
eliminate the `Instruction *` taking functions. It's safe to leave the
insert-at-end-of-block functions as the intention is clear for debug
info purposes (i.e., insert after both instructions and debug-info at
the end of the function).

This patch demonstrates how that needs to happen. At a variety of
call-sites to the `CreateNeg` constructor we need to consider:
* Has this instruction been selected because of the operation it
performs? In that case, just call `getIterator` and pass an iterator in.
* Has this instruction been selected because of it's position? If so, we
need to keep the iterator identifying that position (see the 3rd hunk
changing Reassociate.cpp, although it's coincidentally not debug-info
significant).

This also demonstrates what we'll try and do with the constructor
methods going forwards: have one fully explicit set of parameters
including iterator, and another with default-arguments where the
block-to-insert-into argument defaults to nullptr / no-position,
creating an instruction that hasn't been inserted yet.
2024-02-29 13:00:29 +00:00
Yingwei Zheng
312cb34da6
[Reassociate] Preserve NUW flags after expr tree rewriting (#72360)
Alive2: https://alive2.llvm.org/ce/z/38KiC_
2023-12-09 16:45:48 +08:00
Craig Topper
533a0856bf Recommit "[Reassociate] Use disjoint flag to convert Or to Add. (#72772)"
Original message:
We still have to keep the noCommonBitsSet call to handle multiple reassociations in one pass. We'll lose the flag on the first reassociation.
2023-12-06 14:16:56 -08:00
Craig Topper
92fccea2e5 Revert "[Reassociate] Use disjoint flag to convert Or to Add. (#72772)"
This reverts commit 78964457cf1bafe57a54629fafbd081452a9e528.

Looks like I didn't rebase this correctly before commit
2023-12-06 13:50:21 -08:00
Craig Topper
78964457cf
[Reassociate] Use disjoint flag to convert Or to Add. (#72772)
We still have to keep the noCommonBitsSet call to handle multiple reassociations in one pass. We'll lose the flag on the first reassociation.
2023-12-06 13:48:15 -08:00
Joshua Cao
72ffaa9156
[IR][TRE] Support associative intrinsics (#74226)
There is support for intrinsics in Instruction::isCommunative, but there
is no equivalent implementation for isAssociative. This patch builds
support for associative intrinsics with TRE as an application. TRE can
now have associative intrinsics as an accumulator. For example:
```
struct Node {
  Node *next;
  unsigned val;
}

unsigned maxval(struct Node *n) {
  if (!n) return 0;
  return std::max(n->val, maxval(n->next));
}
```
Can be transformed into:
```
unsigned maxval(struct Node *n) {
  struct Node *head = n;
  unsigned max = 0; // Identity of unsigned std::max
  while (true) {
    if (!head) return max;
    max = std::max(max, head->val);
    head = head->next;
  }
  return max;
}
```
This example results in about 5x speedup in local runs.

We conservatively only consider min/max and as associative for this
patch to limit testing scope. There are probably other intrinsics that
could be considered associative. There are a few consumers of
isAssociative() that could be impacted. Testing has only required to
Reassociate pass be updated.
2023-12-04 22:35:59 -08:00
Jeremy Morse
2425e2940e
[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149)
Part of the "RemoveDIs" project to remove debug intrinsics requires
passing block-positions around in iterators rather than as instruction
pointers, allowing some debug-info to reside in BasicBlock::iterator.
This means getInsertionPointAfterDef has to return an iterator, and as
it can return no-instruction that means returning an optional iterator.

This patch changes the signature for getInsertionPtAfterDef and then
patches up the various places that use it to handle the different type.
This would overall be an NFC patch, however in
InstCombinerImpl::freezeOtherUses I've started skipping any debug
intrinsics at the returned insert-position. This should not have any
_meaningful_ effect on the compiler output: at worst it means variable
assignments that are skipped will now cover the freeze instruction and
anything inserted before it, which should be inconsequential.

Sadly: this makes the function signature ugly. This is probably the
ugliest piece of fallout for the "RemoveDIs" work, but it serves the
overall purpose of improving compile times and not allowing `-g` to
affect compiler output, so should be worthwhile in the end.
2023-11-30 12:19:57 +00:00
Nikita Popov
80fa5a6377 [ValueTracking] Use SimplifyQuery in haveNoCommonBitsSet() (NFC)
Pass SimplifyQuery instead of unpacked list of arguments.
2023-10-10 11:39:59 +02:00
David Green
db32d11a38 [Reassociate] Keep flags for more unchanged operations
Reassociation destroys nsw/nuw flags from BinOps that are changed. But if the
expression at the end of a tree that was altered, but didn't change itself,
the flags do not need to be removed. For example, if %a, %b and %c are
reassociated in
  %x = add nsw i32 %a, %c
  %y = add nsw i32 %x, %b
  %z = add nsw i32 %y, %d
The value of %y and so add %y %d remains the same, and %z needn't drop the nsw
flags.
https://alive2.llvm.org/ce/z/_juAiV

Differential Revision: https://reviews.llvm.org/D154289
2023-07-03 10:05:40 +01:00
Quentin Colombet
a4e88cba18 [Reassociation] Only form CSE expressions for local operands
# TL;DR #
This patch constrains how much freedom the heuristic that tries to from CSE
expressions has. The added constrain is that the CSE-able expressions must be
within the same basic block as the expressions they get moved before.

 # Details #
The reassociation pass currently tweaks the rewrite of the final expression
towards surfacing pairs of operands that would be CSE-able.

This heuristic applies after the regular ordering of the expression.
The regular ordering uses the program structure to choose in which order each
subexpression is materialized. That order follows the topological order.

Now, to expose more CSE opportunities, this heurisitc effectively bypasses the
previous ordering normally defined by the program and pushes up sub-expressions
that are arbitrary deep in the CFG.
E.g., let's say the program order (top to bottom) gives `((a*b)*c)*d)*e` and
`b*e` appears the most in the program. The expression will be reordered in
`(((b*e)*a)*c)*d`

This reordering implies that all the sub expressions (in this example `xx*a`,
then `yy*c`, etc.) will need to appear after the CSE-able expression.

This may over-constrain where the (sub) expressions may live and in particular
it may create loop-dependent expressions.

This patch only allows to move expressions up the expression chain when the
related values are definied in the same basic block as the ones they
"push-down".

This constrain is far for being perfect but at least it avoids accidentally
creating loop dependent variables.

If we really want to expose CSE-able expressions in a proper way, we would need
a profitability metric and also make the decision globally as opposed to one
chain at a time.

I've put the new constrain behind an option to make comparing the old and new
versions easy. However, I believe that even if we find cases where the old
version performs better it is probably by accident. What I am aiming for with
this change is more predictability, then we can improve if need be.

This fixes www.llvm.org/PR61458

Differential Revision: https://reviews.llvm.org/D147457
2023-06-26 11:58:03 +02:00
Kazu Hirata
7b014a0732 [Scalar] Use range-based for loops (NFC) 2023-04-16 09:05:20 -07:00
Kazu Hirata
c8f9555c4d [Transforms] Use *{Set,Map}::contains (NFC) 2023-03-14 00:24:30 -07:00
Sanjay Patel
4ca25c66d4 [Reassociate] prevent partial undef negation replacement
As shown in the examples in issue #57683, we allow matching
vectors with poison (undef) in this transform (and possibly more),
but we can't then use the partially defined value as a replacement
value in other expressions blindly.

This seems to be avoided in simpler examples of reassociation,
and other passes should be able to clean up the redundant op
seen in these tests.
2022-09-12 12:28:34 -04:00
Nikita Popov
f42d92611d [Reassociate] Avoid ConstantExpr::getFNeg() (NFCI)
Use ConstantFoldUnaryOpOperand() instead. Also make the code below
robust against non-instruction users, just in case it doesn't fold.
2022-09-07 10:48:08 +02:00
Nikita Popov
8f3fd26b74 [Reassociate] Use getInsertionPointerAfterDef()
This simplifies the code and fixes handling for the callbr case,
where the instruction needs to be inserted in the normal
destination, rather than after the terminator.

Originally part of D129660.
2022-08-31 11:10:24 +02:00
Kazu Hirata
b18ff9c461 [Transform] Use range-based for loops (NFC) 2022-08-27 23:54:32 -07:00
Kazu Hirata
e20d210eef [llvm] Qualify auto (NFC)
Identified with readability-qualified-auto.
2022-08-07 23:55:27 -07:00
Warren Ristow
3bbd380a5b [Reassociate][NFC] Use an appropriate dyn_cast for BinaryOperator
In D129523, it was noted that there is are some questionable naked casts
from Instruction to BinaryOperator, which could be addressed by doing a
dyn_cast directly to BinaryOperator, avoiding the need for the later cast.
This cleans up that casting.

Reviewed By: nikic, spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D130448
2022-07-25 10:24:43 -07:00
Warren Ristow
3089b411a4 [Reassociate][NFC] Consistent checking for FastMathFlags suitability
In D129523, it was noted that the approach to check whether a value can
have FastMathFlags was done in different ways, and they should be made
consistent.  This patch makes minor changes to fix that.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D130408
2022-07-24 17:44:30 -07:00
Warren Ristow
c650793049 [Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'
Compiling with '-ffast-math' tuns on all the FastMathFlags (FMF), as
expected, and that enables FP reassociation. Only the two FMF flags
'reassoc' and 'nsz' are technically required to perform reassociation,
but disabling other unrelated FMF bits is needlessly suppressing the
optimization.

This patch fixes that needless suppression, and makes appropriate
adjustments to test-cases, fixing some outstanding TODOs in the process.

Fixes: #56483

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D129523
2022-07-15 11:44:35 -07:00
Warren Ristow
230c8c56f2 [Reassociate] Cleanup minor missed optimizations
In analyzing issue #56483, it was noticed that running `opt` with
`-reassociate` was missing some minor optimizations. For example,
there were cases where the running `opt` on IR with floating-point
instructions that have the `fast` flags applied, sometimes resulted in
less efficient code than the input IR (things like dead instructions
left behind, and missed reassociations). These were sometimes noted
in the test-files with TODOs, to investigate further. This commit
fixes some of these problems, removing some TODOs in the process.

FTR, I refer to these as "minor" missed optimizations, because when
running a full clang/llvm compilation, these inefficiencies are not
happening, as other passes clean that residue up. Regardless, having
cleaner IR produced by `opt`, makes assessing the quality of fixes done
in `opt` easier.
2022-07-14 08:21:04 -07:00
Nikita Popov
93cbdaef04 [Reassociate] Avoid ConstantExpr::get()
Use ConstantFoldBinaryOpOperands() instead, to handle the case
where not all binary ops have a constant expression variant.

This is a bit awkward because we only want to pop the element from
Ops once we're sure that it has folded.
2022-07-04 15:17:22 +02:00
Nuno Lopes
53dc0f1078 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-07-03 14:34:03 +01:00
Philip Reames
ee7324b898 Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc] 2022-03-21 10:01:40 -07:00
serge-sans-paille
59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Kazu Hirata
fd7d40640d [llvm] Use range-based for loops (NFC) 2021-11-28 18:14:49 -08:00
Zarko Todorovski
0d3add216f [llvm][NFC] Inclusive language: Reword replace uses of sanity in llvm/lib/Transform comments and asserts
Reworded some comments and asserts to avoid usage of `sanity check/test`

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D114372
2021-11-23 13:22:55 -05:00
Jay Foad
a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Chris Lattner
735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Arthur Eubanks
6b9524a05b [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Sanjay Patel
6fd91be354 [Reassociate] allow or->add with shl operands
As discussed in:
https://llvm.org/PR49055

We invert instcombine's add->or transform here
because it makes it easier to identify factorization
transforms like the mul in the motivating test.

This extends the logic added with:
https://reviews.llvm.org/rG70472f3
https://reviews.llvm.org/rG93f3d7f

(I intentionally kept the formatting fix in this patch
to provide more context about the calling logic.)
2021-02-07 09:45:19 -05:00
Kazu Hirata
1238378f18 [llvm] Use pop_back_val (NFC) 2021-01-23 10:56:33 -08:00
Kazu Hirata
5d2529f28f [Scalar] Construct SmallVector with iterator ranges (NFC) 2020-12-28 19:55:18 -08:00
Roman Lebedev
7bf89c2174
[NFC][Reassociate] Delay checking isLoadCombineCandidate() until after ShouldConvertOrWithNoCommonBitsToAdd() but before haveNoCommonBitsSet()
This appears to improve -O3 compile-time performance somewhat:
https://llvm-compile-time-tracker.com/compare.php?from=87369c626114ae17f4c637635c119e6de0856a9a&to=c04b8271e1609b0dfb20609b40844b0c4324517e&stat=instructions
It doesn't look like delaying it until after haveNoCommonBitsSet() is better:
https://llvm-compile-time-tracker.com/compare.php?from=c04b8271e1609b0dfb20609b40844b0c4324517e&to=b2943d450eaf41b5f76d2dc7350f0a279f64cd99&stat=instructions
2020-11-18 23:57:12 +03:00
Roman Lebedev
34ff90ad5d
[Reassociate] Don't convert add-like-or's into add's if they appear to be part of load-combining idiom
As Wei Mi is reporting in post-commit review
  https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20201116/853479.html
teaching -reassociate about add-like-or's (70472f3) results in breaking apart
load widening patterns, and reassociating them.

For now, simply exclude any such `or` that appears to be a root of
load widening idiom from the or->add transformation.

Note that the heuristic is greedy, it doesn't ensure that loads
can *actually* be widened into a single load.
2020-11-18 17:55:02 +03:00
Roman Lebedev
93f3d7f7b3
[Reassociate] Guard add-like or conversion into an add with profitability check
This is slightly better compile-time wise,
since we avoid potentially-costly knownbits analysis that will
ultimately not allow us to actually do anything with said `add`.
2020-11-04 16:10:34 +03:00