667 Commits

Author SHA1 Message Date
Jeremy Morse
4b64138ba4
[DebugInfo][RemoveDIs] Switch some insertion routines to use iterators (#75330)
As part of RemoveDIs, we need instruction insertion to be done with
iterators rather than instruction pointers, so that we can communicate
some debug-info facts about the position. This patch is an entirely
mechanical replacement of Instruction * with BasicBlock::iterator, plus
using insertBefore to insert some instructions because we don't have
iterator-taking constructors yet.

Sadly it's not NFC because it causes dbg.value intrinsics / their
DPValue equivalents to shift location.
2023-12-13 14:04:35 +00:00
Jeremy Morse
5ba5211a47
[DebugInfo][RemoveDIs] Have LICM insert at iterator positions (#73671)
Because we're storing some extra debug-info information in the iterator
class, we need to insert new LICM-created stores using such iterators.
Switch LICM to storing iterators instead of pointers when it promotes
variables in loops, add a test for the desired behaviour, and enable
RemoveDIs instrumentation on a variety of other LICM tests for good
measure.

(This would appear to be the only pass in LLVM that needs to store
iterators on the heap).
2023-11-30 13:00:26 +00:00
Nikita Popov
4b3ea337ad [ValueTracking] Convert isKnownNonNegative() to use SimplifyQuery (NFC) 2023-11-29 10:52:52 +01:00
Nikita Popov
6b8ed78719 [IR] Add writable attribute
This adds a writable attribute, which in conjunction with
dereferenceable(N) states that a spurious store of N bytes is
introduced on function entry. This implies that this many bytes
are writable without trapping or introducing data races. See
https://llvm.org/docs/Atomics.html#optimization-outside-atomic for
why the second point is important.

This attribute can be added to sret arguments. I believe Rust will
also be able to use it for by-value (moved) arguments. Rust likely
won't be able to use it for &mut arguments (tree borrows does not
appear to allow spurious stores).

In this patch the new attribute is only used by LICM scalar promotion.
However, the actual motivation for this is to fix a correctness issue
in call slot optimization, which needs this attribute to avoid
optimization regressions.

Followup to the discussion on D157499.

Differential Revision: https://reviews.llvm.org/D158081
2023-11-01 10:46:31 +01:00
Fangrui Song
2d854dd3e7 Move global namespace cl::opt inside llvm:: or internalize them 2023-10-10 19:58:03 -07:00
Nikita Popov
1b3cc4e715 [ValueTracking] Use SimplifyQuery for the overflow APIs (NFC)
Accept a SimplifyQuery instead of an unpacked list of arguments.
2023-10-10 10:57:49 +02:00
Björn Pettersson
a0ce4384a6
[LICM] Simplify isLoadInvariantInLoop given opaque pointers (#65597)
Since we no longer support typed pointers in LLVM IR, the PtrASXTy
in isLoadInvariantInLoop was set to be equal to Addr->getType() (an
opaque ptr in the same address space). That made the loop looking
through bitcasts redundant.
2023-09-14 16:53:34 +02:00
Jeremy Morse
6942c64e81 [NFC][RemoveDIs] Prefer iterator-insertion over instructions
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.

At this stage, this is just changing the spelling of a few operations,
which will eventually become significant once the debug-info bearing
iterator is used.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152537
2023-09-11 11:48:45 +01:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Craig Topper
dc02070d69 [LICM] Check hasNoSignedZeros in hoistFPAssociation.
This matches the check done by the Reassociate pass that we're
trying to reverse.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D158042
2023-08-23 14:05:34 -07:00
Nikita Popov
3670ec2897 [LICM][AA] Move isWritableObject() to AA (NFC)
Move this helper from LICM to AA, so it can be reused.
2023-08-16 14:43:01 +02:00
Bjorn Pettersson
4ce7c4a92a [llvm] Drop some typed pointer handling/bitcasts
Differential Revision: https://reviews.llvm.org/D157016
2023-08-03 22:54:33 +02:00
Paul Osmialowski
8698d56d99 [Transforms][LICM] Add the ability to undo unprofitable reassociation
Consider the following piece of code:

```
void innermost_loop(int i, double d1, double d2, double delta, int n, double cells[n])
{
  int j;
  const double d1d = d1 * delta;
  const double d2d = d2 * delta;

  for (j = 0; j <= i; j++)
    cells[j] = d1d * cells[j + 1] + d2d * cells[j];
}
```

When compiling at -Ofast level, after the "Reassociate expressions"
pass, this code is transformed into an equivalent of:

```
  int j;

  for (j = 0; j <= i; j++)
    cells[j] = (d1 * cells[j + 1] + d2 * cells[j]) * delta;
```

Effectively, the computation of those loop invariants is no longer
done before the loop; instead, we have one extra multiplication on
each loop iteration. Sadly, this results in a significant performance
hit.

Similarly, user code that is deliberately written in the reassociated
form cannot have those invariants hoisted either.

This patch solves the issue by adding to the LICM pass the ability to
undo such reassociation. Note that to perform this transformation the
pass requires the same conditions as the "Reassociate expressions"
pass: the binary operators involved must allow reassociation (e.g. by
specifying the `fast` attribute) and they must each have a single use.

Some parts of this patch were suggested by Nikita Popov.

Reviewed By: huntergr, nikic, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D152281
2023-08-01 16:42:01 +01:00
Carlos Alberto Enciso
c0a986a60f [LICM] Sunk instructions with invalid source location.
Building the given test case with 'clang -O2 -g' the call to
'getInOrder' is sunk out of the loop by LICM, but the source
location is not dropped.

Reviewed By: aprantl, fdeazeve

Differential Revision: https://reviews.llvm.org/D152691
2023-06-16 06:25:27 +01:00
Kazu Hirata
c7cf942de3 [Scalar] Remove unused function createLICMPass
The last use was removed by:

  commit d623b2f95fd559901f008a0588dddd0949a8db01
  Author: Arthur Eubanks <aeubanks@google.com>
  Date:   Fri Mar 10 17:24:19 2023 -0800
2023-06-10 21:52:50 -07:00
Chuanqi Xu
84c033d9ba [LICM] [Coroutines] Don't hoist threadlocals within presplit coroutines
Close https://github.com/llvm/llvm-project/issues/63022

This is a follow-up to https://reviews.llvm.org/D135550, which is
discussed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
As I see it, we could fix the issue fundamentally once we introduce a
new memory kind for the thread id, but I am not sure that we can do
so in time.

Besides that, I think correctness matters most, so it should be fine
to land this, given that it is harmless.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151774
2023-06-07 10:25:47 +08:00
Nikita Popov
143ed21b26 Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)"
This reverts commit 5362a0d859d8e96b3f7c0437b7866e17a818a4f7.

In preparation for reverting a dependent revision.
2023-06-05 16:45:38 +02:00
Max Kazantsev
dd0cf23e4a [LICM] Reassociate & hoist sub expressions
LICM can reassociate mixed variant/invariant comparison/arithmetic operations
and hoist the invariant parts out of the loop if it can prove that they can be
computed without overflow. Motivating example here:
```
  INV1 - VAR < INV2
```
can be turned into
```
  VAR > INV1 - INV2
```
if we can prove no-signed-overflow here. Then `INV1 - INV2` can be computed
out of loop, so we save one arithmetic operation in-loop.
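The equivalence above can be checked exhaustively on a small integer type. The following is a minimal sketch, not LICM's implementation: it models 8-bit signed arithmetic by hand (the helpers `wrap8`, `subFits`, `subReassociationHolds` and `wrappingBreaksEquivalence` are hypothetical names chosen for illustration). It checks that `INV1 - VAR < INV2` agrees with `VAR > INV1 - INV2` whenever neither subtraction overflows, and shows one input where the two disagree once a subtraction wraps.

```cpp
#include <cassert>
#include <cstdint>

// Model 8-bit two's-complement wraparound of an arbitrary int (hypothetical helper).
int wrap8(int x) {
  unsigned u = static_cast<unsigned>(x) & 0xFFu;
  return u >= 128u ? static_cast<int>(u) - 256 : static_cast<int>(u);
}

// True if a - b is representable in 8 signed bits, i.e. no signed overflow.
bool subFits(int a, int b) {
  int r = a - b;
  return r >= INT8_MIN && r <= INT8_MAX;
}

// Compare both forms for every 8-bit triple where neither subtraction overflows.
bool subReassociationHolds() {
  for (int inv1 = INT8_MIN; inv1 <= INT8_MAX; ++inv1)
    for (int inv2 = INT8_MIN; inv2 <= INT8_MAX; ++inv2)
      for (int var = INT8_MIN; var <= INT8_MAX; ++var) {
        if (!subFits(inv1, var) || !subFits(inv1, inv2))
          continue;  // the no-overflow precondition does not hold, skip
        bool original = wrap8(inv1 - var) < inv2;
        bool reassociated = var > wrap8(inv1 - inv2);
        if (original != reassociated)
          return false;
      }
  return true;
}

// One input where the first subtraction wraps and the two forms disagree.
bool wrappingBreaksEquivalence() {
  const int inv1 = -128, var = 1, inv2 = 0;
  assert(!subFits(inv1, var));                   // -128 - 1 overflows 8 bits
  bool original = wrap8(inv1 - var) < inv2;      // -129 wraps to 127: 127 < 0
  bool reassociated = var > wrap8(inv1 - inv2);  // 1 > -128
  return original != reassociated;
}
```

The sketch uses an 8-bit model only so that the search space stays small; the same argument applies to wider types.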

Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D148001
2023-05-29 12:52:43 +07:00
Max Kazantsev
0d95b20b63 [LICM] Reassociate & hoist add expressions
This patch allows LICM to reassociate and hoist the following expressions:
```
loop:
  %sum = add nsw %iv, %C1
  %cmp = icmp <signed pred> %sum, C2
```
where `C1` and `C2` are loop invariants. The reassociated version looks like
```
preheader:
  %inv_sum = C2 - C1
...
loop:
  %cmp = icmp <signed pred> %iv, %inv_sum
```
In order to prove legality, we need both the initial addition and the newly created
subtraction to happen without overflow.
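As a source-level illustration of the rewrite above, here is a minimal sketch assuming a signed less-than predicate and inputs for which neither `iv + c1` nor `c2 - c1` overflows. The function names `countOriginal` and `countReassociated` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <climits>

// Original loop shape: the add and the compare both run on every iteration.
// Assumes iv + c1 does not overflow (the `nsw` on the add).
int countOriginal(int n, int c1, int c2) {
  int count = 0;
  for (int iv = 0; iv < n; ++iv) {
    int sum = iv + c1;  // %sum = add nsw %iv, %C1
    if (sum < c2)       // %cmp = icmp slt %sum, C2
      ++count;
  }
  return count;
}

// Reassociated shape: C2 - C1 is computed once, before the loop.
int countReassociated(int n, int c1, int c2) {
  // Precondition of this sketch: the new subtraction must not overflow.
  assert(c1 >= 0 ? c2 >= INT_MIN + c1 : c2 <= INT_MAX + c1);
  const int invSum = c2 - c1;  // %inv_sum = C2 - C1 (preheader)
  int count = 0;
  for (int iv = 0; iv < n; ++iv) {
    if (iv < invSum)  // %cmp = icmp slt %iv, %inv_sum
      ++count;
  }
  return count;
}
```

For example, with `n = 10`, `c1 = 3` and `c2 = 8`, both functions count the five values of `iv` from 0 to 4.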

Differential Revision: https://reviews.llvm.org/D149132
Reviewed By: skatkov
2023-05-22 13:22:22 +07:00
Christian Ulmann
794b58b467 [IR] Drop const in DILocation::getMergedLocation
This commit removes constness from DILocation::getMergedLocation and
fixes all its users accordingly.

Having constness on the parameters forced the return type to be const
as well, which forces the use of `const_cast` when the location needs
to be used in metadata nodes.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D149942
2023-05-15 07:21:43 +00:00
Nikita Popov
5362a0d859 [LCSSA] Remove unused ScalarEvolution argument (NFC)
After D149435, LCSSA formation no longer needs access to
ScalarEvolution, so remove the argument from the utilities.
2023-05-02 12:17:05 +02:00
Nikita Popov
0659000ff7 [LICM] Don't duplicate instructions just because they're free
D37076 makes LICM duplicate instructions into exit blocks if the
instruction is free. For GEPs, the motivation appears to be that
this allows the GEP to be folded into addressing modes, while
non-foldable users outside the loop might prevent this. TBH I don't
think LICM is the place to do this (why doesn't CGP apply this
heuristic itself?) but at least I understand the motivation.

However, the transform is also applied to all other "free"
instructions, which are just that (removed during lowering and not
"folded" in some way). For such instructions, this transform seems
somewhere between useless, counter-productive (undoing CSE/GVN) and
actively incorrect. For example, this transform can duplicate freeze
instructions, which is illegal.

This patch limits the transform to just foldable GEPs, though we
might want to drop it from LICM entirely as a followup.

This is a small compile-time improvement, because querying TTI cost
model for every single instruction is expensive.

Differential Revision: https://reviews.llvm.org/D149136
2023-04-28 14:31:23 +02:00
Nikita Popov
43436993f4 [LICM] Don't try to constant fold instructions
This was introduced in 030f02021b6359ec5641622cf1aa63d873ecf55a as
an alleged compile-time optimization. In reality, trying to constant
fold instructions is more expensive than just hoisting them. In a
standard pipeline, LICM tends to run either after a run of
LoopInstSimplify or InstCombine, so LICM doesn't really see constant
foldable instructions in the first place, and the attempted fold
is futile.

This makes for a very minor compile-time improvement.

Differential Revision: https://reviews.llvm.org/D149134
2023-04-26 09:26:47 +02:00
Nikita Popov
a1ddfb60da [LICM] Only forget loop/block dispositions
As we are moving the instruction without changing its value, it
is sufficient to only invalidate the loop/block dispositions.
This is the same as what we do in LoopSink.
2023-04-25 09:58:31 +02:00
Nikita Popov
ebd6b5dc64 [LICM] Minor optimization (NFC)
Simplify the match in hoistMinMax and only fetch the preheader
once.
2023-04-24 17:05:58 +02:00
Nikita Popov
53500e333d Reapply [SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating
This exposed another miscompile in GVN, which was fixed by
20e9b31f88149a1d5ef78c0be50051e345098e41.

-----

After D141386, violation of nonnull, range and align metadata
results in poison rather than immediate undefined behavior,
which means that these are now safe to retain when speculating.
We only need to remove UB-implying metadata like noundef.

This is done by adding a dropUBImplyingAttrsAndMetadata() helper,
which lists the metadata which is known safe to retain on speculation.

Differential Revision: https://reviews.llvm.org/D146629
2023-04-20 14:17:15 +02:00
Krasimir Georgiev
bf7f6b4436 Revert "Reapply [SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating"
This reverts commit 6f7e5c0f1ac6cc3349a2e1479ac4208465b272c6.

Seems to expose a miscompile in rust, possibly exposing a bug in LLVM
somewhere. Investigation thread over at:
https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/LLVM.20D146629.20breakage
2023-04-19 08:28:48 +00:00
Nikita Popov
6f7e5c0f1a Reapply [SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating
This exposed a miscompile in GVN, which was fixed by D148129.

-----

After D141386, violation of nonnull, range and align metadata
results in poison rather than immediate undefined behavior,
which means that these are now safe to retain when speculating.
We only need to remove UB-implying metadata like noundef.

This is done by adding a dropUBImplyingAttrsAndMetadata() helper,
which lists the metadata which is known safe to retain on speculation.

Differential Revision: https://reviews.llvm.org/D146629
2023-04-17 14:15:14 +02:00
Max Kazantsev
a42f589197 [LICM][NFC] Unify arithmetic statistics collection
Avoid divergence between different kinds of hoisting with reassociation.
Make them all collect general stat NumHoisted and also specific stats
for each particular transform.
2023-04-11 17:20:02 +07:00
Max Kazantsev
7b8692a55c [LICM][NFC] Do not forward declaration of hoistMinMax
They are all now handled by hoistArithmetics, so only that function
needs a forward declaration.
2023-04-11 17:06:20 +07:00
Nikita Popov
243df834c6 [LICM] Fix assert failure in no-allowspeculation mode
In this case the source GEP might not be hoisted even though it
has invariant operands. For now just bail out, but we might need
additional checks for AllowSpeculation in these special-case
reassociation folds.
2023-04-11 11:55:54 +02:00
Nikita Popov
b8917ac62a [LICM] Reassociate GEPs to allow hoisting
Reassociate gep (gep ptr, idx1), idx2 to gep (gep ptr, idx2), idx1
if this would make the inner GEP loop invariant and thus hoistable.
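The effect of this reassociation can be pictured with plain pointer arithmetic. This is a minimal sketch rather than the pass itself: `sumNested` and `sumReassociated` are hypothetical names, `iv` plays the role of the loop-variant `idx1`, and `inv` plays the role of the loop-invariant `idx2`. It also assumes that every intermediate pointer stays inside the array, which C++ requires.

```cpp
#include <cassert>
#include <cstddef>

// Before: the inner address depends on iv, so nothing can be hoisted.
long sumNested(const int *base, std::size_t n, std::size_t inv) {
  long total = 0;
  for (std::size_t iv = 0; iv < n; ++iv) {
    const int *inner = base + iv;  // gep ptr, idx1 (loop-variant)
    total += *(inner + inv);       // gep inner, idx2 (invariant offset)
  }
  return total;
}

// After: the inner address uses only invariant operands and moves out of the loop.
long sumReassociated(const int *base, std::size_t n, std::size_t inv) {
  assert(base != nullptr);
  const int *hoisted = base + inv;  // gep ptr, idx2 (now loop-invariant)
  long total = 0;
  for (std::size_t iv = 0; iv < n; ++iv)
    total += *(hoisted + iv);       // gep hoisted, idx1
  return total;
}
```

Both functions read the same elements, `base[inv]` through `base[inv + n - 1]`; only the order in which the two offsets are applied differs.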

This is intended to replace an InstCombine fold that does this (in
04f61fb73d/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (L2006)).
The problem with the InstCombine fold is that LoopInfo is an optional
dependency, so it is not performed reliably.

Differential Revision: https://reviews.llvm.org/D146813
2023-04-11 10:34:04 +02:00
Max Kazantsev
cd24665f13 [NFC] Fix typo in statistic description 2023-04-11 14:18:53 +07:00
Max Kazantsev
e5dc4dbe87 [LICM][NFC] Restructure code to have one entry point for reassociation-based hoistings
We already hoist min/max functions and want to do more of this kind. Some
refactoring to make growth points for it.
2023-04-11 14:18:53 +07:00
Nikita Popov
7c78cb4b1f Revert "[SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating"
This reverts commit 78b1fbc63f78660ef10e3ccf0e527c667a563bc8.

This causes or exposes miscompiles in Rust, revert until they
have been investigated.
2023-04-05 17:05:39 +02:00
Nikita Popov
7553bad1ac [LICM] Don't require optimized uses
LICM currently requests optimized use MSSA form. This is wasteful,
because LICM doesn't actually care about most uses, only those of
invariant pointers in loops. Everything else doesn't need to be
optimized.

LICM already uses the clobber walker in most places. This patch
adjusts one place that was using getDefiningAccess() to use it as
well, so we no longer have a dependence on pre-optimized uses.

This change is not NFC in that the fallback on the defining access
when there are too many clobber calls may now fall back to an
unoptimized use. In practice, I've not seen any problems with this
though. If desired, we could also increase licm-mssa-optimization-cap
to a higher value (increasing this from 100 to 200 has no impact on
average compile-time -- but also doesn't appear to have any impact
on LICM quality either).

This makes for a 0.9% geomean compile-time improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D147437
2023-04-05 11:20:25 +02:00
Nikita Popov
78b1fbc63f [SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating
After D141386, violation of nonnull, range and align metadata
results in poison rather than immediate undefined behavior,
which means that these are now safe to retain when speculating.
We only need to remove UB-implying metadata like noundef.

This is done by adding a dropUBImplyingAttrsAndMetadata() helper,
which lists the metadata which is known safe to retain on speculation.

Differential Revision: https://reviews.llvm.org/D146629
2023-04-04 10:03:45 +02:00
Nikita Popov
b58a697f3e [LICM] Don't promote store to global even in single-thread mode
Even if there are no thread-safety concerns, we should not promote
(not guaranteed-to-execute) stores to globals without further
analysis: While the global may be writable, we may not have
provenance to perform the write. The @promote_global_noalias test
case illustrates a miscompile in the presence of a noalias pointer
to the global.

Worth noting that the load-only promotion may also not be well-defined
depending on precise semantics (we don't specify whether load
violating noalias is poison or UB -- though I believe the general
inclination is to make it poison, and only stores UB), but that's
a more general issue.

This is inspired by https://github.com/llvm/llvm-project/issues/60860,
which is a related issue with TBAA metadata.

Differential Revision: https://reviews.llvm.org/D146233
2023-04-03 14:20:06 +02:00
Nikita Popov
0b9259c00d [LICM] Extract helper for getClobberingMemoryAccess()
Extract a helper that does the clobber walk while taking into
account the cap. Slightly reflow things to check this first in
the store case, before we start walking over all accesses in the
loop.
2023-04-03 12:02:55 +02:00
Nikita Popov
172094cad3 [LICM] Require MSSA in SinkAndHoistLICMFlags (NFC)
Nowadays MSSA is required for LICM/LoopSink, so drop the checks
for whether it's available or not.
2023-03-24 16:09:20 +01:00
Nikita Popov
a5788836b9 [IR] Rename dropUndefImplying to dropUBImplying (NFC)
Clarify that this is only about immediate undefined behavior,
not about undef or poison.
2023-03-22 11:16:22 +01:00
Max Kazantsev
aa485384d7 [LICM] Do not hoist widenable conditions
Although hoisting them is legal, it is not profitable: it may prevent
Loop Guard Widening from happening. Because of the bug described at
https://github.com/llvm/llvm-project/issues/60234, guard widening is currently
only possible when the condition we want to add is available at the point of the
widenable_condition() call of the dominating guard. This means that if all such
calls are hoisted out of the loop, and the loop conditions depend on loop-variant
values, we cannot widen. Hoisting is otherwise not helpful, because it does not
introduce any optimization opportunities.

Differential Revision: https://reviews.llvm.org/D146274
Reviewed By: apilipenko
2023-03-20 12:02:22 +07:00
Max Kazantsev
f91aaf1b0c Return "[LICM] Support logical AND/OR when hoisting min/max"
Underlying bug (creation of umin for pointers) is now fixed.

Differential Revision: https://reviews.llvm.org/D145771
2023-03-13 14:34:43 +07:00
Max Kazantsev
fc128e126b [LICM] Do not hoist min/max for pointer types
umin and similar intrinsics are not defined for them.
2023-03-13 14:12:21 +07:00
Vitaly Buka
f902ead7cb Revert "[LICM] Support logical AND/OR when hoisting min/max"
Breaks https://lab.llvm.org/buildbot/#/builders/37/builds/20720

This reverts commit 9e83d13c9f77e300ebb7b94a1400de3c2d47b3d5.
2023-03-10 09:52:57 -08:00
Nikita Popov
a7322a2171 [LICM] Delay fetching of preheader (NFC)
Only fetch preheader once we want to actually hoist. It turns out
that calculating the preheader is expensive enough to affect
overall compile-time if you do it for every single instruction.

Addresses the compile-time regression from D143726.
2023-03-10 16:16:48 +01:00
Max Kazantsev
9e83d13c9f [LICM] Support logical AND/OR when hoisting min/max
We can handle logical AND/OR in the same way as arithmetic AND/OR; this only
requires freezing `RHS2`, for which we may introduce a new use that did not
exist dynamically before.

Differential Revision: https://reviews.llvm.org/D145771
Reviewed By: nikic
2023-03-10 18:07:15 +07:00
Max Kazantsev
6b03ce374e [LICM] Simplify (X < A && X < B) into (X < MIN(A, B)) if MIN(A, B) is loop-invariant
We don't do this transform in InstCombine in the general case for arbitrary values, because the cost of
an AND and two ICMPs isn't higher than that of a MIN and an ICMP. However, LICM also knows
about the loop structure. This transform becomes profitable if `A` and `B` are loop-invariant and
`X` is not: by doing this, we can compute the min outside the loop.
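A source-level picture of the transform, as a minimal sketch with hypothetical function names (`countBelowBoth`, `countBelowMin`) and signed comparisons assumed: the two-compare form does two comparisons and an AND per iteration, while the rewritten form computes the minimum once before the loop and compares once per iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Before: two compares and an AND for every x, with a and b loop-invariant.
int countBelowBoth(const std::vector<int> &xs, int a, int b) {
  int count = 0;
  for (int x : xs)
    if (x < a && x < b)
      ++count;
  return count;
}

// After: min(a, b) is computed once outside the loop, one compare per x.
int countBelowMin(const std::vector<int> &xs, int a, int b) {
  const int bound = std::min(a, b);  // loop-invariant, computed up front
  assert(bound <= a && bound <= b);  // sanity check on the hoisted bound
  int count = 0;
  for (int x : xs)
    if (x < bound)
      ++count;
  return count;
}
```

For `xs = {1, 5, 7, 3}`, `a = 6` and `b = 4`, both functions count the two elements 1 and 3.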

Differential Revision: https://reviews.llvm.org/D143726
Reviewed By: nikic
2023-03-10 17:36:52 +07:00
Max Kazantsev
ff687c47b3 [LICM][NFC] Don't preserve DT and loop analyzes separately
This is already implied by getLoopPassPreservedAnalyses.

Differential Revision: https://reviews.llvm.org/D144860
Reviewed By: nikic, skatkov
2023-02-28 17:02:51 +07:00
William S. Moses
58eac856cc [LICM] Ensure LICM can hoist invariant.group
Invariant.group's are not sufficiently handled by LICM. Specifically,
if a given invariant.group loaded pointer is not overwritten between
the start of a loop, and its use in the load, it can be hoisted.
The invariant.group (on an already invariant pointer operand) ensures
the result is the same. If it is not overwritten between the start
of the loop and the load, it is therefore legal to hoist.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D144053
2023-02-26 12:41:41 -05:00