84 Commits

Author SHA1 Message Date
Justin Fargnoli
7c8a13ab79
[LoopPeel] change peelLoop's return type from bool to void (#177488) 2026-01-23 10:49:03 -08:00
Alireza Torabian
599c2731b3
[LoopFusion] Forget cached SCEV values after the fusion (#177455)
This patch fixes the issue #115279. After the fusion, some of the cached
SCEV values such as the induction variable may not be valid anymore and
need to be forgotten.
2026-01-22 19:50:21 -05:00
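The stale-cache problem that 599c2731b3 describes can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTripCountCache` and its methods are hypothetical names and are not LLVM's ScalarEvolution API. It shows why a cached result must be forgotten once the transformation has changed the underlying loop.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical toy analysis that caches one value per loop name.
// It stands in for a caching analysis; it is NOT ScalarEvolution.
class ToyTripCountCache {
  std::unordered_map<std::string, int> Cache;

public:
  // Returns the cached value if one exists (it may be stale);
  // otherwise caches and returns ActualTripCount.
  int get(const std::string &Loop, int ActualTripCount) {
    auto It = Cache.find(Loop);
    if (It != Cache.end())
      return It->second;
    Cache.emplace(Loop, ActualTripCount);
    return ActualTripCount;
  }

  // Drops the cached value so the next query recomputes it.
  void forget(const std::string &Loop) { Cache.erase(Loop); }
};
```

In this model, a query made after the loop changes keeps returning the old value until `forget` is called for that loop.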
Congzhe
1286de408c
[LoopFusion] Optimize away Phi nodes that are sunk from the 2nd loop preheader (#176503)
Fixed issue #165087.

When we sink phis from the 2nd loop preheader to the exit block, we
optimize further: we replace the uses of each phi node with its
incoming value and remove the phis. Deleted `fixPHINodes()`
too, because the phis are already optimized away and there is no point
in calling `fixPHINodes()`.
2026-01-22 16:12:12 -05:00
Alireza Torabian
5ab966aacb
[LoopFusion] Non-loop block must be the immediate successor of exit (#175034)
Loop fusion assumes the non-loop block of a guarded adjacent loop is the
immediate successor of its exit block. This patch ensures this condition
holds and fixes the crash #166356.
2026-01-08 17:07:15 -05:00
Alireza Torabian
9bc38df587
[LoopFusion] Simplifying the legality checks (#171889)
Considering that the current loop fusion only supports adjacent loops,
we are able to simplify the checks in this pass. By removing the
`isControlFlowEquivalent` check, this patch fixes multiple issues
including #166560, #166535, #165031, #80301 and #168263.

Now only the sequential/adjacent candidates are collected in the same
list. This patch is the implementation of approach 2 discussed in post
#171207.
2025-12-12 15:09:34 -05:00
Alireza Torabian
025e431e74
[LoopFusion] Forget loop and block dispositions after latch merge (#166233)
Merging the latches of loops may affect the dispositions, so they should
be forgotten after the merge. This patch fixed the crash in loop fusion
[#164082](https://github.com/llvm/llvm-project/issues/164082).
2025-11-04 16:48:39 -05:00
Rahul Joshi
62adc83c91
[NFC][LLVM] Namespace cleanup in LoopFuse (#163758)
Additionally, make the `Loop` argument to `printLoop` const.
2025-10-16 15:06:05 -07:00
Alireza Torabian
d6072986cd
[LoopFusion] Detecting legal dependencies for fusion using DA info (#146383)
Loop fusion pass will use the information provided by the recent
DA patch to fuse additional legal loops, including those with
forward loop-carried dependencies.
2025-09-25 17:53:26 -04:00
Madhur Amilkanthwar
90de4a4ac9
[LoopFusion] Fix sink instructions (#147501)
If we have instructions in second loop's preheader which can be sunk, we
should also be adjusting PHI nodes to receive values from the fused loop's latch block.

Fixes #128600
2025-07-28 12:08:43 +05:30
Florian Hahn
3fcfce4c5e
Reapply "[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)"
This reverts the revert commit bf92b127d2637948f53d11a187e865aa10e2e74c.

This adds missing initialization of PeelLast in gatherPeelingPreferences.

Original message:
Generalize countToEliminateCompares to also consider peeling off the
last iteration if it eliminates a compare.

At the moment, codegen for peeling off the last iteration is quite
restrictive and callers have to make sure that the exit condition can be
adjusted when peeling and that the loop executes at least 2 iterations.

Both will be relaxed in follow-ups.

PR: https://github.com/llvm/llvm-project/pull/139551
2025-05-17 10:51:05 +01:00
Florian Hahn
bf92b127d2
Revert "[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)"
This reverts commit bb10c3ba7f77d40a7fbfd4ac815015d3a4ae476a.

Also reverts 4f663cca15f2b53c2bc6a84d1b1f5bd81679356d:
  Revert "[LoopPeel] Make sure PeelLast is always initialized."

Revert for now to bring msan bots back to green

 https://lab.llvm.org/buildbot/#/builders/164/builds/9992
 https://lab.llvm.org/buildbot/#/builders/94/builds/7158
2025-05-16 08:33:12 +01:00
Florian Hahn
bb10c3ba7f
[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)
Generalize countToEliminateCompares to also consider peeling off the
last iteration if it eliminates a compare.

At the moment, codegen for peeling off the last iteration is quite
restrictive and callers have to make sure that the exit condition can be
adjusted when peeling and that the loop executes at least 2 iterations.

Both will be relaxed in follow-ups.

PR: https://github.com/llvm/llvm-project/pull/139551
2025-05-15 19:15:48 +01:00
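As a source-level illustration of what bb10c3ba7f means by peeling off the last iteration to eliminate a compare, the sketch below peels a loop by hand. It is not the pass's actual codegen, and both function names are hypothetical. The `N >= 2` precondition mirrors the restriction stated in the commit message that the loop executes at least 2 iterations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original loop: every iteration evaluates the `I + 1 == N` compare.
long sumWithLastDoubled(const std::vector<long> &V) {
  long Sum = 0;
  const std::size_t N = V.size();
  for (std::size_t I = 0; I < N; ++I) {
    if (I + 1 == N)
      Sum += 2 * V[I]; // taken only on the last iteration
    else
      Sum += V[I];
  }
  return Sum;
}

// Hand-peeled form: the main loop runs N - 1 iterations with no compare
// in its body, and the last iteration is executed separately afterwards.
long sumWithLastDoubledPeeled(const std::vector<long> &V) {
  const std::size_t N = V.size();
  assert(N >= 2 && "sketch assumes the loop executes at least 2 iterations");
  long Sum = 0;
  for (std::size_t I = 0; I + 1 < N; ++I)
    Sum += V[I];
  Sum += 2 * V[N - 1]; // the peeled last iteration
  return Sum;
}
```

Both functions compute the same result for inputs with at least two elements; the peeled form simply moves the special case out of the loop body.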
Pedro Lobo
950bc6cd77
[LoopFuse] Change placeholder from undef to poison (#131535)
Use `poison` instead of `undef` as a placeholder for phi entries of
unreachable predecessors.
2025-03-16 22:44:39 +00:00
Alireza Torabian
3c74430320
[DependenceAnalysis][NFC] Removing PossiblyLoopIndependent parameter (#124615)
Parameter PossiblyLoopIndependent has lost its intended purpose. This
flag is always set to true in all cases when depends() is called, hence
we want to reconsider the utility of this variable and remove it from
the function signature entirely. This is an NFC patch.
2025-02-11 16:23:28 -05:00
Jeremy Morse
8e70273509
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; however, to ensure some type safety, we'd like to
have all calls to moveBefore use iterators.

This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
2025-01-24 10:53:11 +00:00
Kazu Hirata
94f9cbbe49
[Scalar] Remove unused includes (NFC) (#114645)
Identified with misc-include-cleaner.
2024-11-02 08:32:26 -07:00
Vitaly Buka
5ce47a5813
Reland "[Support] Assert that DomTree nodes share parent" (#102782)
A dominance query of a block that is in a different function is
ill-defined, so assert that getNode() is only called for blocks that are
in the same function.

There are three cases, where this behavior did occur. LoopFuse didn't
explicitly do this, but didn't invalidate the SCEV block dispositions,
leaving dangling pointers to free'ed basic blocks behind, causing
use-after-free. We do, however, want to be able to dereference basic
blocks inside the dominator tree, so that we can refer to them by a
number stored inside the basic block.

Reverts #102780
Reland #101198
Fixes #102784

Co-authored-by: Alexis Engelke <engelke@in.tum.de>
2024-08-13 11:56:02 +02:00
Vitaly Buka
3c3df1bef8
Revert "[Support] Assert that DomTree nodes share parent" (#102780)
Reverts llvm/llvm-project#101198

Breaks multiple bots:
https://lab.llvm.org/buildbot/#/builders/72/builds/2103
https://lab.llvm.org/buildbot/#/builders/164/builds/1909
https://lab.llvm.org/buildbot/#/builders/66/builds/2706
2024-08-10 18:36:09 -07:00
Alexis Engelke
8101d1863c
[Support] Assert that DomTree nodes share parent (#101198)
A dominance query of a block that is in a different function is
ill-defined, so assert that getNode() is only called for blocks that are
in the same function.

There are two cases, where this behavior did occur. LoopFuse didn't
explicitly do this, but didn't invalidate the SCEV block dispositions,
leaving dangling pointers to free'ed basic blocks behind, causing
use-after-free. We do, however, want to be able to dereference basic
blocks inside the dominator tree, so that we can refer to them by a
number stored inside the basic block.
2024-08-10 18:19:05 +02:00
Nuno Lopes
7d33d4720f [LoopFuse] Use poison instead of undef as placeholder for phi entry of unreachable predecessor [NFC] 2024-06-30 11:59:58 +01:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
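The helper pattern that 9df71d7673 describes can be sketched with toy types. Everything below (`ToyDataLayout`, `ToyModule`, `ToyFunction`) is hypothetical and only models the shape of the change: a convenience accessor on the child that forwards to its parent, so callers no longer spell out `getParent()->getDataLayout()`.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are NOT LLVM's classes.
struct ToyDataLayout {
  unsigned PointerSizeInBits;
};

struct ToyModule {
  ToyDataLayout DL;
  const ToyDataLayout &getDataLayout() const { return DL; }
};

struct ToyFunction {
  const ToyModule *Parent;
  const ToyModule *getParent() const { return Parent; }

  // Convenience helper: forwards to the parent module's data layout.
  const ToyDataLayout &getDataLayout() const {
    return Parent->getDataLayout();
  }
};
```

With the helper, `F.getDataLayout()` and `F.getParent()->getDataLayout()` refer to the same object; the helper only shortens the call site.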
Jeremy Morse
6942c64e81 [NFC][RemoveDIs] Prefer iterator-insertion over instructions
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.

At this stage, this is just changing the spelling of a few operations,
which will eventually become significant once the debug-info bearing
iterator is used.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152537
2023-09-11 11:48:45 +01:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Bjorn Pettersson
a20f7efbc5 Remove several no longer needed includes. NFCI
Mostly removing includes of InitializePasses.h and Pass.h in
passes that no longer have support for the legacy PM.
2023-04-17 13:54:19 +02:00
Fangrui Song
c1eb3db780 [LoopFuse] Remove legacy pass
Following recent changes to remove non-core legacy passes.
2023-02-14 23:53:39 -08:00
Kazu Hirata
b53e0d1b34 Use std::nullopt instead of None in comments (NFC) 2023-01-14 13:53:40 -08:00
Ramkrishnan Narayanan Komala
7f15907acc [LoopFusion] Sorting of undominated FusionCandidates crashes
This patch tries to fix issue https://github.com/llvm/llvm-project/issues/56263.

If two **FusionCandidates** are at the same level of the dominator tree, neither dominates the other, but they are control flow equivalent. Sorting those FusionCandidates requires a **nonStrictlyPostDominate** check.

Reviewed By: Narutoworld

Differential Revision: https://reviews.llvm.org/D139993
2023-01-11 23:15:40 -05:00
luxufan
aca7441c7a [LoopFusion] Exit early if one fusion candidate has a guarded branch but the other does not
Fixes: https://github.com/llvm/llvm-project/issues/59024

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D138269
2023-01-03 23:18:58 +08:00
Anna Thomas
05b060b0b0 [LoopPeel] Expose ValueMap of last peeled iteration. NFC
The value map of last peeled iteration is computed within peelLoop API.
This patch exposes it for callers of peelLoop.
While this is not currently used by upstream passes, we have a usecase
downstream which benefits from this API update. Future users of peelLoop
can also use the ValueMap if needed.

Similar value maps are exposed by other loop utilities such as loop
cloning.

Differential Revision: https://reviews.llvm.org/D138228
2022-12-19 09:55:29 -05:00
Nikita Popov
04d652994d [SCEV] Return ArrayRef for SCEV operands() (NFC)
Use a consistent type for the operands() methods of different SCEV
types. Also make the API consistent by only providing operands(),
rather than also providing op_begin() and op_end() for some of them.
2022-12-16 15:36:19 +01:00
Joshua Cao
5004320590 [LoopFusion] sink second loop PHIs
Fixes https://github.com/llvm/llvm-project/issues/59023

PHI nodes that are in the second loop only have the first loop as their
predecessor. These PHI nodes should be sunk to the end of the fused
loop. If the second loop uses the PHI, then the loops cannot be fused.

I don't think this should happen in typical compilation workflows.
The PHI will be in a dedicated exit block of the first loop following
LCSSA transformations.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D139812
2022-12-13 10:13:39 -08:00
Fangrui Song
3152156334 [Transforms/Scalar] llvm::Optional => std::optional 2022-12-13 08:05:14 +00:00
Kazu Hirata
343de6856e [Transforms] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:37 -08:00
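For readers unfamiliar with the migration that 343de6856e belongs to, this minimal sketch shows the standard-library spelling it moves toward. The function `knownTripCount` is a hypothetical example and not code from the pass.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: returns a trip count when it is known and an
// empty optional otherwise.
std::optional<unsigned> knownTripCount(bool IsKnown, unsigned Count) {
  if (!IsKnown)
    return std::nullopt; // the spelling that replaces `None`
  return Count;
}
```

Callers can then test the result with `has_value()` or supply a fallback with `value_or()`.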
Mengxuan Cai
ec210f3942 [LoopFuse] Ensure inner loops are in loop simplified form under new PM
LoopInfo doesn't give all loops in a loop nest; it gives top-level loops
only, and isLoopSimplifyForm() only checks the outermost loop of a
loop nest. As a result, inner loops that are not in simplified form
cannot be simplified with the original code.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D137672
2022-11-11 15:55:59 -05:00
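The gap that ec210f3942 describes, where only top-level loops are checked while inner loops are missed, can be sketched with a toy loop tree. `ToyLoop` and both functions are hypothetical and do not use LLVM's LoopInfo; the sketch only contrasts a top-level check with a walk over the whole nest.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy loop-nest node; NOT LLVM's Loop class.
struct ToyLoop {
  bool Simplified = false;
  std::vector<ToyLoop> SubLoops;
};

// Looks at top-level loops only, so it cannot see unsimplified inner loops.
bool topLevelSimplified(const std::vector<ToyLoop> &TopLevel) {
  for (const ToyLoop &L : TopLevel)
    if (!L.Simplified)
      return false;
  return true;
}

// Walks the whole nest and marks every unsimplified loop as simplified.
// Returns the number of loops that were changed.
unsigned simplifyNest(ToyLoop &L) {
  unsigned Changed = 0;
  if (!L.Simplified) {
    L.Simplified = true;
    ++Changed;
  }
  for (ToyLoop &Sub : L.SubLoops)
    Changed += simplifyNest(Sub);
  return Changed;
}
```

A nest whose outer loop is simplified but whose inner loop is not passes `topLevelSimplified`, while `simplifyNest` still finds and fixes the inner loop.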
Mengxuan Cai
eda3c93486 [LoopFuse] Ensure loops are in loop simplified form under new PM
Loop Fusion (Function Pass) requires loops in simplified form. With
legacy-pm, loop-simplify pass is added as a dependency for loop-fusion.
But the new pass manager does not always ensure this format. This patch
tries to invoke simplifyLoop() on loops that are not in simplified form
only for new PM.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D136781
2022-10-31 11:46:28 -04:00
Max Kazantsev
21a9abc1ce [LoopFuse] Drop loop dispositions before reassigning blocks to other loop
This bug was found by recent improvement in SCEV verifier. The code in LoopFuse
directly reassigns blocks to be a part of a different loop, which should automatically
invalidate all related cached loop dispositions.

Differential Revision: https://reviews.llvm.org/D134173
Reviewed By: nikic
2022-09-19 17:43:06 +07:00
Aaron Kogon
ae05b9dc30 Sink/hoist memory instructions between loop fusion candidates
Currently, instructions in the preheader of the second of two fusion
candidates are sunk and hoisted whenever possible, to try to allow the
loops to fuse. Memory instructions are skipped, and are never sunk or
hoisted. This change adds memory instructions for sinking/hoisting
consideration.

This change uses DependenceAnalysis to check if a mem inst in the
preheader of FC1 depends on an instruction in FC0's header, across
which it will be hoisted, or FC1's header, across which it will be
sunk. We reject cases where the dependency is a data hazard.

Differential Revision: https://reviews.llvm.org/D131606
2022-09-07 07:42:00 -04:00
Kazu Hirata
258531b7ac Remove redundant initialization of Optional (NFC) 2022-08-20 21:18:28 -07:00
Kazu Hirata
6b1bc80188 [Scalar] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-20 21:18:25 -07:00
Kazu Hirata
0e37ef0186 [Transforms] Fix comment typos (NFC) 2022-08-07 23:55:24 -07:00
Kazu Hirata
a2d4501718 [llvm] Fix comment typos (NFC) 2022-08-07 00:16:14 -07:00
Aaron Kogon
dd3ca65c37 Sinking or hoisting instructions between loops before fusion
Instructions between two adjacent loops will be hoisted above the first
loop, or sunk below the second to facilitate loop fusion. Hoisting will
be attempted for an instruction that dominates the first loop.
Otherwise, sinking this instruction will be attempted.

Instructions with side effects will not be considered for sinking or
hoisting. Hoisting/sinking of any instructions between loops will only
be performed if all the instructions can be moved. As well,
sinking/hoisting is considered for each instruction in isolation,
without taking into account sinking/hoisting decisions for other
instructions in the preheader.

Differential Revision: https://reviews.llvm.org/D118076
2022-07-27 06:55:09 -04:00
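The decision procedure described in dd3ca65c37 (try hoisting first, otherwise try sinking, skip anything with side effects, and move nothing unless everything can be moved) can be sketched as follows. The types, field names, and the two legality flags are assumptions made for illustration; the real pass derives legality from dominance and use information rather than from precomputed booleans.

```cpp
#include <cassert>
#include <optional>
#include <vector>

enum class Move { Hoist, Sink };

// Hypothetical summary of one preheader instruction.
struct ToyInst {
  bool HasSideEffects;
  bool CanHoistAboveFirstLoop; // assumed legality flag for hoisting
  bool CanSinkBelowSecondLoop; // assumed legality flag for sinking
};

// Decides each instruction in isolation. Returns one decision per
// instruction, or std::nullopt if any instruction cannot be moved,
// in which case nothing should be moved at all.
std::optional<std::vector<Move>>
planMoves(const std::vector<ToyInst> &Preheader) {
  std::vector<Move> Plan;
  for (const ToyInst &I : Preheader) {
    if (I.HasSideEffects)
      return std::nullopt; // never considered for sinking or hoisting
    if (I.CanHoistAboveFirstLoop)
      Plan.push_back(Move::Hoist); // hoisting is attempted first
    else if (I.CanSinkBelowSecondLoop)
      Plan.push_back(Move::Sink); // otherwise sinking is attempted
    else
      return std::nullopt; // one immovable instruction blocks all moves
  }
  return Plan;
}
```

Returning an empty optional for the whole preheader models the all-or-nothing rule: a single instruction that cannot be moved cancels every other move.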
Fangrui Song
95a134254a Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 01:07:51 -07:00
Fangrui Song
36c7d79dc4 Remove unneeded cl::ZeroOrMore for cl::opt options
Similar to 557efc9a8b68628c2c944678c6471dac30ed9e8e.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.
2022-06-04 00:10:42 -07:00
Anna Thomas
a73e4ce6a5 [LoopFuse] Change DT to reference in FusionCandidate struct. NFC
Assertion added in f50821cff0 confirms that the DT is indeed nonnull.
Change it to a reference instead of a pointer to make this explicit in
FusionCandidate.
Suggested in D118472.
2022-02-02 14:55:37 -05:00
Anna Thomas
f50821cff0 [LoopFuse] Add assertion for non-null DT in fusion candidate
The code paths analyzed (all constructor invocations of fusion
candidate) pass in a non-null DT.
Adding this assert as requested in D118472 before converting this to a
reference argument.
2022-02-01 17:00:09 -05:00
Anna Thomas
bc48a26655 [LoopPeel] Use reference instead of pointer for DT argument
Cleanup code in peelLoop API. We already have usage of DT without guarding
against a null DT, so this change constant folds the remaining null DT
checks.
Also make the argument a reference so that it is clear the argument is
a nonnull DT.
Extracted from D118472.
2022-02-01 17:00:08 -05:00
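The pointer-to-reference cleanup in bc48a26655, f50821cff0 and a73e4ce6a5 follows a common C++ pattern, sketched here with a hypothetical `Analysis` type standing in for the dominator tree. The function names are illustrative and do not match the signatures in the pass.

```cpp
#include <cassert>

// Hypothetical stand-in for an analysis object such as a dominator tree.
struct Analysis {
  int NumQueries = 0;
  bool query() {
    ++NumQueries;
    return true;
  }
};

// Before: a pointer parameter leaves it unclear whether null is allowed,
// so the body needs a null check.
bool transformWithPointer(Analysis *A) {
  if (!A)
    return false;
  return A->query();
}

// After: a reference parameter states the non-null requirement in the
// signature, and the null check disappears.
bool transformWithReference(Analysis &A) { return A.query(); }
```

Once every caller is known to pass a valid object, the reference form removes dead null checks and documents the contract at the call boundary.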
Fangrui Song
18839be9c5 [ADT] Remove StatisticBase and make NoopStatistic empty
In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic`
but has 3 mostly unused pointers. GlobalOpt considers that the pointers can
potentially retain allocated objects, so GlobalOpt cannot optimize out the
`NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage
2 clang.

This patch makes `NoopStatistic` empty and thus reclaims the wasted space.  The
clang size is even smaller than applying D69428 (slightly smaller in both .bss and
.text).
```
# This means the D69428 optimization on clang is mostly nullified by this patch.
HEAD+D69428: size(.bss) = 0x0725a8
HEAD+D101211: size(.bss) = 0x072238

# bloaty - HEAD+D69428 vs HEAD+D101211
# With D101211, we also save a lot of string table space (.rodata).
    FILE SIZE        VM SIZE
 --------------  --------------
  -0.0%     -32  -0.0%     -24    .eh_frame
  -0.0%    -336  [ = ]       0    .symtab
  -0.0%    -360  [ = ]       0    .strtab
  [ = ]       0  -0.2%    -880    .bss
  -0.0% -2.11Ki  -0.0% -2.11Ki    .rodata
  -0.0% -2.89Ki  -0.0% -2.89Ki    .text
  -0.0% -5.71Ki  -0.0% -5.88Ki    TOTAL
```

Note: LoopFuse is a disabled pass. For now this patch adds
`#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in
LLVM_ENABLE_STATS==0 builds.  If these `OptimizationRemarkMissed` are useful in
LLVM_ENABLE_STATS==0 builds, we can replace `llvm::Statistic` with
`llvm::TrackingStatistic`, or use a different abstraction to keep track of the strings.

Similarly, skip the code in `mlir/lib/Pass/PassStatistics.cpp` which
calls `getName`/`getDesc`/`getValue`.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D101211
2021-04-26 16:47:32 -07:00
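The size argument in 18839be9c5 can be illustrated with two toy classes. `FatNoopStat` and `EmptyNoopStat` are hypothetical and only model the described change from a no-op statistic that carries three pointers to one with no data members; they are not the definitions in LLVM's Statistic.h.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical "before" shape: a no-op statistic that still stores
// three pointers per object.
struct FatNoopStat {
  const char *DebugType;
  const char *Name;
  const char *Desc;
  FatNoopStat &operator++() { return *this; }
};

// Hypothetical "after" shape: the same no-op interface with no data
// members, so each object carries no state.
struct EmptyNoopStat {
  EmptyNoopStat(const char *, const char *, const char *) {}
  EmptyNoopStat &operator++() { return *this; }
};

static_assert(std::is_empty<EmptyNoopStat>::value, "no per-object state");
static_assert(sizeof(FatNoopStat) >= 3 * sizeof(const char *),
              "three pointers per object");
```

Because the empty variant holds no pointers, there is nothing for an optimizer to treat as potentially retaining allocated objects, which is the concern the commit message attributes to GlobalOpt.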
Lei Zhang
254e289d45 Revert "[ADT] Remove StatisticBase and make NoopStatistic empty"
This reverts commit b5403117814a7c39b944839e10492493f2ceb4ac
because it breaks MLIR build:

https://buildkite.com/mlir/mlir-core/builds/13299#ad0f8901-dfa4-43cf-81b8-7940e2c6c15b
2021-04-26 18:31:04 -04:00
Fangrui Song
b540311781 [ADT] Remove StatisticBase and make NoopStatistic empty
In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic`
but has 3 unused pointers. GlobalOpt considers that the pointers can potentially
retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic`
variables (see D69428 for more context), wasting 23KiB for stage 2 clang.

This patch makes `NoopStatistic` empty and thus reclaims the wasted space.  The
clang size is even smaller than applying D69428 (slightly smaller in both .bss and
.text).
```
# This means the D69428 optimization on clang is mostly nullified by this patch.
HEAD+D69428: size(.bss) = 0x0725a8
HEAD+D101211: size(.bss) = 0x072238

# bloaty - HEAD+D69428 vs HEAD+D101211
# With D101211, we also save a lot of string table space (.rodata).
    FILE SIZE        VM SIZE
 --------------  --------------
  -0.0%     -32  -0.0%     -24    .eh_frame
  -0.0%    -336  [ = ]       0    .symtab
  -0.0%    -360  [ = ]       0    .strtab
  [ = ]       0  -0.2%    -880    .bss
  -0.0% -2.11Ki  -0.0% -2.11Ki    .rodata
  -0.0% -2.89Ki  -0.0% -2.89Ki    .text
  -0.0% -5.71Ki  -0.0% -5.88Ki    TOTAL
```

Note: LoopFuse is a disabled pass. This patch adds `#if LLVM_ENABLE_STATS` so
`OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds.  If these
`OptimizationRemarkMissed` are useful and not noisy, we can replace
`llvm::Statistic` with `llvm::TrackingStatistic` in the future.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D101211
2021-04-26 13:39:35 -07:00