312 Commits

Author SHA1 Message Date
Romanov Vlad
a704195054
[AMDGPU] Shrink S_MOV_B64 to S_MOV_B32 during rematerialization (#184333)
When rematerializing S_MOV_B64 or S_MOV_B64_IMM_PSEUDO and only a single
32-bit lane of the result is used at the remat point, emit S_MOV_B32
with the appropriate half of the 64-bit immediate instead.

This reduces register pressure by defining a 32-bit register instead of
a 64-bit pair when the other half is unused.
2026-03-20 15:09:13 +01:00
Min-Yih Hsu
75f03a62d1
[InlineSpiller] Hoist spills only when all of its subranges are available (#177703)
Context:
https://github.com/llvm/llvm-project/issues/176001#issuecomment-3793846479

When we're hoisting a spill to another block, and storing one of its
sibling values there instead, we need to make sure all sub ranges of
that sibling value is alive at the insertion point. Otherwise, we might
accidentally store compromised subranges values back to the spill slot.
2026-01-29 11:27:02 -08:00
Valery Pykhtin
f32c00f6d8
[regalloc][LiveRegMatrix][AMDGPU] Fix LiveInterval dangling pointers in LiveRegMatrix. (#168556)
This patch correctly removes segments from LiveRegMatrix that reference LiveIntervals removed after spilling. Added validity check that LiveRegMatrix doesn't contain invalid references to intervals.
2026-01-27 16:08:58 +01:00
Matt Arsenault
1f3f522866
CodeGen: Remove TRI arguments from stack load/store hooks (#158240)
This is directly available in TargetInstrInfo
2025-11-10 16:24:39 -08:00
Luke Lau
6f147445f7
[RegAlloc] Constrain rematted regclass to use (#164386)
When rematting we create a new virtual register with the original def's
register class. However the use may have a different register class if
the interval is split, which means we end up with an invalid register
class.

This fixes #164181 by constraining the newly created register to the
use's register class.

The test case is reduced as far as it goes. Because this test requires
us to reach a certain amount of register pressure in certain conditions
I'm not sure if there's an easy way to handwrite this scenario.
2025-10-23 03:05:39 +00:00
Philip Reames
6ae6583089
[CodeGen] Finish untangling LRE::scanRemattable [nfc] (#161963)
This is an attempt to simplify the rematerialization logic in
InlineSpiller and SplitKit. I'd earlier done the same for
RegisterCoalescer in 57b673.

The basic idea of this change is that we don't need to check whether an
instruction is rematerializable early. Instead, we can defer the check
to the point where we're actually trying to materialize something. We
also don't need to indirect that query through a VNI key, and can
instead just check the instruction directly at the use site.
2025-10-07 07:19:58 -07:00
Philip Reames
78da592647
[RegAlloc] Add additional tracing in InlineSpiller::rematerializeFor (#160761)
We didn't have trace logging for two cases in this routine which makes
it sometimes hard to tell what is going on. In addition to debug trace
statements, add comments to explain the logic behind the early exits
which don't mark the virtual register live. Suggestions on how to word
these more precisely very welcome; I'm not clear I understand all the
intrinicies of this code myself.
2025-09-26 06:54:11 -07:00
csstormq
d14aa0cd46
[InlineSpiller] Drop unused elements in Virt2SiblingsMap. NFC (#147866) 2025-07-10 21:35:44 +08:00
Kazu Hirata
3bc174ba77
[CodeGen] Remove unused includes (NFC) (#141320)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-24 00:00:00 -07:00
weiguozhi
b25b51eb63
[InlineSpiller] Check rematerialization before folding operand (#134015)
Current implementation tries to fold the operand before
rematerialization because it can reduce one register usage. But if there
is a physical register available we can still rematerialize it without
causing high register pressure.

This patch do this check to find the better choice. Then we can produce

    xorps %xmm1, %xmm1
    ucomiss %xmm1, %xmm0

instead of 

    ucomiss LCPI0_1(%rip), %xmm0
2025-04-28 09:52:03 -07:00
Kazu Hirata
599005686a
[llvm] Use *Set::insert_range (NFC) (#132325)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
2025-03-20 22:24:06 -07:00
Philip Reames
bdb4012fe3 [CodeGen] Remove parameter from LiveRangeEdit::canRematerializeAt [NFC]
Only one caller cares about the true case of this parameter, so move
the check to that single caller.  Note that RegisterCoalescer seems
like it should care, but it already duplicates the check several
lines above.
2025-03-14 09:12:07 -07:00
Craig Topper
3fe22559c7 [InlineSpiller] Use Register. NFC 2025-03-02 23:46:17 -08:00
Kazu Hirata
34172bba11
[CodeGen] Avoid repeated hash lookups (NFC) (#128300) 2025-02-22 02:24:35 -08:00
Craig Topper
92ddbbd89f [CodeGen] Remove static member functions Register::stackSlot2Index/isStackSlot. NFC
Migrate the few users to the nonstatic member functions.
2025-02-19 21:54:43 -08:00
Kazu Hirata
a5e969a82b
[CodeGen] Avoid repeated hash lookups (NFC) (#125382) 2025-02-02 09:31:55 -08:00
Balazs Benics
16d4453f2f
[CodeGen][NFC] Remove redundant map lookup (#125342) 2025-02-01 15:29:33 +01:00
Kazu Hirata
850852e9a4
[CodeGen] Avoid repeated hash lookups (NFC) (#124455) 2025-01-26 01:35:39 -08:00
Daniel Paoliello
19032bfe87
[aarch64][win] Update Called Globals info when updating Call Site info (#122762)
Fixes the "use after poison" issue introduced by #121516 (see
<https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>).

The root cause of this issue is that #121516 introduced "Called Global"
information for call instructions modeling how "Call Site" info is
stored in the machine function, HOWEVER it didn't copy the
copy/move/erase operations for call site information.

The fix is to rename and update the existing copy/move/erase functions
so they also take care of Called Global info.
2025-01-13 14:00:31 -08:00
Akshat Oke
4f96fb5fb3
Reapply "Spiller: Detach legacy pass and supply analyses instead (#119181)" (#122665)
Makes Inline Spiller amenable to the new PM.

This reapplies commit a531800344dc54e9c197a13b22e013f919f3f5e1 reverted
because of two unused private members reported on sanitizer bots.
2025-01-13 14:14:13 +05:30
Akshat Oke
089555095b
Revert "Spiller: Detach legacy pass and supply analyses instead (#119… (#122426)
…181)"

This reverts commit a531800344dc54e9c197a13b22e013f919f3f5e1.
2025-01-10 12:23:07 +05:30
Akshat Oke
a531800344
Spiller: Detach legacy pass and supply analyses instead (#119181)
Makes Inline Spiller amenable to the new PM.
2025-01-10 11:46:56 +05:30
Akshat Oke
2c7ece2e8c
[CodeGen][NewPM] Port LiveStacks analysis to NPM (#118778) 2024-12-06 15:16:07 +05:30
Kazu Hirata
735ab61ac8
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-11-12 23:15:06 -08:00
Bevin Hansson
1a65d95d00
[CodeGen][RAGreedy] Inform LiveDebugVariables about snippets spilled by InlineSpiller. (#109962)
RAGreedy invokes InlineSpiller to spill a particular virtreg inline.
When the spiller does this, it also identifies small, adjacent liveranges called
snippets. These are also spilled or rematerialized in the process.

However, the spiller does not inform RA that it has spilled these regs.
This means that debug variable locations referencing these regs/ranges
are lost.

Mark any spilled regs which do not have a stack slot assigned to them as
allocated to the slot being spilled to to tell LDV that those regs are
located in that slot, even though the regs might no longer exist in the
program after regalloc is finished. Also, inform RA about all of the
regs which were replaced (spilled or rematted), not just the one that was
requested so that it can properly manage the ranges of the debug vars.
2024-10-02 10:29:56 +02:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
paperchalice
099899961c
[CodeGen][NewPM] Port machine-block-freq to new pass manager (#98317)
- Add `MachineBlockFrequencyAnalysis`.
- Add `MachineBlockFrequencyPrinterPass`.
- Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager.
- `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new
pass manager migration.
2024-07-12 15:45:01 +08:00
paperchalice
abde52aa66
[CodeGen][NewPM] Port LiveIntervals to new pass manager (#98118)
- Add `LiveIntervalsAnalysis`.
- Add `LiveIntervalsPrinterPass`.
- Use `LiveIntervalsWrapperPass` in legacy pass manager.
- Use `std::unique_ptr` instead of raw pointer for `LICalc`, so
destructor and default move constructor can handle it correctly.

This would be the last analysis required by `PHIElimination`.
2024-07-10 19:34:48 +08:00
Aiden Grossman
c791d86eab [NFC][RegAlloc] Delete unused option
The option -disable-spill-hoist does not actually control anything and
is not used anywhere, so it should be removed.
2024-06-27 01:57:56 +00:00
paperchalice
837dc542b1
[CodeGen][NewPM] Split MachineDominatorTree into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
2024-06-11 21:27:14 +08:00
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and  `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI  parameters, as shown in issue #82411.

Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`,  `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
Kazu Hirata
92c2529ccd [llvm] Stop including vector (NFC)
Identified with clangd.
2023-12-03 22:32:21 -08:00
Matthias Braun
5353d3f509
Remove unused LoopInfo from InlineSpiller and SpillPlacement (NFC) (#71874) 2023-11-16 21:20:30 -08:00
Christudasan Devadasan
ce7fd498ed
[AMDGPU] RA inserted scalar instructions can be at the BB top (#72140)
We adjust the insertion point at the BB top for spills/copies during RA
to ensure they are placed after the exec restore instructions required
for the divergent control flow execution. This is, however, required
only for the vector operations. The insertions for scalar registers can
still go to the BB top.
2023-11-16 10:30:03 +05:30
Piotr Sobczak
80abbeca8e [Inline Spiller] Consider bundles when marking defs as dead
Fix bug where the code expects just a single MI, but a series of
bundled MIs need to be handled instead.

The semi-formed bundled are created by SplitKit for the case where
not all lanes are live (buildSingleSubRegCopy). Then the remat
kicks in, and since the values that are copied in the bundle do not
need to be preserved due to the remat (dead defs), all instructions
in the bundle should be marked as dead.

However, only the first one gets marked as dead, which causes the
verifier to complain later with error: "Live range continues after
dead def flag".

Differential Revision: https://reviews.llvm.org/D156999
2023-10-26 11:52:55 +02:00
Matt Arsenault
3e49ce6ea1
InlineSpiller: Delete assert that implicit_def has no implicit operands (#69087)
It's not a verifier enforced property that implicit_def may only have
one operand. Fixes assertions after the coalescer implicit-defs to
preserve super register liveness to arbitrary instructions.

For some reason I'm unable to reproduce this as a MIR test running only
the allocator for the x86 test. Not sure it's worth keeping around.
2023-10-20 00:51:12 +09:00
JP Lehr
e816c89c84 Revert "InlineSpiller: Consider if all subranges are the same when avoiding redundant spills"
This reverts commit d8127b2ba8a87a610851b9a462f2fc2526c36e37.
2023-10-02 06:26:33 -05:00
Matt Arsenault
d8127b2ba8 InlineSpiller: Consider if all subranges are the same when avoiding redundant spills
This avoids some redundant spills of subranges, and avoids a compile failure.
This greatly reduces the numbers of spills in a loop.

The main range is not informative when multiple instructions are needed to fully define
a register. A common scenario is a lowered reg_sequence where every subregister
is sequentially defined, but each def changes the main range's value number. If
we look at specific lanes at the use index, we can see the value is actually the
same.

In this testcase, there are a large number of materialized 64-bit constant defs
which are hoisted outside of the loop by MachineLICM. These are feeding REG_SEQUENCES,
which is not considered rematerializable inside the loop. After coalescing, the split
constant defs produce main ranges with an apparent phi def. There's no phi def if you look
at each individual subrange, and only half of the register is really redefined to a constant.

Fixes: SWDEV-380865

https://reviews.llvm.org/D147079
2023-10-01 11:37:53 +03:00
Matt Arsenault
4d42e8b5d1 Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.

The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work
around the underlying problem with SUBREG_TO_REG.
2023-07-31 20:15:45 -04:00
Vitaly Buka
a496c8be6e Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
And dependent commits.

Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.

No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D156381
2023-07-26 22:13:32 -07:00
Matt Arsenault
825b7f0ca5 InlineSpiller: Fix copy identification bugs in isCopyOfBundle
Noticed by inspection of
b7836d856206ec39509d42529f958c920368166b. This was checking if the
first instruction was a copy, not the current MI. It should fully
respect the isCopyInstr result. Hopefully this fixes a reported
regression which we can extract a test from.
2023-07-17 20:05:56 -04:00
Yashwant Singh
b7836d8562 [CodeGen]Allow targets to use target specific COPY instructions for live range splitting
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses
TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a
problem as we have both vector and scalar copies. Vector copies perform a copy over
a vector register but only on the lanes(threads) that are active. This is mostly sufficient
however we do run into cases when we have to copy the entire vector register and
not just active lane data. One major place where we need that is live range splitting.

Allowing targets to use their own copy instructions(if defined) will provide a lot of
flexibility and ease to lower these pseudo instructions to correct MIR.

- Introduce getTargetCopyOpcode() virtual function and use if to generate copy in Live range
 splitting.
- Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.

Reviewed By: arsenm, cdevadas, kparzysz

Differential Revision: https://reviews.llvm.org/D150388
2023-07-07 22:29:50 +05:30
Matt Arsenault
7dcb9c0f09 InlineSpiller: Consider copy bundles when looking for snippet copies
This was looking for full copies produced by SplitKit, but SplitKit
introduces copy bundles if not all lanes are live. The scan for uses
needs to look at bundles, not individual instructions.

This is a prerequisite to avoiding some redundant spills due to
subregisters which will help avoid an allocation failure in a future
patch.
2023-06-20 12:26:27 -04:00
Jay Foad
5022fc2ad3 [CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.
Differential Revision: https://reviews.llvm.org/D151424
2023-06-01 19:17:34 +01:00
Akshay Khadse
8bf7f86d79 Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148303
2023-04-17 16:32:46 +08:00
Kazu Hirata
a585fa2637 [CodeGen] Use *{Set,Map}::contains (NFC) 2023-03-14 08:07:42 -07:00
Craig Topper
e72ca520bb [CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC
Use isPhysical/isVirtual methods.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D141715
2023-01-13 14:38:08 -08:00
Serguei Katkov
fd64bd94ed [Inline Spiller] Extend the snippet by statepoint uses
Snippet is a tiny live interval which has copy or fill like def
and copy or spill like use at the end (any of them might abcent).

Snippet has only one use/def inside interval and interval is located
in one basic block.

When inline spiller spills some reg around uses it also forces the
spilling of connected snippets those which got by splitting the
same original reg and its def is a full copy of our reg or its
last use is a full copy to our reg.

The definition of snippet is extended to allow not only one use/def
but more. However all other uses are statepoint instructions which will
fold fill into its operand. That way we do not introduce new fills/spills.

Reviewed By: qcolombet, dantrushin
Differential Revision: https://reviews.llvm.org/D138093
2023-01-09 13:30:57 +07:00
Christudasan Devadasan
b5efec4b27 [CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlot
With D134950, targets get notified when a virtual register is created and/or
cloned. Targets can do the needful with the delegate callback. AMDGPU propagates
the virtual register flags maintained in the target file itself. They are useful
to identify a certain type of machine operands while inserting spill stores and
reloads. Since RegAllocFast spills the physical register itself, there is no way
its virtual register can be mapped back to retrieve the flags. It can be solved
by passing the virtual register as an additional argument. This argument has no
use when the spill interfaces are called during the greedy allocator or even the
PrologEpilogInserter and can pass a null register in such cases.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138656
2022-12-17 11:55:34 +05:30
Kazu Hirata
405fc404bf [ADT] Don't including None.h (NFC)
These source files no longer use None, so they do not need to include
None.h.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-06 20:14:51 -08:00