300 Commits

Author SHA1 Message Date
Rahul Joshi
1fdf02ad5a
[LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties (#140002)
Add per-property has<Prop>/set<Prop>/reset<Prop> functions to
MachineFunctionProperties.
2025-05-22 08:07:52 -07:00
Kazu Hirata
58774f1b1f
[CodeGen] Construct SmallVector with iterator ranges (NFC) (#136258) 2025-04-18 10:26:48 -07:00
Antonio Frighetto
ade2276517 [RegAllocFast] Ensure live-in vregs get reloaded after INLINEASM_BR spills
We have already ensured in 9cec2b246e719533723562950e56c292fe5dd5ad
that `INLINEASM_BR` output operands get spilled onto the stack, both
in the fallthrough path and in the indirect targets. Since reloads of
live-ins values into physical registers contextually happen after all
MIR instructions (and ops) have been visited, make sure such loads are
placed at the start of the block, but after prologues or `INLINEASM_BR`
spills, as otherwise this may cause stale values to be read from the
stack.

Fixes: #74483, #110251.
2025-03-24 09:19:53 +01:00
Craig Topper
e56215d17c [RegAllocFast] Use Register and MCRegister. NFC 2025-03-02 22:33:25 -08:00
Benjamin Kramer
2c1df22061 RegAllocFast: Fix 8634635d689c5a7adfb19cde4a313d7c02e95194 to not trip assertions 2025-02-26 15:37:46 +01:00
Benjamin Kramer
8634635d68 RegAllocFast: Stop reading uninitalized memory
Found by msan.
==8138==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x559016395beb in allocVirtRegUndef llvm/lib/CodeGen/RegAllocFast.cpp:1010:6
2025-02-26 15:25:55 +01:00
Matt Arsenault
75aff78f64
RegAllocFast: Fix verifier errors after assigning to reserved registers (#128281) 2025-02-26 13:22:53 +07:00
Christopher Di Bella
309e3ca081 Revert "[CodeGen] Remove static member function Register::isPhysicalRegister. NFC"
This reverts commit 5fadb3d680909ab30b37eb559f80046b5a17045e.
2025-02-20 22:06:21 +00:00
Craig Topper
5fadb3d680 [CodeGen] Remove static member function Register::isPhysicalRegister. NFC
Prefer the nonstatic member by converting unsigned to Register instead.
2025-02-20 10:49:53 -08:00
Craig Topper
473953a15f
[CodeGen] Use non-static Register::virtRegIndex() instead of static Register::virtReg2Index. NFC (#125031)
These are the the ones where we already had a Register object being
used. Some places are still using unsigned which I did not convert.
2025-01-30 00:14:08 -08:00
Craig Topper
f5f32cef61
[CodeGen] Use MCRegister instead of MCPhysReg in RegisterMaskPair. NFC (#123688)
Update some other places to avoid implicit conversions this introduces,
but I probably missed some.
2025-01-21 07:04:35 -08:00
Craig Topper
c3d820553f
[RegAllocFast] Don't convert MCRegUnit to MCRegister. NFC (#123705) 2025-01-21 07:03:23 -08:00
Thurston Dang
e8a6563768
Fix-forward 'RegAllocFast: Avoid using temporary DiagnosticInfo #120184' (#120268)
There was a buildbot breakage

(https://lab.llvm.org/buildbot/#/builders/24/builds/3329/steps/11/logs/stdio):


/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/AMDGPU/ran-out-of-registers-error-all-regs-reserved.ll:9:10:
error: CHECK: expected string not found in input
; CHECK: error: <unknown>:0:0: no registers from class available to
allocate in function 'no_registers_from_class_available_to_allocate'

2: ==75198==ERROR: AddressSanitizer: stack-use-after-scope on address
0xfa23f9f1c270 at pc 0xb2660dda9340 bp 0xfffffe8ab340 sp 0xfffffe8ab338

caused by https://github.com/llvm/llvm-project/pull/120184, which made a
partial fix but also renabled the tests. This patch attempts to fix
forward by applying the same fix to the error message highlighted in the
buildbot.
2024-12-17 09:09:13 -08:00
Matt Arsenault
3508d8f6dd
RegAllocFast: Avoid using temporary DiagnosticInfo (#120184)
This reverts commit 1297933f35b4948b4d281259627a72094c407a75.
2024-12-17 16:19:26 +07:00
Matt Arsenault
818bffcb1c
RegAlloc: Fix failure on undef use when all registers are reserved (#119647)
Greedy and fast would hit different assertions on undef uses if all
registers in a class were reserved.
2024-12-16 10:56:45 +09:00
Matt Arsenault
61f99a1c75
RegAlloc: Do not fatal error if there are no registers in the alloc order (#119640)
Try to use DiagnosticInfo if every register in the class is reserved
by forcing assignment to a reserved register. Also reduces the number
of redundant errors emitted, particularly with fast.

This is still broken in the case of undef uses. There are additional
complications in greedy and fast, so leave it for a separate fix.
2024-12-16 10:52:49 +09:00
Matt Arsenault
bb18e49edb
RegAlloc: Use DiagnosticInfo to report register allocation failures (#119492)
Improve the non-fatal cases to use DiagnosticInfo, which will now
provide a location. The allocators attempt to report different errors
if it happens to see inline assembly is involved (this detection is
quite unreliable) using srcloc instead of dbgloc. For now, leave this
behavior unchanged. I think reporting the full location and context
function would be more useful.
2024-12-16 10:49:08 +09:00
Matt Arsenault
ea632e1b34
Reapply "DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm" (#119575) (#119634)
This reverts commit 40986feda8b1437ed475b144d5b9a208b008782a.

Reapply with fix to prevent temporary Twine from going out of scope.
2024-12-11 16:01:48 -08:00
Vitaly Buka
40986feda8
Revert "DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm" (#119575)
Reverts llvm/llvm-project#119485

Breaks builders, details in llvm/llvm-project#119485
2024-12-11 07:51:36 -08:00
Matt Arsenault
884f2ad6f9
DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm (#119485)
Currently LLVMContext::emitError emits any error as an "inline asm"
error which does not make any sense. InlineAsm appears to be special,
in that it uses a "LocCookie" from srcloc metadata, which looks like
a parallel mechanism to ordinary source line locations. This meant
that other types of failures had degraded source information reported
when available.

Introduce some new generic error types, and only use inline asm
in the appropriate contexts. The DiagnosticInfo types are still
a bit of a mess, and I'm not sure why DiagnosticInfoWithLocationBase
exists instead of just having an optional DiagnosticLocation in the
base class.

DK_Generic is for any error that derives from an IR level instruction,
and thus can pull debug locations directly from it. DK_GenericWithLoc
is functionally the generic codegen error, since it does not depend
on the IR and instead can construct a DiagnosticLocation from the
MI debug location.
2024-12-11 17:16:07 +09:00
Vitaly Buka
0281339159
Revert "[CodeGen] Use MachineInstr::{all_uses,all_defs} (NFC)" (#106451)
Reverts llvm/llvm-project#106404

Breaks:
https://lab.llvm.org/buildbot/#/builders/169/builds/2590
https://lab.llvm.org/buildbot/#/builders/164/builds/2454
2024-08-28 13:40:34 -07:00
Kazu Hirata
a4989cd603
[CodeGen] Use MachineInstr::{all_uses,all_defs} (NFC) (#106404) 2024-08-28 11:07:31 -07:00
Kazu Hirata
399d7cce37
[CodeGen] Use MachineInstr::all_defs (NFC) (#106017) 2024-08-26 07:22:17 -07:00
Pratyay Pande
3e806c827e
[NFC] Use references to avoid copying (#99863)
Modifying `auto` to `auto&` to avoid unnecessary copying
2024-08-09 20:33:05 +08:00
Christudasan Devadasan
15b41d207e
[CodeGen] change prototype of regalloc filter function (#93525)
[CodeGen] Change the prototype of regalloc filter function

Change the prototype of the filter function so that we can
filter not just by RegClass. We need to implement more
complicated filter based upon some other info associated
with each register.

Patch provided by: Gang Chen (gangc@amd.com)
2024-07-22 16:49:39 +05:30
paperchalice
8e9c6bfb50
[CodeGen][NewPM] Extract MachineFunctionProperties modification part to an RAII class (#94854)
Modify MachineFunctionProperties in PassModel makes `PassT P;
P.run(...);` not work properly. This is a necessary compromise.
2024-06-22 17:34:03 +08:00
Alexis Engelke
f4cf15d225
[RegAllocFast] Replace UsedInInstr with vector (#96323)
A SparseSet adds an avoidable layer of indirection and possibly looping
control flow. Avoid this overhead by using a vector to store
UsedInInstrs and PhysRegUses.

To avoid clearing the vector after every instruction, use a
monotonically increasing counter. The two maps are now merged and the
lowest bit indicates whether the use is relevant for the livethrough
handling code only.
2024-06-21 19:35:29 +02:00
Alexis Engelke
739a960567
[RegAlloc] Don't call always-true ShouldAllocClass (#96296)
Previously, there was at least one virtual function call for every
allocated register. The only users of this feature are AMDGPU and RISC-V
(RVV), other targets don't use this. To easily identify these cases,
change the default functor to nullptr and don't call it for every
allocated register.
2024-06-21 13:18:35 +02:00
Alexis Engelke
0ae6cfc599
[RegAllocFast] Handle single-vdef instrs faster (#96284)
On x86, many instructions have tied operands, so allocateInstruction
uses the more complex assignment strategy, which computes the assignment
order of virtual defs first. This involves iterating over all register
classes (or register aliases for physical defs) to compute the possible
number of defs per register class.

However, this information is only used for sorting virtual defs and
therefore not required when there's only one virtual def -- which is a
very common case. As iterating over all register classes/aliases is not
cheap, do this only when there's more than one virtual def.
2024-06-21 12:30:59 +02:00
Alexis Engelke
cba4dfdd2f [RegAllocFast] Use unsigned for operand indices
MachineInstr operand indices can be up 24 bits currently. Use unsigned
as consistent data type for operand indices instead of uint16_t.
2024-06-21 10:25:28 +00:00
paperchalice
1bc8b3258e
[NewPM][CodeGen] Port regallocfast to new pass manager (#94426)
This pull request port `regallocfast` to new pass manager. It exposes
the parameter `filter` to handle different register classes for AMDGPU.
IIUC AMDGPU need to allocate different register classes separately so it
need implement its own `--<reg-class>-regalloc`. Now users can use e.g.
`-passe=regallocfast<filter=sgpr>` to allocate specific register class.
The command line option `--regalloc-npm` is still in work progress, plan
to reuse the syntax of passes, e.g. use
`--regalloc-npm=regallocfast<filter=sgpr>,greedy<filter=vgpr>` to
replace `--sgpr-regalloc` and `--vgpr-regalloc`.
2024-06-07 12:22:42 +08:00
Jay Foad
63a5dc4aed
[CodeGen] Do not pass MF into MachineRegisterInfo methods. NFC. (#84770)
MachineRegisterInfo already knows the MF so there is no need to pass it
in as an argument.
2024-03-11 15:35:05 +00:00
HaohaiWen
536b043219
[RegAllocFast] Lazily initialize InstrPosIndexes for each MBB (#76275)
Most basic block do not need to query dominates. Defer initialization of
InstrPosIndexes to first query for each MBB.
2023-12-25 09:42:31 +08:00
Nikita Popov
d82eccc752 [RegAllocFast] Avoid duplicate hash lookup (NFC) 2023-12-22 16:52:20 +01:00
HaohaiWen
40ec791b15
[RegAllocFast] Refactor dominates algorithm for large basic block (#72250)
The original brute force dominates algorithm is O(n) complexity so it is
very slow for very large machine basic block which is very common with
O0. This patch added InstrPosIndexes to assign index for each
instruction and use it to determine dominance. The complexity is now
O(1).
2023-12-22 23:06:16 +08:00
Nick Desaulniers
935c6a2d8d
[RegAllocFast] NFC cleanups (#74860)
- use more range for
- avoid capturing lambda
- prefer Register type to unsigned
- remove braces around single statement if
2023-12-12 08:58:58 -08:00
HaohaiWen
a908920201
[NFC][CodeGen] clang-format RegAllocFast.cpp (#72199) 2023-11-14 12:57:02 +08:00
Elliot Goodrich
4d0f1e3282 [llvm] Remove SmallSet from MachineInstr.h
`MachineInstr.h` is a commonly included file and this includes
`llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is
used only in one place.

According to `ClangBuildAnalyzer` (run solely on building LLVM, no other
projects) the second most expensive template to instantiate is the
`SmallSet::insert` method used in the `inline` implementation in
`getUsedDebugRegs()`:

```
**** Templates that took longest to instantiate:
554239 ms: std::unordered_map<int, int> (2826 times, avg 196 ms)
521187 ms: llvm::SmallSet<llvm::Register, 4>::insert (930 times, avg 560
       ms)
...
```

By removing this method and putting its implementation in the one call
site we greatly reduce the template instantiation time and reduce the
number of includes.

When copying the implementation, I removed a check on `MO.getReg()` as
this is checked within `MO.isVirtual()`.

Differential Revision: https://reviews.llvm.org/D157720
2023-08-12 18:15:27 +01:00
Qi Hu
ddd7d35c6c [RegAlloc] Fix assertion failure caused by inline assembly
When inline assembly code requests more registers than available, the
MachineInstr::emitError function in the RegAllocFast pass emits an error
but doesn't stop the pass, and then the compiler crashes later with an
assertion failure. This commit, mimicking the RegAllocGreedy pass, assigns
a random physical register, and therefore avoids the crash after producing
the diagnostic. This problem has been observed for both rustc and clang,
while it doesn't occur in gcc.
2023-07-25 19:21:03 -04:00
Jay Foad
da7892f729 [MC] Use regunits instead of MCRegUnitIterator. NFC.
Differential Revision: https://reviews.llvm.org/D153122
2023-06-16 12:21:32 +01:00
Sergei Barannikov
aa2d0fbc30 [MC] Add MCRegisterInfo::regunits for iteration over register units
Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D152098
2023-06-16 05:39:50 +03:00
Jay Foad
5022fc2ad3 [CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.
Differential Revision: https://reviews.llvm.org/D151424
2023-06-01 19:17:34 +01:00
Gaëtan Bossu
c4a872badb FastRegAlloc: Fix implicit operands not rewritten
This patch fixes a potential crash due to RegAllocFast not rewriting virtual
registers. This essentially happens because of a call to
MachineInstr::addRegisterKilled() in the process of allocating a "killed" vreg.
The former can eventually delete implicit operands without RegAllocFast
noticing, leading to some operands being "skipped" and not rewritten to use
physical registers.

Note that I noticed this crash when working on a solution for tying a register
with one/multiple of its sub-registers within an instruction. (See problem
description here:
https://discourse.llvm.org/t/pass-to-tie-an-output-operand-to-a-subregister-of-an-input-operand/67184).

Aside from this fix, I believe there could be further improvements to the
RegAllocFast when it comes to instructions with multiple uses of a same virtual
register. You can see it in the added test where the implicit uses have been
re-written in a somewhat surprising way because of phase ordering. Ultimately,
when allocating vregs for an instruction, I believe we should iterate on the
vregs it uses (and then process all the operands that use this vregs), instead
of directly iterating on operands and somewhat assuming each operand uses a
different vreg. This would in the end be quite close to what
greedy+virtregrewriter does. If that makes sense, I would probably spin off
another patch (after I get more familiar with RegAllocFast).

Differential Revision: https://reviews.llvm.org/D145169
2023-05-16 09:49:20 +02:00
Alexis Engelke
1e743732e7 [RegAllocFast] Use uint16_t SparseT for LiveRegMap
For functions with very large numbers of live variables, lookups into
LiveRegMap previously detoriated to linear searches.

This slightly increases memory usage, but that is barely measurable.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D149330
2023-04-27 18:58:49 +02:00
Akshay Khadse
8bf7f86d79 Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148303
2023-04-17 16:32:46 +08:00
Matt Arsenault
7907fd4961 RegAllocFast: Fix dropping subreg indexes on unassigned subreg defs
This was assuming all register operands were assigned to physical registers.
This should ignore the operands which weren't assigned in this run.

Fixes #61134
2023-04-05 18:25:51 -04:00
Nick Desaulniers
9cec2b246e [RegAllocFast] insert additional spills along indirect edges of INLINEASM_BR
When generating spills (stores) for values produced by INLINEASM_BR
instructions, make sure to insert one spill per indirect target.
Otherwise the reload generated may load from a stack slot that has not
yet been stored to (resulting in a load of an uninitialized stack slot).

Link: https://github.com/llvm/llvm-project/issues/53562
Fixes: https://github.com/llvm/llvm-project/issues/60855

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D144907
2023-03-01 15:21:11 -08:00
Craig Topper
e72ca520bb [CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC
Use isPhysical/isVirtual methods.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D141715
2023-01-13 14:38:08 -08:00
Josh Stone
87f57f459e [RegAllocFast] Handle new debug values for spills
These new debug values get inserted after the place where the spill
happens, which means they won't be reached by the reverse traversal of
basic block instructions. This would crash or fail assertions if they
contained any virtual registers to be replaced. We can manually handle
the new debug values right away to resolve this.

Fixes https://github.com/llvm/llvm-project/issues/59172

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D139590
2023-01-05 20:41:11 -08:00
Christudasan Devadasan
b5efec4b27 [CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlot
With D134950, targets get notified when a virtual register is created and/or
cloned. Targets can do the needful with the delegate callback. AMDGPU propagates
the virtual register flags maintained in the target file itself. They are useful
to identify a certain type of machine operands while inserting spill stores and
reloads. Since RegAllocFast spills the physical register itself, there is no way
its virtual register can be mapped back to retrieve the flags. It can be solved
by passing the virtual register as an additional argument. This argument has no
use when the spill interfaces are called during the greedy allocator or even the
PrologEpilogInserter and can pass a null register in such cases.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138656
2022-12-17 11:55:34 +05:30