515 Commits

Author SHA1 Message Date
Rahul Joshi
1fdf02ad5a
[LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties (#140002)
Add per-property has<Prop>/set<Prop>/reset<Prop> functions to
MachineFunctionProperties.
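
As a rough illustration of the per-property accessor pattern described above, here is a minimal standalone sketch. `FunctionProperties`, its three properties, and the `PROPERTY_ACCESSORS` macro are hypothetical stand-ins chosen for this example; the real class in LLVM may be organized differently.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical stand-in for a property set; not the LLVM class itself.
class FunctionProperties {
public:
  enum class Property : std::size_t { IsSSA, NoPHIs, NoVRegs, Count };

  // Generic accessors keyed by an enum value.
  bool hasProperty(Property P) const { return Bits.test(index(P)); }
  FunctionProperties &set(Property P) {
    Bits.set(index(P));
    return *this;
  }
  FunctionProperties &reset(Property P) {
    Bits.reset(index(P));
    return *this;
  }

// One macro invocation generates has<Prop>/set<Prop>/reset<Prop>.
#define PROPERTY_ACCESSORS(Name)                                               \
  bool has##Name() const { return hasProperty(Property::Name); }               \
  FunctionProperties &set##Name() { return set(Property::Name); }              \
  FunctionProperties &reset##Name() { return reset(Property::Name); }

  PROPERTY_ACCESSORS(IsSSA)
  PROPERTY_ACCESSORS(NoPHIs)
  PROPERTY_ACCESSORS(NoVRegs)
#undef PROPERTY_ACCESSORS

private:
  static std::size_t index(Property P) { return static_cast<std::size_t>(P); }
  std::bitset<static_cast<std::size_t>(Property::Count)> Bits;
};
```

With accessors like these, a call site can write `Props.setNoVRegs()` or `Props.hasIsSSA()` instead of spelling out the enum value each time.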
2025-05-22 08:07:52 -07:00
Philip Reames
2bb2f8ab49
[CodeGen] Remove experimental deferred spilling from GreedyRegAlloc (#137850)
This experimental option was introduced in 2015 via commit 1192294, and
the target hook was added in 2020 via commit 99e865b6. There does not
appear to have ever been a use of this target hook in tree.

This code is complicating one of the most complicated and hard to
understand parts of our code base, and was an experiment introduced
nearly 10 years ago. Let's get rid of it.

Note that the idea described in the original patch is not necessarily a
bad one, and we might return to it someday.
2025-05-01 08:11:51 -07:00
weiguozhi
b25b51eb63
[InlineSpiller] Check rematerialization before folding operand (#134015)
The current implementation tries to fold the operand before
rematerialization because folding uses one fewer register. But if there
is a physical register available we can still rematerialize it without
causing high register pressure.

This patch performs this check to pick the better choice. We can then produce

    xorps %xmm1, %xmm1
    ucomiss %xmm1, %xmm0

instead of 

    ucomiss LCPI0_1(%rip), %xmm0
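
The ordering of the two options can be sketched as a small decision function. This is only an illustration under stated assumptions: `UseSite`, its three flags, and `chooseAction` are hypothetical names, and the real spiller works on machine instructions and live intervals rather than booleans.

```cpp
// Hypothetical outcomes for a use of a spilled value.
enum class SpillAction { Rematerialize, FoldMemoryOperand, Reload };

// Hypothetical summary of what is possible at one use of the spilled value.
struct UseSite {
  bool CanRematerialize;     // the defining instruction can be recomputed here
  bool CanFoldMemoryOperand; // the using instruction accepts a memory operand
  bool HasFreePhysReg;       // a physical register is available at this point
};

SpillAction chooseAction(const UseSite &U) {
  // New check: with a free physical register, rematerializing does not raise
  // register pressure, so prefer it over folding.
  if (U.CanRematerialize && U.HasFreePhysReg)
    return SpillAction::Rematerialize;
  // Otherwise keep the previous order: fold first, since it saves a register.
  if (U.CanFoldMemoryOperand)
    return SpillAction::FoldMemoryOperand;
  if (U.CanRematerialize)
    return SpillAction::Rematerialize;
  return SpillAction::Reload;
}
```

In the example from the message, the `xorps`/`ucomiss` pair corresponds to the first branch and the `ucomiss` with a constant-pool operand corresponds to the second.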
2025-04-28 09:52:03 -07:00
Abhishek Kaushik
5543d9ded7
[RegAlloc][NFC] Use std::move to avoid copy (#134533) 2025-04-10 14:45:02 +05:30
Kazu Hirata
e3a3f78f35
[CodeGen] Use llvm::append_range (NFC) (#133603) 2025-03-29 16:53:02 -07:00
Michael Maitland
00fabd21bc
[RISCV][RegAlloc] Add getCSRFirstUseCost for RISC-V (#131349)
This is based off of 63efd8e7e68bc.

The following table shows the percent change to the dynamic instruction
count when the function in this patch returns 0 (default) versus other
values.

| benchmark | % speedup 1 over 0 | % speedup 4 over 0 | % speedup 16 over 0 | % speedup 64 over 0 | % speedup 128 over 0 |
| --------------- | ---------------------- | --------------------- | --------------------- | -------------------- | -------------------- |
| 500.perlbench_r | 0.001018570165 | 0.001049508358 | 0.001001106529 | 0.03382582818 | 0.03395354577 |
| 502.gcc_r | 0.02850551412 | 0.02170512371 | 0.01453021263 | 0.06011008637 | 0.1215691521 |
| 505.mcf_r | -0.00009506373338 | -0.00009090057642 | -0.0000860991497 | -0.00005027849766 | 0.00001251173791 |
| 520.omnetpp_r | 0.2958940288 | 0.2959715925 | 0.2961141505 | 0.2959823497 | 0.2963124341 |
| 523.xalancbmk_r | -0.0327074721 | -0.01037021046 | -0.3226810542 | 0.02127133714 | 0.02765388389 |
| 525.x264_r | 0.0000001381714403 | -0.00000007041540345 | -0.00000002156399465 | 0.0000002108993364 | 0.0000002463382874 |
| 531.deepsjeng_r | 0.00000000339777238 | 0.000000003874652714 | 0.000000003636212547 | 0.000000003874652714 | 0.000000003159332213 |
| 541.leela_r | 0.0009186059953 | -0.000424159199 | 0.0004984456879 | 0.274948447 | 0.8135521414 |
| 557.xz_r | -0.000000003547118854 | -0.00004896449559 | -0.00004910691576 | -0.0000491109983 | -0.00004895599589 |
| geomean | 0.03265937388 | 0.03424232324 | -0.00107917442 | 0.07629116165 | 0.1439913192 |

The following table shows the percent change to the runtime when the
function in this patch returns 0 (default) versus other values.

| benchmark | % speedup 1 over 0 | % speedup 4 over 0 | % speedup 16 over 0 | % speedup 64 over 0 | % speedup 128 over 0 |
| --------------- | ------------------ | ------------------ | ------------------- | ------------------- | ------------------- |
| 500.perlbench_r | 0.1722356761 | 0.2269681109 | 0.2596825578 | 0.361573851 | 1.15041305 |
| 502.gcc_r | -0.548415855 | -0.06187002799 | -0.5553684674 | -0.8876686237 | -0.4668665535 |
| 505.mcf_r | -0.8786414258 | -0.4150938441 | -1.035517726 | -0.1860770377 | -0.01904825648 |
| 520.omnetpp_r | 0.4130256072 | 0.6595976188 | 0.897332171 | 0.6252625622 | 0.3869467278 |
| 523.xalancbmk_r | 1.318132014 | -0.003927574 | 1.025962975 | 1.090320253 | -0.789206202 |
| 525.x264_r | -0.03112871796 | -0.00167557587 | 0.06932423155 | -0.1919840015 | -0.1203585732 |
| 531.deepsjeng_r | -0.259516072 | -0.01973455652 | -0.2723227894 | -0.005417022257 | -0.02222388177 |
| 541.leela_r | -0.3497178495 | -0.3510447393 | 0.1274508001 | 0.6485542452 | 0.2880651727 |
| 557.xz_r | 0.7683565263 | -0.2197509447 | -0.0431183874 | 0.07518130872 | 0.5236853039 |
| geomean | 0.06506952742 | -0.0211865386 | 0.05072694648 | 0.1684530637 | 0.1020533557 |

I chose to set the value to 5 on RISC-V because it improves both the
dynamic IC and the runtime, showed good results empirically, and had an
effect similar to that of higher values.

I looked at some diffs and it seems like this patch leads to two things:
1. Less spilling -- not spilling the CSR led to better register
allocation and helped us avoid spills down the line
2. Avoid spilling CSR but spill more on paths that static heuristics
estimate as cold.
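
To make the trade-off concrete, here is a minimal sketch of how a first-use cost for a callee-saved register (CSR) could enter the decision. Everything in it is an assumption made for illustration: `Candidate`, `firstUseCost`, and `preferUnusedCSR` are hypothetical names, and the plain integer costs ignore whatever units and frequency scaling the allocator actually uses.

```cpp
#include <cstdint>

// Hypothetical description of a candidate physical register.
struct Candidate {
  bool IsCalleeSaved;     // must be saved and restored if the function uses it
  bool AlreadyUsedInFunc; // the save/restore has already been paid for
};

// Extra cost charged for picking this register. Only the first use of a
// callee-saved register in a function pays the save/restore price.
std::uint64_t firstUseCost(const Candidate &C, std::uint64_t CSRFirstUseCost) {
  return (C.IsCalleeSaved && !C.AlreadyUsedInFunc) ? CSRFirstUseCost : 0;
}

// Take the unused CSR only when the alternative (for example spilling or
// splitting the live range) costs at least as much as the first use.
bool preferUnusedCSR(std::uint64_t CostOfAvoiding,
                     std::uint64_t CSRFirstUseCost) {
  return CostOfAvoiding >= CSRFirstUseCost;
}
```

Under this model a cost of 0 always takes the CSR, while a nonzero cost pushes cheap live ranges toward the alternative, which would fit the observation above about spilling more on paths estimated as cold.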
2025-03-20 15:20:04 -04:00
Craig Topper
13cce8c0bc [CodeGen] Use Register::id() to avoid implicit cast. NFC 2025-03-02 22:33:26 -08:00
Matt Arsenault
1a114fa302
RegAlloc: Use new approach to handling failed allocations (#128469)
This fixes an assert after allocation failure.

Rather than collecting failed virtual registers and hacking
on the uses after the fact, directly hack on the uses and rewrite
the registers to the dummy assignment immediately.

Previously we were bypassing LiveRegMatrix and directly assigning
in the VirtRegMap. This resulted in inconsistencies where illegal
overlapping assignments were missing. Rather than try to hack in
some system to manage these in LiveRegMatrix (i.e. hacking around
cases with invalid iterators), avoid this by directly using the
physreg. This should also allow removal of special casing in
virtregrewriter for failed allocations.
2025-02-26 15:34:47 +07:00
Matt Arsenault
e160c35c9e
Reapply "RegAlloc: Fix verifier error after failed allocation (#119690)" (#128400)
Reapply "RegAlloc: Fix verifier error after failed allocation (#119690)"

This reverts commit 0c50054820799578be8f62b6fd2cc3fbc751c01e.

Reapply with more fixes to avoid expensive_checks failures. Make sure to
call splitSeparateComponents after shrinkToUses, and update the VirtRegMap
with the split registers. Also set undef on all physical register aliases to
the assigned register.

Move physreg handling. Not sure if necessary

Remove intervals from regunits. Not sure if necessary
2025-02-26 15:31:48 +07:00
Akshat Oke
fe13cb985c
[CodeGen][NewPM] Port RegAllocGreedy to NPM (#119540)
Leaving out NPM command line support for the next patch.
2025-02-26 12:11:22 +05:30
Matt Arsenault
b66ec64b5b
RegAllocGreedy: Remove unnecessary null register class check (#128487) 2025-02-24 22:56:54 +07:00
Matt Arsenault
3532651b6f RegAllocGreedy: Add braces 2025-02-24 17:08:42 +07:00
Craig Topper
228dbd254a [RegAllocGreedy] Use MCRegister instead of Register for functions that return a physical register.
The callers of these functions return the value as an MCRegister
so this removes some casts from Register to MCRegister.
2025-02-22 21:39:25 -08:00
Craig Topper
0bd66c4194 [RegAllocGreedy] Remove unnecessary conversion from MCRegister to Register. NFC 2025-02-22 16:20:19 -08:00
Craig Topper
6fe780ce63 [RegAllocGreedy] Use Register() instead of 0 for invalid Register. NFC 2025-02-22 16:20:19 -08:00
Matt Arsenault
0c50054820 Revert "RegAlloc: Fix verifier error after failed allocation (#119690)"
This reverts commit 34167f99668ce4d4d6a1fb88453a8d5b56d16ed5.

Different set of verifier errors appears after other regalloc failure
tests with EXPENSIVE_CHECKS.
2025-02-22 00:23:21 +07:00
Matt Arsenault
34167f9966
RegAlloc: Fix verifier error after failed allocation (#119690)
In some cases after reporting an allocation failure, this would fail
the verifier. It picked the first allocatable register and assigned it,
but didn't update the liveness appropriately. When VirtRegRewriter
relied on the liveness to set kill flags, it would incorrectly add
kill flags if there was another overlapping kill of the virtual
register.

We can't properly assign the register to an overlapping range, so
break the liveness of the failing register (and any other interfering
registers) instead. Give the virtual register dummy liveness by
effectively deleting all the uses by setting them to undef.

The edge case not tested here which I'm worried about is if the read
of the register is a def of a subregister. I've been unable to come up
with a test where this occurs.

https://reviews.llvm.org/D122616
2025-02-21 22:11:51 +07:00
Akshat Oke
557628dbe6
[CodeGen][NewPM] Port RegAllocPriorityAdvisor analysis to NPM (#118462)
Similar to #117309.

The advisor and logger are accessed through the provider, which is
served by the new PM. Legacy PM forwards calls to the provider.
New PM is a machine function analysis that lazily initializes the
provider.
2025-02-20 09:35:49 +05:30
Akshat Oke
519b53e65e
[CodeGen][NewPM] Port RegAllocEvictionAdvisor analysis to NPM (#117309)
Legacy pass used to provide the advisor, so this extracts that logic
into a provider class used by both analysis passes.

All three (Default, Release, Development) legacy passes
`*AdvisorAnalysis` are basically renamed to `*AdvisorProvider`, so the
actual legacy wrapper passes are `*AdvisorAnalysisLegacy`.

There is only one NPM analysis `RegAllocEvictionAnalysis` that switches
between the three providers in the `::run` method, to be cached by the
NPM.

Also adds `RequireAnalysis<RegAllocEvictionAnalysis>` to the optimized
target reg alloc codegen builder.
2025-02-18 18:55:06 +07:00
Matt Arsenault
43780f4f92 RegAllocGreedy: Use Register type 2025-02-13 20:49:27 +07:00
Akshat Oke
7b60e03d73
Reland "CodeGen][NewPM] Port MachineScheduler to NPM. (#125703)" (#126684)
`RegisterClassInfo` was supposed to be kept alive between pass runs,
which wasn't being done leading to recomputations increasing the compile
time.

Now the Impl class is a member of the legacy and new passes so that it
is not reconstructed on every pass run.
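
The effect of where the implementation object lives can be shown with a small standalone sketch. The types below (`ClassInfoCache`, `SchedulerImpl`, `RebuildingPass`, `PersistentPass`) are hypothetical stand-ins, not the LLVM classes; the counter only models an expensive computation that is worth caching.

```cpp
// Hypothetical stand-in for expensive information that can be cached.
struct ClassInfoCache {
  unsigned Computations = 0;
  bool Valid = false;
  void runOnFunction() {
    if (!Valid) { // recompute only when nothing is cached yet
      ++Computations;
      Valid = true;
    }
  }
};

// Hypothetical implementation object shared by both pass wrappers.
struct SchedulerImpl {
  ClassInfoCache RegClassInfo;
  void run() { RegClassInfo.runOnFunction(); }
};

// Builds a fresh implementation on every run, so the cache is always cold.
struct RebuildingPass {
  unsigned TotalComputations = 0;
  void run() {
    SchedulerImpl Impl;
    Impl.run();
    TotalComputations += Impl.RegClassInfo.Computations;
  }
};

// Keeps the implementation as a member, so the cache survives between runs.
struct PersistentPass {
  SchedulerImpl Impl;
  void run() { Impl.run(); }
};
```

Running each wrapper three times performs three computations in the rebuilding variant and one in the persistent variant.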

---------

Co-authored-by: Christudasan Devadasan <christudasan.devadasan@amd.com>
2025-02-12 18:54:39 +05:30
Akshat Oke
564b9b7f4d
Revert "CodeGen][NewPM] Port MachineScheduler to NPM. (#125703)" (#126268)
This reverts commit 5aa4979c47255770cac7b557f3e4a980d0131d69 while I
investigate what's causing the compile-time regression.
2025-02-08 15:36:48 +05:30
Christudasan Devadasan
5aa4979c47
CodeGen][NewPM] Port MachineScheduler to NPM. (#125703) 2025-02-05 12:17:59 +05:30
Akshat Oke
fe9a97ca38
[CodeGen][NewPM] Port RegisterCoalescer to NPM (#124698) 2025-02-03 13:41:51 +07:00
Craig Topper
b7eee2c3fe [CodeGen] Remove some implicit conversions of MCRegister to unsigned by using(). NFC
Many of these are indexing BitVectors or something where we can't
use MCRegister and need the register number.
2025-01-19 13:18:04 -08:00
Akshat Oke
4f96fb5fb3
Reapply "Spiller: Detach legacy pass and supply analyses instead (#119181)" (#122665)
Makes Inline Spiller amenable to the new PM.

This reapplies commit a531800344dc54e9c197a13b22e013f919f3f5e1 reverted
because of two unused private members reported on sanitizer bots.
2025-01-13 14:14:13 +05:30
Akshat Oke
089555095b
Revert "Spiller: Detach legacy pass and supply analyses instead (#119… (#122426)
…181)"

This reverts commit a531800344dc54e9c197a13b22e013f919f3f5e1.
2025-01-10 12:23:07 +05:30
Akshat Oke
a531800344
Spiller: Detach legacy pass and supply analyses instead (#119181)
Makes Inline Spiller amenable to the new PM.
2025-01-10 11:46:56 +05:30
Ryan Mansfield
67efbd0bf1
[LLVM] Fix various cl::desc typos and whitespace issues (NFC) (#121955) 2025-01-08 11:07:23 +01:00
Matt Arsenault
93220e7e06
RegAllocGreedy: Fix use after free during last chance recoloring (#120697)
Last chance recoloring can delete the current fixed interval
during recursive assignment of interfering live intervals. Check
if the virtual register value was assigned before attempting the
unassignment, as is done in other scenarios. This relies on the fact
that we do not recycle virtual register numbers.

I have only seen this occur in error situations where the allocation
will fail, but I think this can theoretically happen in working
allocations.

This feels very brute force, but I've spent over a week debugging
this and this is what works without any lit regressions. The surprising
piece to me was that unspillable live ranges may be spilled, and
a number of tests rely on optimizations occurring on them. My other
attempts to fix this mostly revolved around not identifying unspillable
live ranges as snippet copies. I've also discovered we're making some
unproductive live range splits with subranges. If we avoid such splits,
some of the unspillable copies disappear but mandating that be precise
to fix a use after free doesn't sound right.
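
The guard described in the first paragraph can be sketched as follows. This is a simplified model, not the allocator's code: `Assignments` and `rollback` are hypothetical, and a map keyed by virtual register number stands in for the real assignment tracking. Keying by number rather than by pointer is what makes the check safe when an interval has been deleted, given that numbers are never recycled.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical assignment table: virtual register number -> physical register.
struct Assignments {
  std::unordered_map<unsigned, unsigned> PhysRegOf;

  bool hasPhys(unsigned VReg) const { return PhysRegOf.count(VReg) != 0; }
  void assign(unsigned VReg, unsigned Phys) { PhysRegOf[VReg] = Phys; }
  void unassign(unsigned VReg) { PhysRegOf.erase(VReg); }
};

// Undo tentative assignments. Returns how many registers were unassigned.
unsigned rollback(Assignments &A, const std::vector<unsigned> &FixedVRegs) {
  unsigned Unassigned = 0;
  for (unsigned VReg : FixedVRegs) {
    // The interval for this register may have been deleted during recursive
    // assignment; only touch registers that still have an assignment.
    if (!A.hasPhys(VReg))
      continue;
    A.unassign(VReg);
    ++Unassigned;
  }
  return Unassigned;
}
```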
2025-01-06 23:12:55 +07:00
Matt Arsenault
11e482c4a3
RegAllocGreedy: Add dummy priority advisor for writing MIR tests (#121207)
I regularly struggle reproducing failures in greedy due to changes
in priority when resuming the allocation from MIR vs. a complete
compilation starting at IR. That is, the fix in
e0919b189bf2df4f97f22ba40260ab5153988b14 did not really fix the
problem of the instruction distance mattering.

Add a way to bypass all of the priority heuristics for MIR tests,
by prioritizing only by virtual register number. Could also
give this a more specific name, like PrioritizeLowVirtRegNumber
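
A priority that depends only on the virtual register number can be sketched like this. The function names are hypothetical, and the choice that lower numbers come first is an assumption based on the `PrioritizeLowVirtRegNumber` name suggested above.

```cpp
#include <queue>
#include <utility>
#include <vector>

// Priority derived only from the virtual register index. Bitwise negation
// makes a lower index map to a higher priority value.
unsigned dummyPriority(unsigned VirtRegIndex) { return ~VirtRegIndex; }

// Order in which a max-priority queue would hand out the registers.
std::vector<unsigned> allocationOrder(const std::vector<unsigned> &Indices) {
  std::priority_queue<std::pair<unsigned, unsigned>> Queue; // max-heap
  for (unsigned Idx : Indices)
    Queue.push({dummyPriority(Idx), Idx});

  std::vector<unsigned> Order;
  while (!Queue.empty()) {
    Order.push_back(Queue.top().second);
    Queue.pop();
  }
  return Order;
}
```

Because nothing else feeds into the priority, the order is the same whether allocation starts from IR or resumes from MIR, which is the property the message is after.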
2025-01-02 23:04:44 +07:00
Matt Arsenault
e2cabd715b RegAllocGreedy: Fix comment typo 2024-12-17 12:46:04 +07:00
Matt Arsenault
818bffcb1c
RegAlloc: Fix failure on undef use when all registers are reserved (#119647)
Greedy and fast would hit different assertions on undef uses if all
registers in a class were reserved.
2024-12-16 10:56:45 +09:00
Akshat Oke
2c7ece2e8c
[CodeGen][NewPM] Port LiveStacks analysis to NPM (#118778) 2024-12-06 15:16:07 +05:30
Akshat Oke
d9b4bdbff5
[CodeGen][NewPM] Port LiveDebugVariables to NPM (#115468)
The existing analysis was already a pimpl wrapper.

I have extracted legacy pass logic to a LDVImpl wrapper named
`LiveDebugVariables` which is the analysis::Result now. This controls
whether to activate the LDV (depending on `-live-debug-variables` and
DIsubprogram) itself.

The legacy and new analysis only construct the LiveDebugVariables.

VirtRegRewriter will test this.
2024-12-04 14:31:34 +05:30
Akshat Oke
b68340c835
[CodeGen][NewPM] Port SpillPlacement analysis to NPM (#116618) 2024-11-29 16:55:40 +05:30
Akshat Oke
cac13606c2
[CodeGen][NewPM] Port EdgeBundles analysis to NPM (#116616) 2024-11-22 16:51:50 +05:30
Kazu Hirata
735ab61ac8
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-11-12 23:15:06 -08:00
Akshat Oke
4e32d7236b
[NewPM][CodeGen] Port LiveRegMatrix to NPM (#109938) 2024-10-22 15:28:04 +05:30
Akshat Oke
93802815ab
[NewPM][CodeGen] Port VirtRegMap to NPM (#109936) 2024-10-22 15:15:56 +05:30
Bevin Hansson
1a65d95d00
[CodeGen][RAGreedy] Inform LiveDebugVariables about snippets spilled by InlineSpiller. (#109962)
RAGreedy invokes InlineSpiller to spill a particular virtreg inline.
When the spiller does this, it also identifies small, adjacent liveranges called
snippets. These are also spilled or rematerialized in the process.

However, the spiller does not inform RA that it has spilled these regs.
This means that debug variable locations referencing these regs/ranges
are lost.

Mark any spilled regs which do not have a stack slot assigned to them as
allocated to the slot being spilled to, to tell LDV that those regs are
located in that slot, even though the regs might no longer exist in the
program after regalloc is finished. Also, inform RA about all of the
regs which were replaced (spilled or rematted), not just the one that was
requested so that it can properly manage the ranges of the debug vars.
2024-10-02 10:29:56 +02:00
Matt Arsenault
71ca9fcb8d
llvm-reduce: Don't print verifier failed machine functions (#109673)
This produces far too much terminal output, particularly for the
instruction reduction. Since it doesn't consider the liveness of
the instructions it's deleting, it produces quite a lot of verifier
errors.
2024-09-24 22:32:53 +04:00
Christudasan Devadasan
15b41d207e
[CodeGen] change prototype of regalloc filter function (#93525)
[CodeGen] Change the prototype of regalloc filter function

Change the prototype of the filter function so that we can
filter not just by RegClass. We need to implement more
complicated filter based upon some other info associated
with each register.

Patch provided by: Gang Chen (gangc@amd.com)
2024-07-22 16:49:39 +05:30
Kazu Hirata
66cd2e0f9a
[CodeGen] Use range-based for loops (NFC) (#98706) 2024-07-13 13:29:47 -07:00
paperchalice
099899961c
[CodeGen][NewPM] Port machine-block-freq to new pass manager (#98317)
- Add `MachineBlockFrequencyAnalysis`.
- Add `MachineBlockFrequencyPrinterPass`.
- Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager.
- `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new
pass manager migration.
2024-07-12 15:45:01 +08:00
paperchalice
abde52aa66
[CodeGen][NewPM] Port LiveIntervals to new pass manager (#98118)
- Add `LiveIntervalsAnalysis`.
- Add `LiveIntervalsPrinterPass`.
- Use `LiveIntervalsWrapperPass` in legacy pass manager.
- Use `std::unique_ptr` instead of raw pointer for `LICalc`, so
destructor and default move constructor can handle it correctly.

This would be the last analysis required by `PHIElimination`.
2024-07-10 19:34:48 +08:00
paperchalice
4010f894a1
[CodeGen][NewPM] Port SlotIndexes to new pass manager (#97941)
- Add `SlotIndexesAnalysis`.
- Add `SlotIndexesPrinterPass`.
- Use `SlotIndexesWrapperPass` in legacy pass.
2024-07-09 12:09:11 +08:00
paperchalice
79d0de2ac3
[CodeGen][NewPM] Port machine-loops to new pass manager (#97793)
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
2024-07-09 09:11:18 +08:00
Alexis Engelke
739a960567
[RegAlloc] Don't call always-true ShouldAllocClass (#96296)
Previously, there was at least one virtual function call for every
allocated register. The only users of this feature are AMDGPU and RISC-V
(RVV); other targets don't use it. To easily identify these cases,
change the default functor to nullptr and don't call it for every
allocated register.
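
The idea of an empty default filter can be sketched with `std::function`. The `RegFilter` alias, the `Allocator` struct, and the register-class-ID parameter are hypothetical and chosen only for this example; the actual filter signature in LLVM differs.

```cpp
#include <functional>
#include <utility>

// Hypothetical filter type: decides whether a register class is allocated.
using RegFilter = std::function<bool(unsigned RegClassID)>;

struct Allocator {
  RegFilter ShouldAllocate; // empty by default, meaning "allocate everything"

  explicit Allocator(RegFilter F = nullptr) : ShouldAllocate(std::move(F)) {}

  bool shouldAllocate(unsigned RegClassID) const {
    // With no filter installed, skip the indirect call entirely.
    return !ShouldAllocate || ShouldAllocate(RegClassID);
  }
};
```

Targets that do not install a filter then pay only a null check per register, while targets that do install one keep the previous behaviour.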
2024-06-21 13:18:35 +02:00
paperchalice
837dc542b1
[CodeGen][NewPM] Split MachineDominatorTree into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
2024-06-11 21:27:14 +08:00