This fixes an assert after allocation failure.
Rather than collecting the failed virtual registers and fixing up their
uses after the fact, hack on the uses directly and rewrite the registers
to the dummy assignment immediately.
Previously we were bypassing LiveRegMatrix and directly assigning
in the VirtRegMap. This resulted in inconsistencies, since the illegal
overlapping assignments were missing from the LiveRegMatrix. Rather than
trying to hack in some system to manage these in LiveRegMatrix (i.e.
hacking around cases with invalid iterators), avoid this by directly
using the physreg. This should also allow removal of the special casing
in VirtRegRewriter for failed allocations.
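As a rough sketch of the idea (not the actual patch; the helper name is
invented), rewriting the uses amounts to walking the operands of the failed
vreg and pointing them at the chosen physreg:
```
#include "llvm/ADT/STLExtras.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;

// Sketch: instead of recording a bogus assignment in VirtRegMap, rewrite
// every operand of the failed virtual register to the dummy physical register
// right away, marking reads undef so no liveness is implied for them.
static void rewriteToDummyPhysReg(MachineRegisterInfo &MRI, Register VReg,
                                  MCRegister PhysReg) {
  // make_early_inc_range: changing the register moves the operand off VReg's
  // use/def list, which would otherwise invalidate the iteration.
  for (MachineOperand &MO : make_early_inc_range(MRI.reg_operands(VReg))) {
    if (MO.readsReg())
      MO.setIsUndef(true);
    MO.setReg(PhysReg);
  }
}
```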
Reapply "RegAlloc: Fix verifier error after failed allocation (#119690)"
This reverts commit 0c50054820799578be8f62b6fd2cc3fbc751c01e.
Reapply with more fixes to avoid expensive_checks failures. Make sure to
call splitSeparateComponents after shrinkToUses, and update the VirtRegMap
with the split registers. Also set undef on all physical register aliases to
the assigned register.
Move physreg handling. Not sure if necessary
Remove intervals from regunits. Not sure if necessary
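A sketch of the ordering those fixes describe (simplified, with an assumed
helper; the physreg-alias undef part is omitted):
```
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/LiveIntervals.h"
#include "llvm/CodeGen/VirtRegMap.h"

using namespace llvm;

// Sketch: shrink the failing interval to its remaining uses first, then split
// any now-disconnected components and record each resulting vreg in the
// VirtRegMap so the rewriter sees a consistent assignment.
static void shrinkAndSplit(LiveIntervals &LIS, VirtRegMap &VRM,
                           LiveInterval &LI, MCRegister PhysReg) {
  LIS.shrinkToUses(&LI);

  SmallVector<LiveInterval *, 8> SplitLIs;
  LIS.splitSeparateComponents(LI, SplitLIs);

  VRM.grow(); // splitting may have created new virtual registers
  for (LiveInterval *SplitLI : SplitLIs)
    VRM.assignVirt2Phys(SplitLI->reg(), PhysReg);
}
```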
This reverts commit 34167f99668ce4d4d6a1fb88453a8d5b56d16ed5.
A different set of verifier errors appears after other regalloc failure
tests with EXPENSIVE_CHECKS.
In some cases after reporting an allocation failure, this would fail
the verifier. It picked the first allocatable register and assigned it,
but didn't update the liveness appropriately. When VirtRegRewriter
relied on the liveness to set kill flags, it would incorrectly add
kill flags if there was another overlapping kill of the virtual
register.
We can't properly assign the register to an overlapping range, so
break the liveness of the failing register (and any other interfering
registers) instead. Give the virtual register dummy liveness by
effectively deleting all the uses by setting them to undef.
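A minimal sketch of that "dummy liveness" trick (illustrative helper, not the
exact change):
```
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;

// Sketch: mark every read of the failing virtual register undef so that
// VirtRegRewriter no longer derives kill flags from a live range we could
// not honor. Interfering registers would be handled the same way.
static void dropUsesToUndef(MachineRegisterInfo &MRI, Register VReg) {
  for (MachineOperand &MO : MRI.use_operands(VReg))
    if (MO.readsReg())
      MO.setIsUndef(true);
}
```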
The edge case not tested here which I'm worried about is if the read
of the register is a def of a subregister. I've been unable to come up
with a test where this occurs.
https://reviews.llvm.org/D122616
Similar to #117309.
The advisor and logger are accessed through the provider, which is
served by the new PM. Legacy PM forwards calls to the provider.
New PM is a machine function analysis that lazily initializes the
provider.
Legacy pass used to provide the advisor, so this extracts that logic
into a provider class used by both analysis passes.
All three (Default, Release, Development) legacy passes
`*AdvisorAnalysis` are basically renamed to `*AdvisorProvider`, so the
actual legacy wrapper passes are `*AdvisorAnalysisLegacy`.
There is only one NPM analysis `RegAllocEvictionAnalysis` that switches
between the three providers in the `::run` method, to be cached by the
NPM.
Also adds `RequireAnalysis<RegAllocEvictionAnalysis>` to the optimized
target reg alloc codegen builder.
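The rough shape under the new PM looks something like this (a sketch only:
the class name is suffixed to make clear it is illustrative, and the provider
type is only a stand-in, not the exact upstream declarations):
```
#include "llvm/CodeGen/MachinePassManager.h"
#include "llvm/IR/PassManager.h"
#include <memory>

using namespace llvm;

struct RegAllocEvictionAdvisorProvider; // stand-in for the real provider

// Sketch of a machine function analysis whose result is the provider; run()
// selects one of the Default/Release/Development flavours once, and the NPM
// caches the returned result for later passes (e.g. greedy) to query.
class RegAllocEvictionAnalysisSketch
    : public AnalysisInfoMixin<RegAllocEvictionAnalysisSketch> {
  friend AnalysisInfoMixin<RegAllocEvictionAnalysisSketch>;
  static AnalysisKey Key;

public:
  using Result = std::unique_ptr<RegAllocEvictionAdvisorProvider>;
  Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
};
```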
`RegisterClassInfo` was supposed to be kept alive between pass runs,
but that wasn't being done, leading to recomputations that increased the
compile time.
Now the Impl class is a member of the legacy and new passes so that it
is not reconstructed on every pass run.
---------
Co-authored-by: Christudasan Devadasan <christudasan.devadasan@amd.com>
Makes InlineSpiller amenable to the new PM.
This reapplies commit a531800344dc54e9c197a13b22e013f919f3f5e1 reverted
because of two unused private members reported on sanitizer bots.
Last chance recoloring can delete the current fixed interval
during recursive assignment of interfering live intervals. Check
if the virtual register value was assigned before attempting the
unassignment, as is done in other scenarios. This relies on the fact
that we do not recycle virtual register numbers.
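In code, the guard is roughly of this shape (a simplified sketch, not the
exact diff):
```
#include "llvm/CodeGen/LiveRegMatrix.h"
#include "llvm/CodeGen/VirtRegMap.h"

using namespace llvm;

// Sketch: before unassigning an interfering interval during last chance
// recoloring, check that it still has a physical register; the recursive
// assignment may already have deleted it. This relies on virtual register
// numbers never being recycled.
static void unassignIfStillAssigned(const VirtRegMap &VRM,
                                    LiveRegMatrix &Matrix,
                                    const LiveInterval &Intf) {
  if (VRM.hasPhys(Intf.reg()))
    Matrix.unassign(Intf);
}
```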
I have only seen this occur in error situations where the allocation
will fail, but I think this can theoretically happen in working
allocations.
This feels very brute force, but I've spent over a week debugging
this and this is what works without any lit regressions. The surprising
piece to me was that unspillable live ranges may be spilled, and
a number of tests rely on optimizations occurring on them. My other
attempts to fix this mostly revolved around not identifying unspillable
live ranges as snippet copies. I've also discovered we're making some
unproductive live range splits with subranges. If we avoid such splits,
some of the unspillable copies disappear but mandating that be precise
to fix a use after free doesn't sound right.
I regularly struggle reproducing failures in greedy due to changes
in priority when resuming the allocation from MIR vs. a complete
compilation starting at IR. That is, the fix in
e0919b189bf2df4f97f22ba40260ab5153988b14 did not really fix the
problem of the instruction distance mattering.
Add a way to bypass all of the priority heuristics for MIR tests,
by prioritizing only by virtual register number. Could also
give this a more specific name, like PrioritizeLowVirtRegNumber.
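A minimal sketch of such a bypass (the helper name and the inverted-index
trick are my own illustration):
```
#include "llvm/CodeGen/Register.h"
#include <cassert>

using namespace llvm;

// Sketch: with the test-only priority mode enabled, ignore all heuristics and
// enqueue purely by virtual register number, so lower-numbered vregs are
// allocated first regardless of instruction distances.
static unsigned testOnlyPriority(Register Reg) {
  assert(Reg.isVirtual() && "expected a virtual register");
  // Invert the index so that a lower vreg number yields a higher priority key.
  return ~Register::virtReg2Index(Reg);
}
```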
The existing analysis was already a pimpl wrapper.
I have extracted the legacy pass logic into an LDVImpl wrapper named
`LiveDebugVariables`, which is now the analysis `Result`. This wrapper
itself controls whether to activate LDV (depending on
`-live-debug-variables` and the DISubprogram).
The legacy and new analyses only construct the LiveDebugVariables.
VirtRegRewriter will test this.
RAGreedy invokes InlineSpiller to spill a particular virtreg inline.
When the spiller does this, it also identifies small, adjacent live ranges called
snippets. These are also spilled or rematerialized in the process.
However, the spiller does not inform RA that it has spilled these regs.
This means that debug variable locations referencing these regs/ranges
are lost.
Mark any spilled regs which do not have a stack slot assigned to them as
allocated to the slot being spilled to, to tell LDV that those regs are
located in that slot, even though the regs might no longer exist in the
program after regalloc is finished. Also, inform RA about all of the
regs which were replaced (spilled or rematted), not just the one that was
requested so that it can properly manage the ranges of the debug vars.
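A sketch of that bookkeeping (simplified, with an assumed helper; the real
change lives in the spiller/RA interface):
```
#include "llvm/ADT/ArrayRef.h"
#include "llvm/CodeGen/VirtRegMap.h"

using namespace llvm;

// Sketch: every register the spiller replaced (spilled or rematerialized)
// that has no stack slot yet is recorded as living in the slot just spilled
// to, so LiveDebugVariables can still give its debug users a location even if
// the register itself no longer exists after regalloc.
static void recordReplacedRegs(VirtRegMap &VRM, ArrayRef<Register> Replaced,
                               int StackSlot) {
  for (Register R : Replaced)
    if (VRM.getStackSlot(R) == VirtRegMap::NO_STACK_SLOT)
      VRM.assignVirt2StackSlot(R, StackSlot);
}
```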
This produces far too much terminal output, particularly for the
instruction reduction. Since it doesn't consider the liveness of
the instructions it's deleting, it produces quite a lot of verifier
errors.
[CodeGen] Change the prototype of regalloc filter function
Change the prototype of the filter function so that we can
filter not just by RegClass. We need to implement more
complicated filters based on some other info associated
with each register.
Patch provided by: Gang Chen (gangc@amd.com)
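For context, the shapes of the old and new filter types are roughly as
follows (a sketch with renamed aliases; see RegAllocCommon.h for the exact
declarations):
```
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/Register.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"
#include <functional>

// Old filter: could only decide per register class.
using RegClassFilterSketch =
    std::function<bool(const llvm::TargetRegisterInfo &TRI,
                       const llvm::TargetRegisterClass &RC)>;

// New filter: receives the concrete register (plus TRI/MRI), so targets such
// as AMDGPU and RISC-V (RVV) can filter on per-register information.
using RegAllocFilterSketch =
    std::function<bool(const llvm::TargetRegisterInfo &TRI,
                       const llvm::MachineRegisterInfo &MRI,
                       llvm::Register Reg)>;
```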
- Add `MachineBlockFrequencyAnalysis`.
- Add `MachineBlockFrequencyPrinterPass`.
- Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager.
- `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new
pass manager migration.
- Add `LiveIntervalsAnalysis`.
- Add `LiveIntervalsPrinterPass`.
- Use `LiveIntervalsWrapperPass` in legacy pass manager.
- Use `std::unique_ptr` instead of raw pointer for `LICalc`, so
destructor and default move constructor can handle it correctly.
This would be the last analysis required by `PHIElimination`.
Previously, there was at least one virtual function call for every
allocated register. The only users of this feature are AMDGPU and RISC-V
(RVV); other targets don't use it. To easily identify these cases,
change the default functor to nullptr and don't call it for every
allocated register.
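A sketch of the resulting fast path (member and struct names here are
illustrative, not the upstream code):
```
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/Register.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"
#include <functional>

using namespace llvm;

// Sketch: keep the filter as a possibly-empty std::function; the default
// (no filter) case never pays for a per-register indirect call.
struct AllocationFilterSketch {
  std::function<bool(const TargetRegisterInfo &, const MachineRegisterInfo &,
                     Register)>
      Filter; // empty by default
  const TargetRegisterInfo *TRI = nullptr;
  const MachineRegisterInfo *MRI = nullptr;

  bool shouldAllocate(Register Reg) const {
    return !Filter || Filter(*TRI, *MRI, Reg);
  }
};
```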
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
This patch makes `LiveDebugVariables` usable by passes outside of
`lib/CodeGen`.
If a pass that runs between the split register allocation passes does not
preserve this analysis, it will be freed and then recomputed when the
next pass that needs LiveDebugVariables is reached.
However, `LiveDebugVariables` will raise an assertion if it is freed
without having emitted the debug values.
This is the reason we need `LiveDebugVariables` to be available to passes
outside of `lib/CodeGen`.
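For a pass outside lib/CodeGen, preserving the analysis looks roughly like
this (sketch only; the pass is hypothetical, and the class used with
addPreserved follows the naming in this note but may differ in newer trees):
```
#include "llvm/CodeGen/LiveDebugVariables.h"
#include "llvm/CodeGen/MachineFunctionPass.h"

using namespace llvm;

// Hypothetical example pass scheduled between the split register allocation
// passes. Declaring LiveDebugVariables preserved keeps it from being freed
// (and its assertion from firing) before the debug values are emitted.
class MyTargetPass : public MachineFunctionPass {
public:
  static char ID;
  MyTargetPass() : MachineFunctionPass(ID) {}
  bool runOnMachineFunction(MachineFunction &MF) override { return false; }

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.addPreserved<LiveDebugVariables>(); // class name as used in this note
    MachineFunctionPass::getAnalysisUsage(AU);
  }
};
char MyTargetPass::ID = 0;
```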
Imagine a loop of the form:
```
preheader:
%r = def
header:
bcc latch, inner
inner1:
..
inner2:
b latch
latch:
%r = subs %r
bcc header
```
It can be possible for code to spend a decent amount of time in the
header<->latch loop, not going into the inner part of the loop as much.
The greedy register allocator can prefer to spill _around_ %r though,
adding spills around the subs in the loop, which can be very detrimental
for performance. (The case I am looking at is actually a very deeply
nested set of loops that repeat the header<->latch pattern at multiple
different levels).
The greedy RA will prefer to spill the IV, as it is live
through the header block. This patch attempts to add a heuristic to
prevent that in this case for variables that look like IVs, similar in
spirit to the extra spill weight that already gets added to variables
that look like IVs, making them more expensive to spill. That means
spills are more likely to be pushed into the inner blocks, where they are
less likely to be executed and not as expensive as spills around the IV.
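The kind of check involved is roughly as below (a hedged illustration only,
not the upstream heuristic: the helper name and the exact definition of
"looks like an IV" are mine):
```
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineLoopInfo.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"

using namespace llvm;

// Sketch: treat a register as IV-like if some def of it in the loop latch
// also reads it (e.g. the `%r = subs %r` above). Such registers would then
// be made less attractive to spill around, pushing spills into the colder
// inner blocks instead.
static bool looksLikeInductionVariable(const MachineRegisterInfo &MRI,
                                       const TargetRegisterInfo *TRI,
                                       Register Reg, const MachineLoop &L) {
  MachineBasicBlock *Latch = L.getLoopLatch();
  if (!Latch)
    return false;
  for (const MachineInstr &DefMI : MRI.def_instructions(Reg))
    if (DefMI.getParent() == Latch && DefMI.readsRegister(Reg, TRI))
      return true;
  return false;
}
```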
This gives an 8% speedup in the exchange benchmark from spec2017 when
compiled with flang-new, whilst importantly stabilising the scores to be
less sensitive to other changes. Running ctmark showed no difference in
the compile time. I've tried to run a range of benchmarks for
performance, most of which were relatively flat, not showing many large
differences. One matrix multiply case improved 21.3% due to removing a
cascading chain of spills, and some other knock-on effects happen which
usually cause small differences in the scores.