llvm-project

Author	SHA1	Message	Date
Lucas Ramirez	83c308f014	[AMDGPU][Scheduler] Consistent occupancy calculation during rematerialization (#149224 ) The `RPTarget`'s way of determining whether VGPRs are beneficial to save and whether the target has been reached w.r.t. VGPR usage currently assumes, if `CombinedVGPRSavings` is true, that free slots in one VGPR RC can always be used for the other. Implicitly, this makes the rematerialization stage (only current user of `RPTarget`) follow a different occupancy calculation than the "regular one" that the scheduler uses, one that assumes that ArchVGPR/AGPR usage can be balanced perfectly and at no cost, which is untrue in general. This ultimately yields suboptimal rematerialization decisions that require cross-VGPR-RC copies unnecessarily. This fixes that, making the `RPTarget`'s internal model of occupancy consistent with the regular one. The `CombinedVGPRSavings` flag is removed, and a form of cross-VGPR-RC saving implemented only for unified RFs, which is where it makes the most sense. Only when the amount of free VGPRs in a given VGPR RC (ArchVGPR or AGPR) is lower than the excess VGPR usage in the other VGPR RC does the `RPTarget` consider that a pressure reduction in the former will be beneficial to the latter.	2025-08-08 14:26:04 +02:00
Jeffrey Byrnes	c29094df72	[AMDGPU] NFCI: Track AV Register Pressure separately (#149863 ) Adds new entries in the GCNPressure array for AVGPR pressure. In this PR, AVGPR pressure is added to pure VGPR pressure under all the pressure queries, so it is NFC. Separating out this pseudo RC will help us make more informed decisions in future work. This RC can be assigned as either VGPR or AGPR, so tracking them as pure VGPR pressure is not accurate.	2025-07-25 12:11:52 -07:00
Lucas Ramirez	6307b496f8	[AMDGPU] Add `GCNRPTarget` to track register pressure against a target (#145765 ) This adds the `GCNRPTarget` class which models a register pressure target (i.e., maximum number of SGPRs/VGPRS) that one can track register savings against. The only current use of this class is in the scheduler's rematerialization stage. It replaces the more ad-hoc (and now deleted) `ExcessRP` class which used to serve the same purpose. This is only NFC~ish because `GCNRPTarget` tracks VGPR usage more accurately than `ExcessRP` used to. To estimate required combined VGPR savings we now additionally take into account the number of available VGPRs in both banks (ArchVGPR and AGPR) at the time where the RP target is created, whereas we used to only consider explicit savings made from the starting RP. This makes VGPR savings estimations more accurate in cases where we allow for savings in one VGPR bank to help towards reducing pressure in another VGPR bank (see `GCNRPTarget::CombineVGPRSavings`). This is the cause for unit test changes.	2025-06-26 13:11:20 +02:00
Diana Picus	a201f8872a	[AMDGPU] Replace dynamic VGPR feature with attribute (#133444 ) Use a function attribute (amdgpu-dynamic-vgpr) instead of a subtarget feature, as requested in #130030.	2025-06-24 11:09:36 +02:00
Shilei Tian	edbaf19c46	[AMDGPU] Fix a potential integer overflow in GCNRegPressure when true16 is enabled (#144968 ) Fixes SWDEV-537014.	2025-06-20 12:29:32 -04:00
Lucas Ramirez	1f20cb9829	[AMDGPU] Simplify `GCNRegPressure::RegKind` (NFC) (#142682 ) This NFC simplifies the `GCNRegPressure::RegKind` enum so that instead of containing a pair of values for each type of register (one for non-tuple registers and one for tuple registers of that type) it only contains one value representing all registers of that type. The `GCNRegPressure::Value` array is still sized as before, though all elements corresponding to tuple-kinds now start after all elements corresponding to non-tuple-kinds instead of the two being interleaved. This allows to simplify the `GCNRegPressure::inc` logic, eliminating the switch entirely.	2025-06-04 11:23:57 +02:00
Kazu Hirata	f459cfed7b	[AMDGPU] Avoid repeated hash lookups (NFC) (#132511 )	2025-03-21 22:15:40 -07:00
Craig Topper	9e6494c0fb	[CodeGen] Rename RegisterMaskPair to VRegMaskOrUnit. NFC (#123799 ) This holds a physical register unit or virtual register and mask. While I was here I've used emplace_back and removed an unneeded use of a template.	2025-01-22 09:11:22 -08:00
Jeffrey Byrnes	17bc959961	[AMDGPU] Optionally Use GCNRPTrackers during scheduling (#93090 ) This adds the ability to use the GCNRPTrackers during scheduling. These trackers have several advantages over the generic trackers: 1. global live-thru trackers, 2. subregister based RP deltas, and 3. flexible vreg -> PressureSet mappings. This feature is off-by-default to ease with the roll-out process. In particular, when using the optional trackers, the scheduler will still maintain the generic trackers leading to unnecessary compile time.	2024-10-09 09:54:11 -07:00
Jeffrey Byrnes	5cb6b15568	[AMDGPU] Constrain use LiveMask by the operand's LaneMask for RP calculation. For speculative RP queries, recede may calculate inaccurate masks for subreg uses. Previously, the calculation would look at any live lane for the use at the position of the MI in the LIS. This also adds lanes for any subregs which are live at but not used by the instruction. By constraining against the getSubRegIndexLaneMask for the operand's subreg, we are sure to not pick up on these extra lanes. For current clients of recede, this is not an issue. This is because 1. the current clients do not violate the program order in the LIS, and 2. the change to RP is based on the difference between previous mask and new mask. Since current clients are not exposed to this issue, this patch is sort of NFC. Co-authored-by: Valery Pykhtin Valery.Pykhtin@amd.com Change-Id: Iaed80271226b2587297e6fb78fe081afec1a9275	2024-10-08 10:29:50 -07:00
Jay Foad	8d13e7b8c3	[AMDGPU] Qualify auto. NFC. (#110878 ) Generated automatically with: $ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)	2024-10-03 13:07:54 +01:00
paperchalice	abde52aa66	[CodeGen][NewPM] Port `LiveIntervals` to new pass manager (#98118 ) - Add `LiveIntervalsAnalysis`. - Add `LiveIntervalsPrinterPass`. - Use `LiveIntervalsWrapperPass` in legacy pass manager. - Use `std::unique_ptr` instead of raw pointer for `LICalc`, so destructor and default move constructor can handle it correctly. This would be the last analysis required by `PHIElimination`.	2024-07-10 19:34:48 +08:00
Jeffrey Byrnes	113052b2b0	[AMDGPU] Prefer lower total register usage in regions with spilling Change-Id: Ia5c434b0945bdcbc357c5e06c3164118fc91df25	2024-02-26 12:19:52 -08:00
Valery Pykhtin	901c5be524	[AMDGPU] Fix GCNUpwardRPTracker: max register pressure on defs. (#74422 ) Treat a defined register as fully live "at" the instruction and update maximum pressure accordingly. Fixes #3786.	2023-12-08 11:27:08 +01:00
Valery Pykhtin	57a11b7f75	[AMDGPU] Add live-through register set printing to GCNRegPressurePrinter pass. (#71096 ) Add live-through register set printing, assuming live-through register is in live-in and live-out sets, has no redefinitions but may have uses in the block.	2023-11-20 13:35:47 +01:00
Valery Pykhtin	87b8d94371	[AMDGPU] Fix GCNUpwardRPTracker. (#71186 ) Fixed: 1. Maximum register pressure calculation at the instruction level. Previously max RP included both def and use of registers of an instruction. Now maximum RP includes _uses_ and _early-clobber defs_. 2. Uses were incorrectly tracked and this resulted in a mismatch of live-in set reported by LiveIntervals and tracked live reg set when the beginning of the block is reached. Interface has changed, moveMaxPressure becomes deprecated and getMaxPressure, resetMaxPressure functions are added. reset function seem now more consistent.	2023-11-10 13:44:10 +01:00
Valery Pykhtin	e808f8a616	[AMDGPU] GCNRegPressurePrinter pass to print GCNRegPressure values for testing. (#70031 ) Using GCNDownwardRPTracker or GCNUpwardRPTracker the pass collects register pressure values for a function and prints these values next to instructions. Output can be used to generate Filecheck rules in mir tests.	2023-11-01 23:01:39 +01:00
Jay Foad	fcbdcb13ce	[AMDGPU] Tweak tuple weight calculation. NFC. (#66490 ) This just makes it more obvious that GCNRegPressure does not actually use pressure sets.	2023-09-15 16:30:06 +01:00
Jay Foad	3030c03988	[AMDGPU] Make use of MachineInstr::all_defs and all_uses. NFCI.	2023-06-05 10:32:33 +01:00
Valery Pykhtin	8f6c47b7a4	[AMDGPU] Speedup GCNDownwardRPTracker::advanceBeforeNext The function makes liveness tests for the entire live register set for every instruction it passes by. This becomes very slow on high RP regions such as ASAN enabled code. Instead only uses of last tracked instruction should be tested and this greatly improves compilation time. This patch revealed few bugs in SIFormMemoryClauses and PreRARematStage::sinkTriviallyRematInsts which should be fixed first. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D136267	2023-03-09 15:18:02 +01:00
Jay Foad	0affe0c8cf	Revert "[AMDGPU] Speedup GCNDownwardRPTracker::advanceBeforeNext" This reverts commit 2d09bec169277fb5a341249afacff532c7511756. It was causing assertion failures in some out-of-tree tests.	2022-12-02 14:18:16 +00:00
Valery Pykhtin	2d09bec169	[AMDGPU] Speedup GCNDownwardRPTracker::advanceBeforeNext The function makes liveness tests for the entire live register set for every instruction it passes by. This becomes very slow on high RP regions such as ASAN enabled code. Instead only uses of last tracked instruction should be tested and this greatly improves compilation time. This patch revealed few bugs in SIFormMemoryClauses and PreRARematStage::sinkTriviallyRematInsts which should be fixed first. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D136267	2022-12-02 09:05:22 +01:00
Valery Pykhtin	5144133f6f	[AMDGPU] Fix GCNDownwardRPTracker::advanceBeforeNext at the end of MBB The problem with GCNDownwardRPTracker::advanceBeforeNext is that it doesn't allow to get register pressure after the last instruction in a MBB. However when we track RP through the boundary of a MBB we need the state that is after the last instruction of the MBB and before the first instruction of the successor MBB. Currently we stop tracking RP in the state 'at' the last instruction of the MBB which is incorrect. This patch fixes 27 lit tests with EXPENSIVE_CHECKS enabled. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D136927	2022-11-03 11:52:56 +01:00
Valery Pykhtin	8d7f88416a	Revert "[AMDGPU] Add EXPENSIVE_CHECK into GCNRPTracker::reset" This reverts commit fecf067db40ffa1a6d5d665769c90cd29547f502. The change introduces 420 test failures with EXPENSIVE_CHECK in AMDGPU which I don't want to disable. Going to fix the failures and recommit the check.	2022-10-28 09:15:37 +02:00
Valery Pykhtin	fecf067db4	[AMDGPU] Add EXPENSIVE_CHECK into GCNRPTracker::reset This would check if passed in live-ins registers match those calculated using LIS. This check currently breaks 420 lit tests when enabled. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D136860	2022-10-28 08:42:21 +02:00
Valery Pykhtin	4ae88a8d42	[AMDGPU] Refactor debug printing routines for GCNRPTracker Use Printable to enhance syntax, remove duplication, unify. Reviewed By: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D136704	2022-10-28 04:22:46 +02:00
Jim Lin	d6b0734837	[NFC] Use Register instead of unsigned	2022-01-19 20:17:04 +08:00
Christudasan Devadasan	654c89d85a	[AMDGPU] Make vector superclasses allocatable The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to enable copy between VGPR and AGPR registers during regalloc. Also, added the missing AV register classes from 192b to 1024b. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D109300	2021-11-26 00:42:12 -05:00
Stanislav Mekhanoshin	a8d9d50762	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906	2021-02-17 16:01:32 -08:00
Matt Arsenault	41877b82f0	AMDGPU: Fix dbg_value handling when forming soft clause bundles DBG_VALUES placed between memory instructions would change codegen. Skip over these and re-insert them after the bundle instead of giving up on bundling.	2021-02-01 22:16:35 -05:00
dfukalov	560d7e0411	[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets ... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036	2021-01-20 22:22:45 +03:00
Kazu Hirata	b934160aaa	[Target] Use llvm::find_if (NFC)	2021-01-07 20:29:36 -08:00
dfukalov	6a87e9b08b	[NFC][AMDGPU] Reduce include files dependency. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813	2021-01-07 22:22:05 +03:00
Jay Foad	3497860203	[AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister ... in favour of the isPhysical/isVirtual methods.	2020-08-20 17:59:11 +01:00
Stanislav Mekhanoshin	ada205e91e	[AMDGPU] Fix assumption about LaneBitmask content Yet another assumption about an actual LaneBitmask content is fixed. Differential Revision: https://reviews.llvm.org/D74805	2020-02-19 09:07:11 -08:00
Stanislav Mekhanoshin	cacc3b7a55	[AMDGPU] Cleanup assumptions about generated subregs We are using countPopulation on a LaneBitmask to determine a number of registers it covers. This is the assumption which does not necessarily need to be true. It is not changed but factored into a single call SIRegisterInfo::getNumCoveredRegs(). Some other places are cleaned up with respect to assumptions about subreg indexes values and tablegen behavior. Differential Revision: https://reviews.llvm.org/D74177	2020-02-06 17:39:24 -08:00
hsmahesha	1d9e08ec35	[AMDGPU] Add file headers for few files where it is missing. Summary: Added file headers for files which implement iterative lightweight scheduling strategies. Which is basically an exercise which I undertook in order to get used to LLVM development process. Reviewers: arsenm, vpykhtin, cdevadas Reviewed By: vpykhtin Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73417	2020-01-31 02:06:41 +05:30
vpykhtin	4332f1a4c8	[AMDGPU] Fix GCN regpressure trackers for INLINEASM instructions. Differential revision: https://reviews.llvm.org/D73338	2020-01-27 17:25:25 +03:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Stanislav Mekhanoshin	9aff33bb95	[AMDGPU] Print register pressure for agpr and vgpr separately Differential Revision: https://reviews.llvm.org/D65476 llvm-svn: 367355	2019-07-30 20:45:15 +00:00
Stanislav Mekhanoshin	e67cc380a8	[AMDGPU] gfx908 mfma support Differential Revision: https://reviews.llvm.org/D64584 llvm-svn: 365824	2019-07-11 21:19:33 +00:00
Valery Pykhtin	7e854e1cdd	[AMDGPU] Speed up live-in virtual register set computation in GCNScheduleDAGMILive. Differential revision: https://reviews.llvm.org/D62401 llvm-svn: 363661	2019-06-18 11:43:17 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Tom Stellard	5bfbae5cb1	AMDGPU: Refactor Subtarget classes Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851	2018-07-11 20:59:01 +00:00
Stanislav Mekhanoshin	28624f94d5	[AMDGPU] Factored out common part of GCNRPTracker::reset() Differential Revision: https://reviews.llvm.org/D47664 llvm-svn: 333931	2018-06-04 17:21:54 +00:00
Shiva Chen	801bf7ebbe	[DebugInfo] Examine all uses of isDebugValue() for debug instructions. Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844	2018-05-09 02:42:00 +00:00
Nico Weber	432a38838d	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff \| diffstat -l \| xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184	2018-04-30 14:59:11 +00:00
Matthias Braun	f842297d50	Rename LiveIntervalAnalysis.h to LiveIntervals.h Headers/Implementation files should be named after the class they declare/define. Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in favor of `class LiveIntervals;` llvm-svn: 320546	2017-12-13 02:51:04 +00:00
Francis Visoiu Mistrih	9d419d3b0c	[CodeGen] Rename functions PrintReg* to printReg* LLVM Coding Standards: Function names should be verb phrases (as they represent actions), and command-like function should be imperative. The name should be camel case, and start with a lower case letter (e.g. openFile() or isFoo()). Differential Revision: https://reviews.llvm.org/D40416 llvm-svn: 319168	2017-11-28 12:42:37 +00:00

64 Commits