llvm-project

Author	SHA1	Message	Date
Mirko Brkušanin	5879162f7f	[AMDGPU] CodeGen for GFX12 VBUFFER instructions (#75492)	2023-12-15 13:45:03 +01:00
Diana	eb3c02fdc2	[AMDGPU] Use immediates for stack accesses in chain funcs (#71913) Switch to using immediate offsets instead of the SP register to access objects on the current stack frame in chain functions. This means we no longer need to reserve a SP register just for accessing stack objects and it also allows us to set the SP (when one is actually needed) to the stack size from the very beginning. This only works if we use a FixedObject for the ScavengeFI, which is what we do for entry functions anyway (and we generally want to keep chain functions close to amdgpu_cs behaviour where we don't have a good reason to diverge).	2023-11-14 13:17:46 +01:00
Jie Fu	fdf99b21f3	[AMDGPU] Fix -Wunused-variable in SIFrameLowering.cpp (NFC) /llvm-project/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp:1829:8: error: unused variable 'IsChainFunction' [-Werror,-Wunused-variable] bool IsChainFunction = MF.getInfo<SIMachineFunctionInfo>()->isChainFunction(); ^ 1 error generated.	2023-11-08 20:48:44 +08:00
Jay Foad	d5f3b3b3b1	[RegScavenger] Simplify state tracking for backwards scavenging (#71202) Track the live register state immediately before, instead of after, MBBI. This makes it simple to track the state at the start or end of a basic block without a separate (and poorly named) Tracking flag. This changes the API of the backward(MachineBasicBlock::iterator I) method, which now recedes to the state just before, instead of just after, *I. Some clients are simplified by this change. There is one small functional change shown in the lit tests where multiple spilled registers all need to be reloaded before the same instruction. The reloads will now be inserted in the opposite order. This should not affect correctness.	2023-11-08 09:49:07 +00:00
Diana Picus	39830fea28	[AMDGPU][PEI] Set up SP for chain functions Initialize the SP to 0 in the prologue of functions with the `amdgpu_cs_chain` or `amdgpu_cs_chain_preserve` calling conventions, but only if they need one (i.e. if they contain calls to `amdgpu_gfx` functions or if they have stack objects). Also make sure we don't try to realign the stack (since 0 is aligned enough). Differential Revision: https://reviews.llvm.org/D156413	2023-11-08 09:27:34 +01:00
Diana	1fa58c7790	[AMDGPU] Callee saves for amdgpu_cs_chain[_preserve] (#71526) Teach prolog epilog insertion how to handle functions with the amdgpu_cs_chain or amdgpu_cs_chain_preserve calling conventions. For amdgpu_cs_chain functions, we only need to preserve the inactive lanes of VGPRs above v8, and only in the presence of calls via @llvm.amdgcn.cs.chain. For amdgpu_cs_chain_preserve functions, we will also need to preserve the active lanes for registers above the last argument VGPR. AFAICT there's no direct way to find out what the last argument VGPR is, so instead the patch uses the fact that chain calls from amdgpu_cs_chain_preserve functions can't use more VGPRs than the caller's VGPR arguments. In other words, it removes the operands of SI_CS_CHAIN_TC instructions from the list of callee saved registers. For both calling conventions, registers v0-v7 never need to be saved and restored, so we should never add them as WWM spills. Differential Revision: https://reviews.llvm.org/D156412	2023-11-08 08:28:15 +01:00
Christudasan Devadasan	f9cd789658	[AMDGPU] Add pseudo instructions for SGPR spill to VGPR (#69923) For a future patch, it is important to keep the lowered SGPR spills recognized as spill instructions during regalloc. Directly lowering them into V_WRITELANE/V_READLANE won't allow us to attach the SPILL flag to their instructions. This patch introduces the pseudo instructions with the SGPRSpill flag set in their Desc. They will get lowered to equivalent instructions later during post RA pseudo expansion.	2023-10-27 17:24:10 +05:30
Pranav Taneja	6d9b96313d	[AMDGPU] [SIFrameLowering] Use LiveRegUnits instead of LivePhysRegs.	2023-09-21 11:49:46 +05:30
Austin Kerbow	343be5132e	[AMDGPU] Add utilities to track number of user SGPRs. NFC. Factor out and unify some common code that calculates and tracks the number of user SGPRs. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D159439	2023-09-12 08:52:30 -07:00
Matt Arsenault	4d42e8b5d1	Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd. The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work around the underlying problem with SUBREG_TO_REG.	2023-07-31 20:15:45 -04:00
Vitaly Buka	a496c8be6e	Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" And dependent commits. Details in D150388. This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c. This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e. This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826. This reverts commit b7836d856206ec39509d42529f958c920368166b. No conflicts in the code, few tests had conflicts in autogenerated CHECKs: llvm/test/CodeGen/Thumb2/mve-float32regloops.ll llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll Reviewed By: alexfh Differential Revision: https://reviews.llvm.org/D156381	2023-07-26 22:13:32 -07:00
Christudasan Devadasan	7a98f084c4	[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs Currently, the custom SGPR spill lowering pass spills SGPRs into physical VGPR lanes and the remaining VGPRs are used by regalloc for vector regclass allocation. This imposes many restrictions that we ended up with unsuccessful SGPR spilling when there won't be enough VGPRs and we are forced to spill the leftover into memory during PEI. The custom spill handling during PEI has many edge cases and often breaks the compiler time to time. This patch implements spilling SGPRs into virtual VGPR lanes. Since we now split the register allocation for SGPRs and VGPRs, the virtual registers introduced for the spill lanes would get allocated automatically in the subsequent regalloc invocation for VGPRs. Spill to virtual registers will always be successful, even in the high-pressure situations, and hence it avoids most of the edge cases during PEI. We are now left with only the custom SGPR spills during PEI for special registers like the frame pointer which is an unproblematic case. Differential Revision: https://reviews.llvm.org/D124196	2023-07-07 23:14:32 +05:30
Christudasan Devadasan	b78b36e1a2	[AMDGPU] Implement whole wave register spill To reduce the register pressure during allocation, when the allocator spills a virtual register that corresponds to a whole wave mode operation, the spill loads and restores should be activated for all lanes by temporarily flipping all bits in exec register to one just before the spills. It is not implemented in the compiler as of today and this patch enables the necessary support. This is a pre-patch before the SGPR spill to virtual VGPR lanes that would eventually cause the whole wave register spills during allocation. Reviewed By: arsenm, cdevadas Differential Revision: https://reviews.llvm.org/D143759	2023-07-07 22:51:45 +05:30
Brendon Cahoon	853b2a84cb	[AMDGPU] Reserve SGPR pair when long branches are present Branch relaxation requires 2 additional SGPRs for AMDGPU to handle the case when an indirect branch target is too far away. The register scavenger may not find available registers, which causes a “did not find scavenging index” assert to occur in assignRegToScavengingIndex. In this patch, we estimate before register allocation whether an indirect branch is likely to be needed, and reserve 2 SGPRs if the branch distance is found to be above a threshold. The distance threshold is an approximation as the exact code size and branch distance are unknown prior to register allocation. Patch by Corbin Robeck. Thanks! Differential Review: https://reviews.llvm.org/D149775	2023-06-29 16:50:46 -05:00
Carl Ritson	d0c0838705	[AMDGPU] Remove return VGPRs from callee save list There is no need to generate spill/restore for registers used in return value. This matters for amdgpu_gfx calling convention where CSR and Ret definitions overlap. Reviewed By: sebastian-ne Differential Revision: https://reviews.llvm.org/D152892	2023-06-15 14:05:32 +09:00
Sergei Barannikov	74204138f4	[AMDGPU] Check if register is non-null before calling isSubRegisterEq (NFCI) D151036 adds an assertion that prohibits iterating over sub- and super-registers of a null register. This is already the case when iterating over register units of a null register, and worked by accident for sub- and super-registers. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D151289	2023-05-24 10:32:07 +03:00
Akshay Khadse	842dc35fc9	Guard against dereferencing a nullptr In `lib/CodeGen/PrologEpilogInserter.cpp` file, `RS` is assigned via `RS = TRI->requiresRegisterScavenging(MF) ? new RegScavenger() : nullptr;`. This means that `RS` can be `nullptr`. While executing the `TFI->processFunctionBeforeFrameFinalized(MF, RS);`, the `RS` can be dereferenced in the call `RS->enterBasicBlock(MBB);` in file `lib/Target/AMDGPU/SIFrameLowering.cpp` Reviewed By: skan, arsenm Differential Revision: https://reviews.llvm.org/D146791	2023-04-15 11:30:43 +08:00
Christudasan Devadasan	a3028239a7	Revert "[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs" This reverts commit 40ba0942e2ab1107f83aa5a0ee5ae2980bf47b1a.	2022-12-21 16:17:42 +05:30
Carl Ritson	5bc703f755	[AMDGPU] Replace getPhysRegClass with getPhysRegBaseClass Accelerate finding the base class for a physical register by building a statically mapping table from physical registers to base classes using TableGen. Replace uses of SIRegisterInfo::getPhysRegClass with TargetRegisterInfo::getPhysRegBaseClass in order to use the computed table. Reviewed By: arsenm, foad Differential Revision: https://reviews.llvm.org/D139422	2022-12-20 16:22:14 +09:00
Haojian Wu	1033289a3f	Fix unused variable warnings in SIFrameLowering.cpp for release build, NFC	2022-12-17 18:24:44 +01:00
Christudasan Devadasan	40ba0942e2	[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs Currently, the custom SGPR spill lowering pass spills SGPRs into physical VGPR lanes and the remaining VGPRs are used by regalloc for vector regclass allocation. This imposes many restrictions that we ended up with unsuccessful SGPR spilling when there won't be enough VGPRs and we are forced to spill the leftover into memory during PEI. The custom spill handling during PEI has many edge cases and often breaks the compiler time to time. This patch implements spilling SGPRs into virtual VGPR lanes. Since we now split the register allocation for SGPRs and VGPRs, the virtual registers introduced for the spill lanes would get allocated automatically in the subsequent regalloc invocation for VGPRs. Spill to virtual registers will always be successful, even in the high-pressure situations, and hence it avoids most of the edge cases during PEI. We are now left with only the custom SGPR spills during PEI for special registers like the frame pointer which is an unproblematic case. This patch also implements the whole wave spills which might occur if RA spills any live range of virtual registers involved in the whole wave operations. Earlier, we had been hand-picking registers for such machine operands. But now with SGPR spills into virtual VGPR lanes, we are exposing them to the allocator. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D124196	2022-12-17 11:56:32 +05:30
Christudasan Devadasan	29247824f5	[AMDGPU][SIFrameLowering] Use the right frame register in CSR spills Unlike the callee-saved VGPR spill instructions emitted by `PEI::spillCalleeSavedRegs`, the CS VGPR spills inserted during emitPrologue/emitEpilogue require the exec bits flipping to avoid clobbering the inactive lanes of VGPRs used for SGPR spilling. Currently, these spill instructions are referenced from the SP at function entry and when the callee performs a stack realignment, they ended up getting incorrect stack offsets. Even if we try to adjust the offsets, the FP-SP becomes a runtime entity with dynamic stack realignment and the offsets would still be inaccurate. To fix it, use FP as the frame base in the spill instructions whenever the function has FP. The offsets obtained for the CS objects would always be the right values from FP. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134949	2022-12-17 11:52:36 +05:30
Christudasan Devadasan	7a72a93580	[AMDGPU] Preserve only the inactive lanes of scratch vgprs In general, a callee is free to use a scratch register without preserving its previous state. However, the VGPR used for SGPR spilling can potentially have its inactive lanes overwritten by the writelane instructions. When the function returns, it can cause unexpected behavior if the VGPR value is not preserved appropriately. The current scheme to preserve the inactive lanes of such scratch VGPRs is not done rightly. It preserves all lanes and causes the outgoing values (if any) getting overwritten by the epilog restores. It then corrupts the return value. To avoid such situation with scratch VGPRs, this patch ensures we preserve only their inactive lanes. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134526	2022-12-17 11:51:43 +05:30
Christudasan Devadasan	20a940f1e2	[AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores There is a lot of customization and eventually code duplication in the frame lowering that handles special SGPR spills like the one needed for the Frame Pointer. Incorporating any additional SGPR spill currently makes it difficult during PEI. This patch introduces a new spill builder to efficiently handle such spill requirements. Various spill methods are special handled using a separate class. Reviewed By: sebastian-ne, scott.linder Differential Revision: https://reviews.llvm.org/D132436	2022-12-17 11:50:25 +05:30
Christudasan Devadasan	b25b4c0ab4	[AMDGPU] Separate out SGPR spills to VGPR lanes during PEI SILowerSGPRSpills pass handles the lowering of SGPR spills into VGPR lanes. Some SGPR spills are handled later during PEI. There is a common function used in both places to find the free VGPR lane. This patch eliminates that dependency to find the free VGPR by handling it separately for PEI. It is a prerequisite patch for a future work to allow SGPR spills to virtual VGPR lanes during SILowerSGPRSpills. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D124195	2022-12-17 11:49:41 +05:30
Christudasan Devadasan	5ebe91fcb2	[AMDGPU] Correctly set IsKill flag for VGPR spills in the prolog We always assume the vector register is dead or killed while inserting the VGPR spills in the prolog. It is not always true. Used the entry block liveIn data while setting the flag. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D124194	2022-12-17 11:48:44 +05:30
Christudasan Devadasan	af5e5c40ff	[AMDGPU] Add WWM reserved VGPRs to WWMSpills The custom VGPR spills inserted during frame lowering maintain a separate list for WWM reserved registers. Added them into WWMSpills that already tracks such reserved registers. It unifies the spill insertion. Reviewed By: nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D124193	2022-12-17 11:47:58 +05:30
Christudasan Devadasan	5692a7e84e	[AMDGPU] Callee must always spill writelane VGPRs Since the writelane instruction used for SGPR spills can modify inactive lanes, the callee must preserve the VGPR this instruction modifies even if it was marked Caller-saved. Reviewed By: arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D124192	2022-12-17 11:11:42 +05:30
Fangrui Song	67819a72c6	[CodeGen] llvm::Optional => std::optional	2022-12-13 09:06:36 +00:00
Christudasan Devadasan	a8d7ad70aa	[AMDGPU] Skip stack-arg dbg objects while fixing the dead frame indices Both SGPR->VGPR and VGPR->AGPR spilling code give a fixup to the spill frame indices referred in debug instructions so that they can be entirely removed. We should skip the stack argument debug objects while looking inside the bitvector with FI as the index that tracks the spill indices being processed. The stack args will have negative indices and would crash while accessing the bitvector. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D137277	2022-11-04 15:28:35 +05:30
Jay Foad	9bb1e21f07	[AMDGPU] Clean up calls to MachineOperand::setIsDead and friends. NFC.	2022-10-28 10:44:08 +01:00
Venkata Ramanaiah Nalamothu	486594119d	[AMDGPU] Fix prologue/epilogue markers in .debug_line table for trivial functions All the prologue instructions should have unknown source location co-ordinates while the epilogue instructions should have source location of last non-debug instruction after which epilogue instructions are inserted. This ensures the prologue/epilogue markers are generated correctly in the line table. Changes are brought in from the downstream CFI patches. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D131485	2022-08-10 23:00:19 +05:30
Matt Arsenault	dd7e407d81	AMDGPU: Move SpilledReg from MFI to SIRegisterInfo This isn't the most natural place for it, but it avoids a circular include dependency in an out of tree patch.	2022-06-02 17:11:24 -04:00
hsmahesha	5bd87350a5	[AMDGPU] On gfx908, reserve VGPR for AGPR copy based on register budget. Based on available register budget, reserve highest available VGPR for AGPR copy before RA. After RA, shift it to lowest unused VGPR if one exists. Fixes SWDEV-330006. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D123525	2022-04-21 07:57:26 +05:30
Matt Arsenault	e0d585d75a	AMDGPU: Defer creation of WWM VGPR spill slots There's no reason to create these immediately. They can be created in the prolog/epilog code like CSR spills. There's probably a cleaner way to do this by utilizing the CSR spill code. This makes the frame index used transient state for PrologEpilogInserter, and thus makes serialization easier. Really this doesn't need to be saved here but there isn't really a better place for it.	2022-04-19 21:07:13 -04:00
Christudasan Devadasan	34a68037dd	[AMDGPU][SIFrameLowering] Refactor custom SGPR spills (NFC). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D123666	2022-04-17 13:42:42 +05:30
Venkata Ramanaiah Nalamothu	04fff547e2	[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls. But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvement of understanding the control flow. This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering. And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`. As an added benefit, this patch simplifies overall return instruction handling. Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code. Reviewed By: arsenm, ronlieb Differential Revision: https://reviews.llvm.org/D114652	2022-03-09 12:18:02 +05:30
Sebastian Neubauer	6527b2a4d5	[AMDGPU][NFC] Fix typos Fix some typos in the amdgpu backend. Differential Revision: https://reviews.llvm.org/D119235	2022-02-18 15:05:21 +01:00
Matt Arsenault	d6fdbbcace	AMDGPU: Add second emergency slot for SGPR to vmem for large frames In a future change, we will sometimes use a VGPR offset for doing spills to memory, in which case we need 2 free VGPRs to do the SGPR spill. In most cases we could spill the VGPR along with the SGPR being spilled, but we don't have any free lanes for SGPR_1024 in wave32 so we could still potentially need a second scavenging slot.	2022-02-02 19:05:05 -05:00
Matt Arsenault	18aabae8e2	AMDGPU: Fix assertion on fixed stack objects with VGPR->AGPR spills These have negative / out of bounds frame index values and would assert when trying to set the BitVector. Fixed stack objects can't be colored away so ignore them.	2022-01-24 09:45:41 -05:00
Austin Kerbow	8470bf2b08	[AMDGPU] Do not reserve any VGPR for SGPR spills After the split register allocation changes in eebe841a47cb it is no longer necessary to reserve a VGPR before RA. This can also create bugs when IPRA is enabled since we cannot predict that a called function may not reserve any register if it does not have any SGPR spills. If that happens those functions may override reserved registers that are normally callee saved. Added a test to show this. Fixes: SWDEV-309900 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D115551	2022-01-11 22:14:59 -08:00
Brendon Cahoon	d45a247998	[AMDGPU] Don't remove VGPR to AGPR dead spills from frame info Removing dead frame indices for VGPR to AGPR spills is incorrect when the frame index is shared by multiple objects, which may occur due to stack slot coloring. The problem is that subsequent code that processes the other object will assert because the stack frame index is marked dead. Removing dead frame indices is needed prior to stack slot coloring, which is what happens with SGPR to VGPR spills. These spills are lowered prior to stack slot coloring, but the VGPR to AGPR spills are processed afterwards during the Prolog/Epilog Inserter pass. This patch marks the VGPR to AGPR spill slot as dead if the slot is not used by another object. Differential Revision: https://reviews.llvm.org/D115996	2021-12-23 11:09:19 -06:00
Matt Arsenault	273a0c8bc9	PrologEpilogInserter: Use explicit control for scavenge slot placement AMDGPU is unusual in that the stack is indexed in the same direction as stack growth (up). We therefore always need the emergency stack slots placed as low as possible to ensure they are in range of load/store instruction immediate offsets. The existing logic is mostly OK, but failed if we required stack realignment. I don't understand what the existing control isFPCloseToIncomingSP is supposed to mean, but can only be used to stop placing the scavenge slots earlier. Make this explicit so that targets can opt-in rather than opt-out only.	2021-11-23 18:01:12 -05:00
Matt Arsenault	659887b405	AMDGPU: Mark prolog/epilog SCC defs as dead A future change will add SCC liveness checks. Since we are still relying on forward register scavenging, add dead flags to avoid spuriously detecting SCC as live.	2021-11-15 21:35:06 -05:00
Stanislav Mekhanoshin	476ab0f809	[AMDGPU] Fixed stack pointer init with architected flat scratch Even if wave offset is not present we still need to do the rest of the initialization. The mov into s32 was missing in the kernels. Fixes: SWDEV-310935 Differential Revision: https://reviews.llvm.org/D113628	2021-11-10 17:18:38 -08:00
RamNalamothu	539f500e78	[AMDGPU] Do not add debug locations to the code inside prologue There is no real source location for code inside prologue as it is generated by compiler but source locations are being added to code inside prologue as a side effect of https://reviews.llvm.org/D99269 because buildSpillLoadStore() is using source location of the real instruction in the basic block if any. Fixes: SWDEV-307590 Reviewed By: scott.linder, sebastian-ne Differential Revision: https://reviews.llvm.org/D113100	2021-11-04 08:02:41 +05:30
Kazu Hirata	4bef0304e1	[AArch64, AMDGPU] Use make_early_inc_range (NFC)	2021-11-03 09:22:51 -07:00
Jack Andersen	bd4dad87f4	[MachineInstr] Move MIParser's DBG_VALUE RegState::Debug invariant into MachineInstr::addOperand Based on the reasoning of D53903, register operands of DBG_VALUE are invariably treated as RegState::Debug operands. This change enforces this invariant as part of MachineInstr::addOperand so that all passes emit this flag consistently. RegState::Debug is inconsistently set on DBG_VALUE registers throughout LLVM. This runs the risk of a filtering iterator like MachineRegisterInfo::reg_nodbg_iterator to process these operands erroneously when not parsed from MIR sources. This issue was observed in the development of the llvm-mos fork which adds a backend that relies on physical register operands much more than existing targets. Physical RegUnit 0 has the same numeric encoding as $noreg (indicating an undef for DBG_VALUE). Allowing debug operands into the machine scheduler correlates $noreg with RegUnit 0 (i.e. a collision of register numbers with different zero semantics). Eventually, this causes an assert where DBG_VALUE instructions are prohibited from participating in live register ranges. Reviewed By: MatzeB, StephenTozer Differential Revision: https://reviews.llvm.org/D110105	2021-10-07 16:08:52 +01:00
Stanislav Mekhanoshin	11b7ee974a	[AMDGPU] Avoid assert for saved FP With spilling into AGPRs enabled we cannot reliably predict if we need to save FP or not. We may finally spill everything into AGPRs and never touch stack. In this case we still may save FP. This is a deficiency but not an error, so avoid the assert. Differential Revision: https://reviews.llvm.org/D107404	2021-08-25 09:50:59 -07:00
RamNalamothu	1a8c57179a	[AMDGPU] We would need FP if there is call and caller save VGPR spills Since https://reviews.llvm.org/D98319, determineCalleeSavesSGPR() needs to consider caller save VGPR spills as well while anticipating if we require FP. Fixes: SWDEV-295978 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D106758	2021-07-28 11:12:55 +05:30


203 Commits