llvm-project

Author	SHA1	Message	Date
Rahul Joshi	2a4f5b2751	[NFC][LLVM][CodeGen] Namespace related cleanups (#162999)	2025-10-13 07:54:50 -07:00
Benjamin Maxwell	8f67cdd9b7	[AArch64][SME] Support split ZPR and PPR area allocation (#142392)

For a while we have supported the `-aarch64-stack-hazard-size=<size>` option, which adds "hazard padding" between GPRs and FPR/ZPRs. However, there is currently a hole in this mitigation, as PPR and FPR/ZPR accesses to the same area also cause streaming memory hazards (this is noted by `-pass-remarks-analysis=sme -aarch64-stack-hazard-remark-size=<val>`), and the current stack layout places PPRs and ZPRs within the same area. The current layout looks like:

```
------------------------------------  Higher address
| callee-saved gpr registers        |
|---------------------------------- |
| lr,fp (a.k.a. "frame record")     |
|-----------------------------------| <- fp(=x29)
|   <hazard padding>                |
|-----------------------------------|
| callee-saved fp/simd/SVE regs     |
|-----------------------------------|
| SVE stack objects                 |
|-----------------------------------|
| local variables of fixed size     |
|   <FPR>                           |
|   <hazard padding>                |
|   <GPR>                           |
------------------------------------| <- sp
                                    | Lower address
```

With this patch the stack (and hazard padding) is rearranged so that hazard padding is placed between the PPRs and ZPRs rather than within the (fixed size) callee-save region. The new layout looks something like this:

```
------------------------------------  Higher address
| callee-saved gpr registers        |
|---------------------------------- |
| lr,fp (a.k.a. "frame record")     |
|-----------------------------------| <- fp(=x29)
| callee-saved PPRs                 |
| PPR stack objects                 | (These are SVE predicates)
|-----------------------------------|
|   <hazard padding>                |
|-----------------------------------|
| callee-saved ZPR regs             | (These are SVE vectors)
| ZPR stack objects                 | Note: FPRs are promoted to ZPRs
|-----------------------------------|
| local variables of fixed size     |
|   <FPR>                           |
|   <hazard padding>                |
|   <GPR>                           |
------------------------------------| <- sp
                                    | Lower address
```

This layout is only enabled if:
* SplitSVEObjects are enabled (`-aarch64-split-sve-objects`)
  - (This may be enabled by default in a later patch)
* Streaming memory hazards are present
  - (`-aarch64-stack-hazard-size=<val>` != 0)
* PPRs and FPRs/ZPRs are on the stack
* There's no stack realignment or variable-sized objects
  - This is left as a TODO for now

Additionally, any FPR callee-saves that are present will be promoted to ZPRs. This is to prevent stack hazards between FPRs and GPRs in the fixed size callee-save area (which would otherwise require more hazard padding, or moving the FPR callee-saves).

This layout should resolve the hole in the hazard padding mitigation, and is not intended to change codegen for non-SME code.	2025-10-02 19:05:14 +01:00
Benjamin Maxwell	9f5abd38dd	[Codegen] Add a separate stack ID for scalable predicates (#142390) This splits out "ScalablePredicateVector" from the "ScalableVector" StackID. This is primarily to allow easy differentiation between vectors and predicates (without inspecting instructions). This new stack ID is not used in many places yet, but will be used in a later patch to mark stack slots that are known to contain predicates. Co-authored-by: Kerry McLaughlin <kerry.mclaughlin@arm.com>	2025-10-02 14:43:07 +01:00
Akshat Oke	a388395b86	[CodeGen][NPM] Port StackFrameLayoutAnalysisPass to NPM (#130070)	2025-04-15 12:37:19 +05:30
Kazu Hirata	735ab61ac8	[CodeGen] Remove unused includes (NFC) (#115996) Identified with misc-include-cleaner.	2024-11-12 23:15:06 -08:00
Hari Limaye	e31794f99d	[StackFrameLayoutAnalysis] Support more SlotTypes (#100562) Add new SlotTypes to StackFrameLayoutAnalysis to disambiguate Fixed and Variable-Sized stack slots from Variable slots. As Offsets are unreliable for VLA-area objects, sort these to the end of the list - using the Frame Index to ensure a deterministic order when Offsets are equal.	2024-07-25 18:54:24 +01:00
Hari Limaye	dc1c00f6b1	[StackFrameLayoutAnalysis] Use target-specific hook for SP offsets (#100386) StackFrameLayoutAnalysis currently calculates SP-relative offsets in a target-independent way via MachineFrameInfo offsets. This is incorrect for some Targets, e.g. AArch64, when there are scalable vector stack slots. This patch adds a virtual function to TargetFrameLowering to provide offsets from SP, with a default implementation matching what is currently used in StackFrameLayoutAnalysis, and refactors StackFrameLayoutAnalysis to use this function. Only non-zero scalable offsets are output by the analysis pass. An implementation of this function is added for AArch64 targets, which aims to provide correct SP offsets in most cases.	2024-07-25 09:03:48 +01:00
David Green	e09032f7a3	[StackFrameLayoutAnalysis] Add basic Scalable stack slot output (#99883) The existing StackFrameLayoutAnalysis details do not do well with Scalable vector stack slots, which are not marked as scalable and are intertwined with the other fixed-size slots. This patch adds some very basic support, marking them as scalable and sorting them to the end of the list. The slot addresses are not really correct (for fixed as well as scalable), but this prints something a little better with the limited information currently available.	2024-07-22 20:45:18 +01:00
Felipe de Azevedo Piovezan	3db7d0dffb	[MachineFunction][DebugInfo][nfc] Introduce EntryValue variable kind MachineFunction keeps a table of variables whose addresses never change throughout the function. Today, the only kinds of locations it can handle are stack slots. However, we could expand this for variables whose address is derived from the value a register had upon function entry. One case where this happens is with variables alive across coroutine funclets: these can be placed in a coroutine frame object whose pointer is placed in a register that is an argument to coroutine funclets. ``` define @foo(ptr %frame_ptr) { dbg.declare(%frame_ptr, !some_var, !DIExpression(EntryValue, <ptr_arithmetic>)) ``` This is a patch in a series that aims to improve the debug information generated by the CoroSplit pass in the context of `swiftasync` arguments. Variables stored in the coroutine frame _must_ be described by the entry_value of the ABI-defined register containing a pointer to the coroutine frame. Since these variables have a single location throughout their lifetime, they are candidates for being stored in the MachineFunction table. Differential Revision: https://reviews.llvm.org/D149879	2023-05-11 07:29:57 -04:00
Paul Kirth	af9a452e57	[llvm][codegen] Fix non-determinism in StackFrameLayoutAnalysisPass output We were iterating over a SmallPtrSet when outputting slot variables. This is still correct but made the test fail under reverse iteration. This patch replaces the SmallPtrSet with a SmallVector. Also remove the "Stack Frame Layout" lines from arm64-opt-remarks-lazy-bfi test, since those also break under reverse iteration. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D142127	2023-01-19 20:04:14 +00:00
Paul Kirth	557a5bc336	[codegen] Add StackFrameLayoutAnalysisPass Issue #58168 describes the difficulty diagnosing stack size issues identified by -Wframe-larger-than. For simple code, it's easy to understand the stack layout and where space is being allocated, but in more complex programs, where code may be heavily inlined, unrolled, and have duplicated code paths, it is no longer easy to manually inspect the source program and understand where stack space can be attributed. This patch implements a machine function pass that emits remarks with a textual representation of stack slots, and also outputs any available debug information to map source variables to those slots. The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout` to the compiler invocation. Like other remarks, the diagnostic information can be saved to a file in a machine readable format by adding -fsave-optimization-record. Fixes: #58168 Reviewed By: nickdesaulniers, thegameg Differential Revision: https://reviews.llvm.org/D135488	2023-01-19 01:51:14 +00:00
Paul Kirth	fdc0bf6adc	Revert "[codegen] Add StackFrameLayoutAnalysisPass" This breaks on some AArch64 bots. This reverts commit 0a652c540556a118bbd9386ed3ab7fd9e60a9754.	2023-01-13 22:59:36 +00:00
Paul Kirth	0a652c5405	[codegen] Add StackFrameLayoutAnalysisPass Issue #58168 describes the difficulty diagnosing stack size issues identified by -Wframe-larger-than. For simple code, it's easy to understand the stack layout and where space is being allocated, but in more complex programs, where code may be heavily inlined, unrolled, and have duplicated code paths, it is no longer easy to manually inspect the source program and understand where stack space can be attributed. This patch implements a machine function pass that emits remarks with a textual representation of stack slots, and also outputs any available debug information to map source variables to those slots. The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout` to the compiler invocation. Like other remarks, the diagnostic information can be saved to a file in a machine readable format by adding -fsave-optimization-record. Fixes: #58168 Reviewed By: nickdesaulniers, thegameg Differential Revision: https://reviews.llvm.org/D135488	2023-01-13 20:52:48 +00:00

13 Commits
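
The slot ordering described in e31794f99d (variable-sized slots sorted to the end of the list, with the Frame Index breaking ties between equal Offsets) can be illustrated with a small standalone sketch. This is not the code from that commit: `Slot`, `SlotKind`, `slotLess`, and `sortedFrameIndices` are hypothetical names invented here, and the ascending-offset order for the remaining slots is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the per-slot record kept by the
// analysis pass; the real pass stores more information per slot.
enum class SlotKind { Spill, Fixed, VariableSized, Protector, Variable };

struct Slot {
  int FrameIndex;  // unique index of the stack object
  int64_t Offset;  // offset of the slot; unreliable for VLA-area objects
  SlotKind Kind;
};

// Strict weak ordering:
//   1. variable-sized slots go after every other slot,
//   2. otherwise lower offsets come first (assumed direction),
//   3. equal offsets fall back to the frame index, so the result does not
//      depend on the input order.
bool slotLess(const Slot &A, const Slot &B) {
  const bool AIsVarSized = A.Kind == SlotKind::VariableSized;
  const bool BIsVarSized = B.Kind == SlotKind::VariableSized;
  if (AIsVarSized != BIsVarSized)
    return BIsVarSized; // A is the non-VLA slot, so it sorts first
  if (A.Offset != B.Offset)
    return A.Offset < B.Offset;
  return A.FrameIndex < B.FrameIndex;
}

// Sorts a copy of the slots and returns the frame indices in output order.
std::vector<int> sortedFrameIndices(std::vector<Slot> Slots) {
  std::sort(Slots.begin(), Slots.end(), slotLess);
  std::vector<int> Order;
  Order.reserve(Slots.size());
  for (const Slot &S : Slots)
    Order.push_back(S.FrameIndex);
  return Order;
}
```

Because the frame index is unique per slot, the comparator never treats two distinct slots as equivalent, so `std::sort` gives the same order however the input was arranged. That is the same determinism concern af9a452e57 addressed when it replaced the SmallPtrSet with a SmallVector.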