llvm-project

Author	SHA1	Message	Date
David Green	8274be509e	[AArch64] Remove header dependencies of AArch64ISelLowering.h. NFC This patch aims to reduce the include used by AArch64ISelLowering, allowing it to be included by unittests so that they can reference the AArch64ISD nodes. It: - Moves the inclusion of AArch64SMEAttributes.h to the uses. - Moves LowerPtrAuthGlobalAddressStatically to a static function, so that AArch64PACKey is not required in the header. - Moves the definitions of getExceptionPointerRegister to the cpp file, to remove the reference of AArch64::X0.	2024-10-28 18:53:37 +00:00
Benjamin Maxwell	ddd463be7e	[AArch64] Add getStreamingHazardSize() to AArch64Subtarget (#113679 ) This is defined by the `-aarch64-streaming-hazard-size` option or its alias `-aarch64-stack-hazard-size` (the original name). It has been renamed to be more general as this option will (for the time being) be used to detect if the current target has streaming mode memory hazards. --------- Co-authored-by: Hari Limaye <hari.limaye@arm.com>	2024-10-28 13:01:22 +00:00
Jack Styles	86f76c3b17	[AArch64][Libunwind] Add Support for FEAT_PAuthLR DWARF Instruction (#112171 ) As part of FEAT_PAuthLR, a new DWARF Frame Instruction was introduced, `DW_CFA_AARCH64_negate_ra_state_with_pc`. This instructs Libunwind that the PC has been used with the signing instruction. This change includes three commits - Libunwind support for the newly introduced DWARF Instruction - CodeGen Support for the DWARF Instructions - Reversing the changes made in #96377. Due to `DW_CFA_AARCH64_negate_ra_state_with_pc`'s requirements to be placed immediately after the signing instruction, this would mean the CFI Instruction location was not consistent with the generated location when not using FEAT_PAuthLR. The commit reverses the changes and makes the location consistent across the different branch protection options. While this does have a code size effect, this is a negligible one. For the ABI information, see here: `853286c7ab/aadwarf64/aadwarf64.rst (id23)`	2024-10-28 08:22:38 +00:00
Alex Rønne Petersen	ad4a582fd9	[llvm] Consistently respect `naked` fn attribute in `TargetFrameLowering::hasFP()` (#106014 ) Some targets (e.g. PPC and Hexagon) already did this. I think it's best to do this consistently so that frontend authors don't run into inconsistent results when they emit `naked` functions. For example, in Zig, we had to change our emit code to also set `frame-pointer=none` to get reliable results across targets. Note: I don't have commit access.	2024-10-18 09:35:42 +04:00
Sander de Smalen	f314e12494	[AArch64][SME] Fix iterator to fixupCalleeSaveRestoreStackOffset (#110855 ) The iterator passed to `fixupCalleeSaveRestoreStackOffset` may be incorrect when it tries to skip over the instructions that get the current value of 'vg', when there is a 'rdsvl' instruction straight after the prologue. That's because it doesn't check that the instruction is still a 'frame-setup' instruction.	2024-10-15 11:56:40 +01:00
CarolineConcatto	a548eded70	[AArch64][SME]Check streaming mode when using SME2 instruction in frame lowering (#109680) SME instructions can only be used in streaming mode. PTRUE for predicated counter and the ld/st pair can be used when: sve2.1 is available, or sme2 is available and the function is in streaming mode. Previously the frame lowering only checked whether sme2 was available when building the machine instruction. This fix checks that sme2 is available and that the subtarget is in streaming mode.	2024-09-30 08:42:41 +01:00
Sander de Smalen	db054a1970	[AArch64][SME] Fix ADDVL addressing to scavenged stackslot. (#109674 ) In https://reviews.llvm.org/D159196 we avoided stackslot scavenging when there was no FP available. But in the case where FP is available we need to actually prefer using the FP over the BP. This change affects more than just SME, but it should be a general improvement, since any slot above the (address pointed to by) FP is always closer to FP than BP, so it makes sense to always favour using the FP to address it when the FP is available. This also fixes the issue for SME where this is not just preferred but required.	2024-09-24 13:29:30 +01:00
Lukacma	7f0c5b0502	[AArch64]Fix invalid use of ld1/st1 in stack alloc (#105518) This patch fixes incorrect usage of scalar+immediate variant of ld1/st1 instructions during stack allocation caused by c4bac7f7dc. This commit used ld1/st1 even when stack offset was outside of immediate range for this instruction, producing invalid assembly. This commit was also using incorrect offsets when using ld1/st1.	2024-09-05 14:47:10 +01:00
Amara Emerson	39ec1f79b7	[AArch64] Basic SVE PCS support for handling scalable vectors on Darwin. For the tests I just added +sve instead of what actual hardware has, which is only SME, since otherwise all the test functions need to be marked as streaming mode. rdar://121864771	2024-08-20 17:10:51 -07:00
Kerry McLaughlin	9211977d13	[AArch64][SME] Return false from produceCompactUnwindFrame if VG save required. (#104588 ) The compact unwind format requires all registers are stored in pairs, so return false from produceCompactUnwindFrame if we require saving VG.	2024-08-19 10:17:10 +01:00
Amara Emerson	334a366ba7	[AArch64][Darwin][SME] Don't try to save VG to the stack for unwinding. On Darwin we don't have any hardware that has SVE support, only SME. Therefore we don't need to save VG for unwinders and can safely omit it. This also fixes crashes introduced since this feature landed since Darwin's compact unwind code can't handle the presence of VG anyway. rdar://131072344	2024-08-13 01:46:43 -07:00
Hari Limaye	a98a0dcf63	[AArch64] Add streaming-mode stack hazard optimization remarks (#101695 ) Emit an optimization remark when objects in the stack frame may cause hazards in a streaming mode function. The analysis requires either the `aarch64-stack-hazard-size` or `aarch64-stack-hazard-remark-size` flag to be set by the user, with the former flag taking precedence.	2024-08-06 11:39:01 +01:00
David Green	a3cf8642bf	[AArch64] Cleanup existing values in getMemOpInfo (#98196 ) This patch tries to clean up some of the existing values in getMemOpInfo. All values should now be in bytes (not bits), and the MinOffset/MaxOffset are now always represented unscaled (the immediate that will be present in the final instruction). Although I could not find a place where it altered codegen, the offset of a post-index instruction will be 0, not scale*imm. A IsPostIndexLdStOpcode method has been added to try and make sure that case is handled properly.	2024-08-03 12:31:10 +01:00
Hari Limaye	dc1c00f6b1	[StackFrameLayoutAnalysis] Use target-specific hook for SP offsets (#100386 ) StackFrameLayoutAnalysis currently calculates SP-relative offsets in a target-independent way via MachineFrameInfo offsets. This is incorrect for some Targets, e.g. AArch64, when there are scalable vector stack slots. This patch adds a virtual function to TargetFrameLowering to provide offsets from SP, with a default implementation matching what is currently used in StackFrameLayoutAnalysis, and refactors StackFrameLayoutAnalysis to use this function. Only non-zero scalable offsets are output by the analysis pass. An implementation of this function is added for AArch64 targets, which aims to provide correct SP offsets in most cases.	2024-07-25 09:03:48 +01:00
antangelo	6c9086d13f	[AArch64] Support varargs for preserve_nonecc (#99434 ) Adds varargs support for preserve_none by falling back to C argument passing for the target platform for varargs functions. Fixes #95093	2024-07-21 00:29:18 -04:00
Matt Arsenault	a8a7d62d04	AArch64: Avoid using MachineFunction::getMMI	2024-07-20 13:11:39 +04:00
David Green	ae2e66b03b	[AArch64] Use TargetStackID::ScalableVector instead of hard-coded values. NFC	2024-07-19 08:59:26 +01:00
David Green	4b9bcabdf0	[AArch64] Add streaming-mode stack hazards. (#98956 ) Under some SME contexts, a coprocessor with its own separate cache will be used for FPR operations. This can create hazards if the CPU and the SME unit try to access the same area of memory, including if the access is to an area of the stack. To try to alleviate that, this patch attempts to introduce extra padding into the stack frame between FP and GPR accesses, controlled by the StackHazardSize option. Without changing the layout of the stack frame, a stack object of the right size is added between GPR and FPR CSRs. Another is added to the stack objects section, and stack objects are sorted so that FPR > Hazard padding slot > GPRs (where possible). Unfortunately some things are not handled well (VLA area, FPR arguments on the stack, object with both GPR and FPR accesses), but if those are controlled by the user then the entire stack frame becomes GPR at the start/end with FPR in the middle, surrounded by Hazard padding. This can greatly help reduce something that can be difficult for the user to control themselves. The current implementation is opt-in through an -aarch64-stack-hazard-size flag, and should have no effect if the option is unset. In the long run the implementation might change (for example using more base pointers to separate in more cases, re-enabling ldp/stp using an extra register, etc), but this gets at least something for people to use in llvm-19 if they need it. The only change whilst the option is unset will be a fix for making sure the stack increment is added at the right place when it cannot be converted to postinc (++MBBI). I believe without extra padding that can not normally be reached.	2024-07-18 08:16:40 +01:00
David Green	0d7403184d	[AArch64] Add a AArch64InstrInfo::isFpOrNEON method for checking physical register call. NFC	2024-07-15 08:13:52 +01:00
Amara Emerson	9865171e24	[AArch64] Add -mlr-for-calls-only to replace the now removed -ffixed-x30 flag. (#98073 ) This re-introduces the effective behaviour that was reverted in 7ad481e76c9bee5b9895ebfa0fdb52f31cb7de77. This time we're not using the same mechanism, exposing another reservation feature that prevents only regalloc from using the register, but not for other required uses like ABIs. This also fixes a consequent issue with reserving LR, which is that frame lowering was only adding live-in flags for non-reserved regs. This would cause issues later since the outliner needs accurate flags to determine when LR needs to be preserved. rdar://131313095	2024-07-10 15:16:51 -07:00
David Green	cb48ad6603	[AArch64] Clean up formatting of AArch64FrameLowering. NFC	2024-07-03 16:48:07 +01:00
antangelo	f05fa6e0cf	[AArch64] Fix argument passing in reserved registers for preserve_nonecc (#96259 ) These registers include: - X19, used by LLVM as the base pointer - X15 on Windows, where it is used for stack allocation. It can still be used on Linux/Darwin. - Adjust FrameLowering scratch register code to not assume X9 is available if the calling convention is preserve_nonecc. The code will then pick an unused register as scratch, and allow X9 to continue being used for argument passing.	2024-06-23 17:18:35 -04:00
Kerry McLaughlin	93c8e0f2eb	[AArch64][SME] Save VG for unwind info when changing streaming-mode (#83301 ) If a function requires any streaming-mode change, the vector granule value must be stored to the stack and unwind info must also describe the save of VG to this location. This patch adds VG to the list of callee-saved registers and increases the callee-saved stack size if the function requires streaming-mode changes. A new type is added to RegPairInfo, which is also used to skip restoring the register used to spill the VG value in the epilogue. See https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst	2024-06-13 17:42:11 +01:00
Sander de Smalen	c63a622ba7	[AArch64] Disable red-zone when lowering Q-reg copy through memory. (#94962 ) This was pointed out in PR #93940.	2024-06-11 08:58:28 +01:00
Florian Mayer	4e67f45168	Reapply "[MTE] add stack frame history buffer" In the reverted change, the order of the IR was dependent on the host compiler, because we inserted instructions in arguments to functions. Fix that, and also fix another problem with the test. This reverts commit 3313f28897a87ec313ec0b52ef71c14d3b9ff652.	2024-05-29 13:02:58 -07:00
Florian Mayer	3313f28897	Revert "[MTE] add stack frame history buffer" This reverts commit 1f67f34a5cf993f03eca8936bfb7203778c2997a.	2024-05-29 11:21:29 -07:00
Florian Mayer	1f67f34a5c	[MTE] add stack frame history buffer this will allow us to find offending objects in a symbolization step, like we can do with hwasan. needs matching changes in AOSP: https://android-review.git.corp.google.com/q/topic:%22stackhistorybuffer%22 Pull Request: https://github.com/llvm/llvm-project/pull/86356	2024-05-29 10:57:11 -07:00
Nikita Popov	84314d0ae4	Revert "[AArch64][NFC] Switch to LiveRegUnits (#87313 )" This reverts commit 0f8a74732aa352e5e6dfbf74a53f015b772c5743. PR merged without approval.	2024-05-27 08:29:59 +02:00
AtariDreams	0f8a74732a	[AArch64][NFC] Switch to LiveRegUnits (#87313 )	2024-05-26 19:49:07 -04:00
Jie Fu	5a20a07fce	[AArch64] Fix -Wunused-variable in AArch64FrameLowering.cpp (NFC) llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:3084:31: error: unused variable 'Subtarget' [-Werror,-Wunused-variable] const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>(); ^ llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:3253:31: error: unused variable 'Subtarget' [-Werror,-Wunused-variable] const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>(); ^ 2 errors generated.	2024-05-17 16:37:43 +08:00
CarolineConcatto	c4bac7f7dc	[LLVM][AArch64]Use load/store with consecutive registers in SME2 or SVE2.1 for spill/fill (#77665) When possible the spill/fill register in Frame Lowering uses the ld/st consecutive pairs available in sme or sve2.1.	2024-05-17 09:25:21 +01:00
Fangrui Song	5a12f2867a	LLVM_FALLTHROUGH => [[fallthrough]]. NFC	2024-04-25 17:50:59 -07:00
Kai Nacke	21d177096f	[NFC] Refactor looping over recomputeLiveIns into function (#88040) https://github.com/llvm/llvm-project/pull/79940 put calls to recomputeLiveIns into a loop, to repeatedly call the function until the computation converges. However, this repeats a lot of code. This change moves the loop into a function to simplify the handling. Note that this changes the order in which recomputeLiveIns is called. For example, ``` bool anyChange = false; do { anyChange = recomputeLiveIns(ExitMBB) || recomputeLiveIns(LoopMBB); } while (anyChange); ``` only begins to recompute the live-ins for LoopMBB after the computation for ExitMBB has converged. With this change, all basic blocks have a recomputation of the live-ins for each loop iteration. This can result in fewer or more calls, depending on the situation.	2024-04-15 17:12:25 -04:00
Jay Foad	7a0e222a17	Revert "Convert many LivePhysRegs uses to LiveRegUnits (#83905 )" This reverts commit 2a13422b8bcee449405e3ebff957b4020805f91c. It was causing test failures on the expensive check builders.	2024-03-07 08:20:26 +00:00
AtariDreams	2a13422b8b	Convert many LivePhysRegs uses to LiveRegUnits (#83905 )	2024-03-06 10:38:14 +05:30
Sander de Smalen	1f99a45012	[AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326 ) This patch removes the `-reverse-csr-restore-seq` option from AArch64FrameLowering, since this is no longer used. This patch was reverted because of a crash in PR#79623. Merging it back as it was fixed in PR#82492.	2024-02-22 12:01:53 +00:00
CarolineConcatto	c5253aa136	[AArch64] Restore Z-registers before P-registers (#79623 ) (#82492 ) This is needed by PR#77665[1] that uses a P-register while restoring Z-registers. The reverse for SVE register restore in the epilogue was added to guarantee performance, but further work was done to improve sve frame restore and besides that the schedule also may change the order of the restore, undoing the reverse restore. This also fix the problem reported in (PR #79623) on Windows with std::reverse and .base(). [1]https://github.com/llvm/llvm-project/pull/77665	2024-02-22 09:19:48 +00:00
Momchil Velikov	1a7166833d	[AArch64] Fix stack probing clobbering flags (#81879 ) Certain stack probing sequences might clobber flags, then we can't use a block as a prologue if the flags register is a live-in on entry to that block.	2024-02-21 13:58:04 +00:00
Caroline Concatto	48af281f7a	Revert "[AArch64] Restore Z-registers before P-registers (#79623 )" This reverts commit 3f0404aae7ed2f7138526e1bcd100a60dfe08227. std::reverse is breaking some builds	2024-02-20 18:13:33 +00:00
Caroline Concatto	7af70643ca	Revert "[AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326 )" Patch 3f0404aae7ed2 is breaking some debugs build so we cannot use the reverse here. This reverts commit 493f10106f7f1799eb67be95058b251e6a3bf0af.	2024-02-20 18:13:33 +00:00
Sander de Smalen	493f10106f	[AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326 ) This patch removes the `-reverse-csr-restore-seq` option from AArch64FrameLowering, since this is no longer used.	2024-02-20 15:08:06 +00:00
CarolineConcatto	3f0404aae7	[AArch64] Restore Z-registers before P-registers (#79623 ) This is needed by PR#77665[1] that uses a P-register while restoring Z-registers. The reverse for SVE register restore in the epilogue was added to guarantee performance, but further work was done to improve sve frame restore and besides that the schedule also may change the order of the restore, undoing the reverse restore. [1]https://github.com/llvm/llvm-project/pull/77665	2024-02-19 13:39:24 +00:00
Momchil Velikov	658e4763a2	[AArch64] Fix wrong condition in `canUseAsPrologue` (#81878) Inline stack probing code may need a scratch register, hence basic blocks where such a register is not available cannot be used as prologues. Checking for an available scratch register was incorrectly skipped when the function uses stack probing.	2024-02-19 10:40:21 +00:00
Hiroshi Yamauchi	692566a8b2	Fix an assert failure with a funclet in a swifttailcc function. (#78806 ) The failure happens in the livedebugvalues pass.	2024-02-15 15:54:03 -08:00
Oskar Wirga	ff4636a4ab	Refactor recomputeLiveIns to converge on added MachineBasicBlocks (#79940 ) This is a fix for the regression seen in https://github.com/llvm/llvm-project/pull/79498 > Currently, the way that recomputeLiveIns works is that it will recompute the livein registers for that MachineBasicBlock but it matters what order you call recomputeLiveIn which can result in incorrect register allocations down the line. Now we do not recompute the entire CFG but we do ensure that the newly added MBB do reach convergence.	2024-01-30 19:33:04 -08:00
Nikita Popov	07a1925b8b	Revert "Refactor recomputeLiveIns to operate on whole CFG (#79498 )" This reverts commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b. Introduces a major compile-time regression.	2024-01-26 22:33:17 +01:00
Oskar Wirga	59bf60519f	Refactor recomputeLiveIns to operate on whole CFG (#79498 ) Currently, the way that recomputeLiveIns works is that it will recompute the livein registers for that MachineBasicBlock but it matters what order you call recomputeLiveIn which can result in incorrect register allocations down the line. This PR fixes that by simply recomputing the liveins for the entire CFG until convergence is achieved. This makes it harder to introduce subtle bugs which alter liveness.	2024-01-26 11:25:36 -08:00
Mikael Holmen	90c326b198	[AArch64] Fix gcc warning about mix of enumeral and non-enumeral types [NFC] Change the return type of findScratchNonCalleeSaveRegister to Register instead of unsigned. Every place the function is called we already put the returned value in a Register variable or compare it with another Register. This fixes some gcc warnings: ../lib/Target/AArch64/AArch64FrameLowering.cpp:744: warning: enumeral and non-enumeral type in conditional expression [-Wextra] 743 | Register TargetReg = RealignmentPadding | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 744 | ? findScratchNonCalleeSaveRegister(&MBB) | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 745 | : AArch64::SP; | ../lib/Target/AArch64/AArch64FrameLowering.cpp:803: warning: enumeral and non-enumeral type in conditional expression [-Wextra] 802 | Register ScratchReg = RealignmentPadding | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 803 | ? findScratchNonCalleeSaveRegister(&MBB) | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 804 | : AArch64::SP; |	2024-01-25 07:56:16 +01:00
Eli Friedman	a6065f0fa5	Arm64EC entry/exit thunks, consolidated. (#79067 ) This combines the previously posted patches with some additional work I've done to more closely match MSVC output. Most of the important logic here is implemented in AArch64Arm64ECCallLowering. The purpose of the AArch64Arm64ECCallLowering is to take "normal" IR we'd generate for other targets, and generate most of the Arm64EC-specific bits: generating thunks, mangling symbols, generating aliases, and generating the .hybmp$x table. This is all done late for a few reasons: to consolidate the logic as much as possible, and to ensure the IR exposed to optimization passes doesn't contain complex arm64ec-specific constructs. The other changes are supporting changes, to handle the new constructs generated by that pass. There's a global llvm.arm64ec.symbolmap representing the .hybmp$x entries for the thunks. This gets handled directly by the AsmPrinter because it needs symbol indexes that aren't available before that. There are two new calling conventions used to represent calls to and from thunks: ARM64EC_Thunk_X64 and ARM64EC_Thunk_Native. There are a few changes to handle the associated exception-handling info, SEH_SaveAnyRegQP and SEH_SaveAnyRegQPX. I've intentionally left out handling for structs with small non-power-of-two sizes, because that's easily separated out. The rest of my current work is here. I squashed my current patches because they were split in ways that didn't really make sense. Maybe I could split out some bits, but it's hard to meaningfully test most of the parts independently. Thanks to @dpaoliello for extensive testing and suggestions. (Originally posted as https://reviews.llvm.org/D157547 .)	2024-01-22 21:28:07 -08:00
Florian Hahn	58dcac3948	[AArch64] Check X16&X17 in prologue if the fn has an SwiftAsyncContext. (#73945 ) StoreSwiftAsyncContext clobbers X16 & X17. Make sure they are available in canUseAsPrologue, to avoid shrink wrapping moving the pseudo to a place where X16 or X17 are live.	2023-12-05 11:41:40 +00:00


369 Commits