llvm-project

Author	SHA1	Message	Date
Hiroshi Yamauchi	425d25f5df	[AArch64][WinCFI] Fix a crash due to missing seh directives (#123993 ) https://github.com/llvm/llvm-project/issues/123808	2025-01-24 14:01:41 -08:00
ssijaric-nv	6789442eb2	[AArch64] Fix a corner case with large stack allocation (#122038 ) In the unlikely case where the stack size is greater than 4GB, we may run into the situation where the local stack size and the callee saved registers stack size get combined incorrectly when restoring the callee saved registers. This happens because the stack size in shouldCombineCSRLocalStackBumpInEpilogue is represented as an 'unsigned', but is passed in as an 'int64_t'. We end up with something like $fp, $lr = frame-destroy LDPXi $sp, 536870912 This change just makes 'shouldCombineCSRLocalStackBumpInEpilogue' match 'shouldCombineCSRLocalStackBump' where 'StackBumpBytes' is an 'uint64_t'	2025-01-18 22:09:25 -08:00
Benjamin Maxwell	32a4650f3c	[AArch64] Avoid hardcoding spill size/align in FrameLowering (NFC) (#123080 ) This is already defined for each register class in AArch64RegisterInfo, not hardcoding it here makes these values easier to change (perhaps based on hardware mode).	2025-01-17 10:10:21 +00:00
Benjamin Maxwell	2c9dc089fd	[AArch64] Use spill size when calculating callee saves size (NFC) (#123086 ) This is an NFC right now, as currently, all register and spill sizes are the same, but the spill size is the correct size to use here.	2025-01-17 10:09:31 +00:00
Guy David	1a935d7a17	[llvm] Mark scavenging spill-slots as spilled stack objects. (#122673 ) This seems like an oversight when copying code from other backends.	2025-01-14 10:18:31 +02:00
Oliver Stannard	98b694b660	[AArch64] Fix range check for STGPostIndex (#117146 ) When generating function epilogues using AArch64 stack tagging, we can fold an SP update into the tag-setting loop. The loop tags 32 bytes at a time using ST2G, so the final SP update might be done either by a post indexed STG which tags the final 16 bytes of the tagged region, or by an ADD/SUB instruction after the loop. However, we were only considering the range of the ADD/SUB instructions when deciding whether to do this, and the valid immediate range for STG is slightly lower when the offset is positive, because it is a signed immediate, and must include the extra 16 bytes being tagged.	2024-12-09 09:25:31 +00:00
Oliver Stannard	7d72525909	[AArch64] Fix STG instruction being moved past memcpy (#117191 ) When merging STG instructions used for AArch64 stack tagging, we were stopping on reaching a load or store instruction, but not calls, so it was possible for an STG to be moved past a call to memcpy. This test case (reduced from fuzzer-generated C code) was the result of StackColoring merging allocas A and B into one stack slot, and StackSafetyAnalysis proving that B does not need tagging, so we end up with tagged and untagged objects in the same stack slot. The tagged object (A) is live first, so it is important that its memory is restored to the background tag before it gets reused to hold B.	2024-12-03 10:32:52 +00:00
Benjamin Maxwell	5248e1d4e1	[AArch64] Fix frame-pointer offset with hazard padding (#118091 ) The `-aarch64-stack-hazard-size=<val>` option disables register pairing (as the hazard padding may mean the offset is too large for STP/LDP). This broke setting the frame-pointer offset, as the code to find the frame record looked for a (FP, LR) register pair. This patch resolves this by looking for FP, LR as two unpaired registers when hazard padding is enabled.	2024-12-02 09:32:15 +00:00
Benjamin Maxwell	83c7784c35	[AArch64] Don't emit Neon in streaming[-compatible] functions with -fzero-call-used-regs (#116995 ) Previously, with `-fzero-call-used-regs` clang/LLVM would incorrectly emit Neon instructions in streaming functions, and streaming-compatible functions without SVE. With this change: * In streaming functions, Z/p registers will be zeroed * In streaming compatible functions w/o SVE, D registers will be zeroed - (As Neon vector instructions are illegal including `movi v..`)	2024-11-21 11:02:07 +00:00
Kazu Hirata	a41922ad75	[AArch64] Remove unused includes (NFC) (#115685 ) Identified with misc-include-cleaner.	2024-11-11 07:35:08 -08:00
David Green	8274be509e	[AArch64] Remove header dependencies of AArch64ISelLowering.h. NFC This patch aims to reduce the include used by AArch64ISelLowering, allowing it to be included by unittests so that they can reference the AArch64ISD nodes. It: - Moves the inclusion of AArch64SMEAttributes.h to the uses. - Moves LowerPtrAuthGlobalAddressStatically to a static function, so that AArch64PACKey is not required in the header. - Moves the definitions of getExceptionPointerRegister to the cpp file, to remove the reference of AArch64::X0.	2024-10-28 18:53:37 +00:00
Benjamin Maxwell	ddd463be7e	[AArch64] Add getStreamingHazardSize() to AArch64Subtarget (#113679 ) This is defined by the `-aarch64-streaming-hazard-size` option or its alias `-aarch64-stack-hazard-size` (the original name). It has been renamed to be more general as this option will (for the time being) be used to detect if the current target has streaming mode memory hazards. --------- Co-authored-by: Hari Limaye <hari.limaye@arm.com>	2024-10-28 13:01:22 +00:00
Jack Styles	86f76c3b17	[AArch64][Libunwind] Add Support for FEAT_PAuthLR DWARF Instruction (#112171 ) As part of FEAT_PAuthLR, a new DWARF Frame Instruction was introduced, `DW_CFA_AARCH64_negate_ra_state_with_pc`. This instructs Libunwind that the PC has been used with the signing instruction. This change includes three commits - Libunwind support for the newly introduced DWARF Instruction - CodeGen Support for the DWARF Instructions - Reversing the changes made in #96377. Due to `DW_CFA_AARCH64_negate_ra_state_with_pc`'s requirements to be placed immediately after the signing instruction, this would mean the CFI Instruction location was not consistent with the generated location when not using FEAT_PAuthLR. The commit reverses the changes and makes the location consistent across the different branch protection options. While this does have a code size effect, this is a negligible one. For the ABI information, see here: `853286c7ab/aadwarf64/aadwarf64.rst (id23)`	2024-10-28 08:22:38 +00:00
Alex Rønne Petersen	ad4a582fd9	[llvm] Consistently respect `naked` fn attribute in `TargetFrameLowering::hasFP()` (#106014 ) Some targets (e.g. PPC and Hexagon) already did this. I think it's best to do this consistently so that frontend authors don't run into inconsistent results when they emit `naked` functions. For example, in Zig, we had to change our emit code to also set `frame-pointer=none` to get reliable results across targets. Note: I don't have commit access.	2024-10-18 09:35:42 +04:00
Sander de Smalen	f314e12494	[AArch64][SME] Fix iterator to fixupCalleeSaveRestoreStackOffset (#110855 ) The iterator passed to `fixupCalleeSaveRestoreStackOffset` may be incorrect when it tries to skip over the instructions that get the current value of 'vg', when there is a 'rdsvl' instruction straight after the prologue. That's because it doesn't check that the instruction is still a 'frame-setup' instruction.	2024-10-15 11:56:40 +01:00
CarolineConcatto	a548eded70	[AArch64][SME]Check streaming mode when using SME2 instruction in frame lowering (#109680 ) SME instructions can only be used in streaming mode. PTRUE for predicated counter and the ld/st pair can be used when: sve2.1 is available or sme2 available in function in streaming mode. Previously the frame lowering only checked whether sme2 is available when building the machine instruction. This fix checks that sme2 is available and that the subtarget is in streaming mode	2024-09-30 08:42:41 +01:00
Sander de Smalen	db054a1970	[AArch64][SME] Fix ADDVL addressing to scavenged stackslot. (#109674 ) In https://reviews.llvm.org/D159196 we avoided stackslot scavenging when there was no FP available. But in the case where FP is available we need to actually prefer using the FP over the BP. This change affects more than just SME, but it should be a general improvement, since any slot above the (address pointed to by) FP is always closer to FP than BP, so it makes sense to always favour using the FP to address it when the FP is available. This also fixes the issue for SME where this is not just preferred but required.	2024-09-24 13:29:30 +01:00
Lukacma	7f0c5b0502	[AArch64]Fix invalid use of ld1/st1 in stack alloc (#105518 ) This patch fixes incorrect usage of scalar+immediate variant of ld1/st1 instructions during stack allocation caused by [c4bac7f](`c4bac7f7dc`). This commit used ld1/st1 even when stack offset was outside of immediate range for this instruction, producing invalid assembly. This commit was also using incorrect offsets when using ld1/st1.	2024-09-05 14:47:10 +01:00
Amara Emerson	39ec1f79b7	[AArch64] Basic SVE PCS support for handling scalable vectors on Darwin. For the tests I just added +sve instead of what actual hardware has, which is only SME, since otherwise all the test functions need to be marked as streaming mode. rdar://121864771	2024-08-20 17:10:51 -07:00
Kerry McLaughlin	9211977d13	[AArch64][SME] Return false from produceCompactUnwindFrame if VG save required. (#104588 ) The compact unwind format requires all registers are stored in pairs, so return false from produceCompactUnwindFrame if we require saving VG.	2024-08-19 10:17:10 +01:00
Amara Emerson	334a366ba7	[AArch64][Darwin][SME] Don't try to save VG to the stack for unwinding. On Darwin we don't have any hardware that has SVE support, only SME. Therefore we don't need to save VG for unwinders and can safely omit it. This also fixes crashes introduced since this feature landed since Darwin's compact unwind code can't handle the presence of VG anyway. rdar://131072344	2024-08-13 01:46:43 -07:00
Hari Limaye	a98a0dcf63	[AArch64] Add streaming-mode stack hazard optimization remarks (#101695 ) Emit an optimization remark when objects in the stack frame may cause hazards in a streaming mode function. The analysis requires either the `aarch64-stack-hazard-size` or `aarch64-stack-hazard-remark-size` flag to be set by the user, with the former flag taking precedence.	2024-08-06 11:39:01 +01:00
David Green	a3cf8642bf	[AArch64] Cleanup existing values in getMemOpInfo (#98196 ) This patch tries to clean up some of the existing values in getMemOpInfo. All values should now be in bytes (not bits), and the MinOffset/MaxOffset are now always represented unscaled (the immediate that will be present in the final instruction). Although I could not find a place where it altered codegen, the offset of a post-index instruction will be 0, not scale*imm. A IsPostIndexLdStOpcode method has been added to try and make sure that case is handled properly.	2024-08-03 12:31:10 +01:00
Hari Limaye	dc1c00f6b1	[StackFrameLayoutAnalysis] Use target-specific hook for SP offsets (#100386 ) StackFrameLayoutAnalysis currently calculates SP-relative offsets in a target-independent way via MachineFrameInfo offsets. This is incorrect for some Targets, e.g. AArch64, when there are scalable vector stack slots. This patch adds a virtual function to TargetFrameLowering to provide offsets from SP, with a default implementation matching what is currently used in StackFrameLayoutAnalysis, and refactors StackFrameLayoutAnalysis to use this function. Only non-zero scalable offsets are output by the analysis pass. An implementation of this function is added for AArch64 targets, which aims to provide correct SP offsets in most cases.	2024-07-25 09:03:48 +01:00
antangelo	6c9086d13f	[AArch64] Support varargs for preserve_nonecc (#99434 ) Adds varargs support for preserve_none by falling back to C argument passing for the target platform for varargs functions. Fixes #95093	2024-07-21 00:29:18 -04:00
Matt Arsenault	a8a7d62d04	AArch64: Avoid using MachineFunction::getMMI	2024-07-20 13:11:39 +04:00
David Green	ae2e66b03b	[AArch64] Use TargetStackID::ScalableVector instead of hard-coded values. NFC	2024-07-19 08:59:26 +01:00
David Green	4b9bcabdf0	[AArch64] Add streaming-mode stack hazards. (#98956 ) Under some SME contexts, a coprocessor with its own separate cache will be used for FPR operations. This can create hazards if the CPU and the SME unit try to access the same area of memory, including if the access is to an area of the stack. To try to alleviate that, this patch attempts to introduce extra padding into the stack frame between FP and GPR accesses, controlled by the StackHazardSize option. Without changing the layout of the stack frame, a stack object of the right size is added between GPR and FPR CSRs. Another is added to the stack objects section, and stack objects are sorted so that FPR > Hazard padding slot > GPRs (where possible). Unfortunately some things are not handled well (VLA area, FPR arguments on the stack, object with both GPR and FPR accesses), but if those are controlled by the user then the entire stack frame becomes GPR at the start/end with FPR in the middle, surrounded by Hazard padding. This can greatly help reduce something that can be difficult for the user to control themselves. The current implementation is opt-in through an -aarch64-stack-hazard-size flag, and should have no effect if the option is unset. In the long run the implementation might change (for example using more base pointers to separate in more cases, re-enabling ldp/stp using an extra register, etc), but this gets at least something for people to use in llvm-19 if they need it. The only change whilst the option is unset will be a fix for making sure the stack increment is added at the right place when it cannot be converted to postinc (++MBBI). I believe without extra padding that can not normally be reached.	2024-07-18 08:16:40 +01:00
David Green	0d7403184d	[AArch64] Add a AArch64InstrInfo::isFpOrNEON method for checking physical register call. NFC	2024-07-15 08:13:52 +01:00
Amara Emerson	9865171e24	[AArch64] Add -mlr-for-calls-only to replace the now removed -ffixed-x30 flag. (#98073 ) This re-introduces the effective behaviour that was reverted in 7ad481e76c9bee5b9895ebfa0fdb52f31cb7de77. This time we're not using the same mechanism, exposing another reservation feature that prevents only regalloc from using the register, but not for other required uses like ABIs. This also fixes a consequent issue with reserving LR, which is that frame lowering was only adding live-in flags for non-reserved regs. This would cause issues later since the outliner needs accurate flags to determine when LR needs to be preserved. rdar://131313095	2024-07-10 15:16:51 -07:00
David Green	cb48ad6603	[AArch64] Clean up formatting of AArch64FrameLowering. NFC	2024-07-03 16:48:07 +01:00
antangelo	f05fa6e0cf	[AArch64] Fix argument passing in reserved registers for preserve_nonecc (#96259 ) These registers include: - X19, used by LLVM as the base pointer - X15 on Windows, where it is used for stack allocation. It can still be used on Linux/Darwin. - Adjust FrameLowering scratch register code to not assume X9 is available if the calling convention is preserve_nonecc. The code will then pick an unused register as scratch, and allow X9 to continue being used for argument passing.	2024-06-23 17:18:35 -04:00
Kerry McLaughlin	93c8e0f2eb	[AArch64][SME] Save VG for unwind info when changing streaming-mode (#83301 ) If a function requires any streaming-mode change, the vector granule value must be stored to the stack and unwind info must also describe the save of VG to this location. This patch adds VG to the list of callee-saved registers and increases the callee-saved stack size if the function requires streaming-mode changes. A new type is added to RegPairInfo, which is also used to skip restoring the register used to spill the VG value in the epilogue. See https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst	2024-06-13 17:42:11 +01:00
Sander de Smalen	c63a622ba7	[AArch64] Disable red-zone when lowering Q-reg copy through memory. (#94962 ) This was pointed out in PR #93940.	2024-06-11 08:58:28 +01:00
Florian Mayer	4e67f45168	Reapply "[MTE] add stack frame history buffer" In the reverted change, the order of the IR was dependent on the host compiler, because we inserted instructions in arguments to functions. Fix that, and also fix another problem with the test. This reverts commit 3313f28897a87ec313ec0b52ef71c14d3b9ff652.	2024-05-29 13:02:58 -07:00
Florian Mayer	3313f28897	Revert "[MTE] add stack frame history buffer" This reverts commit 1f67f34a5cf993f03eca8936bfb7203778c2997a.	2024-05-29 11:21:29 -07:00
Florian Mayer	1f67f34a5c	[MTE] add stack frame history buffer this will allow us to find offending objects in a symbolization step, like we can do with hwasan. needs matching changes in AOSP: https://android-review.git.corp.google.com/q/topic:%22stackhistorybuffer%22 Pull Request: https://github.com/llvm/llvm-project/pull/86356	2024-05-29 10:57:11 -07:00
Nikita Popov	84314d0ae4	Revert "[AArch64][NFC] Switch to LiveRegUnits (#87313 )" This reverts commit 0f8a74732aa352e5e6dfbf74a53f015b772c5743. PR merged without approval.	2024-05-27 08:29:59 +02:00
AtariDreams	0f8a74732a	[AArch64][NFC] Switch to LiveRegUnits (#87313 )	2024-05-26 19:49:07 -04:00
Jie Fu	5a20a07fce	[AArch64] Fix -Wunused-variable in AArch64FrameLowering.cpp (NFC) llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:3084:31: error: unused variable 'Subtarget' [-Werror,-Wunused-variable] const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>(); ^ llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:3253:31: error: unused variable 'Subtarget' [-Werror,-Wunused-variable] const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>(); ^ 2 errors generated.	2024-05-17 16:37:43 +08:00
CarolineConcatto	c4bac7f7dc	[LLVM][AArch64]Use load/store with consecutive registers in SME2 or SVE2.1 for spill/fill (#77665 ) When possible the spill/fill register in Frame Lowering uses the ld/st consecutive pairs available in sme or sve2.1.	2024-05-17 09:25:21 +01:00
Fangrui Song	5a12f2867a	LLVM_FALLTHROUGH => [[fallthrough]]. NFC	2024-04-25 17:50:59 -07:00
Kai Nacke	21d177096f	[NFC] Refactor looping over recomputeLiveIns into function (#88040 ) https://github.com/llvm/llvm-project/pull/79940 put calls to recomputeLiveIns into a loop, to repeatedly call the function until the computation converges. However, this repeats a lot of code. This change moves the loop into a function to simplify the handling. Note that this changes the order in which recomputeLiveIns is called. For example, ``` bool anyChange = false; do { anyChange = recomputeLiveIns(ExitMBB) || recomputeLiveIns(LoopMBB); } while (anyChange); ``` only begins to recompute the live-ins for LoopMBB after the computation for ExitMBB has converged. With this change, all basic blocks have a recomputation of the live-ins for each loop iteration. This can result in fewer or more calls, depending on the situation.	2024-04-15 17:12:25 -04:00
Jay Foad	7a0e222a17	Revert "Convert many LivePhysRegs uses to LiveRegUnits (#83905 )" This reverts commit 2a13422b8bcee449405e3ebff957b4020805f91c. It was causing test failures on the expensive check builders.	2024-03-07 08:20:26 +00:00
AtariDreams	2a13422b8b	Convert many LivePhysRegs uses to LiveRegUnits (#83905 )	2024-03-06 10:38:14 +05:30
Sander de Smalen	1f99a45012	[AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326 ) This patch removes the `-reverse-csr-restore-seq` option from AArch64FrameLowering, since this is no longer used. This patch was reverted because of a crash in PR#79623. Merging it back as it was fixed in PR#82492.	2024-02-22 12:01:53 +00:00
CarolineConcatto	c5253aa136	[AArch64] Restore Z-registers before P-registers (#79623 ) (#82492 ) This is needed by PR#77665[1] that uses a P-register while restoring Z-registers. The reverse for SVE register restore in the epilogue was added to guarantee performance, but further work was done to improve sve frame restore and besides that the schedule also may change the order of the restore, undoing the reverse restore. This also fixes the problem reported in (PR #79623) on Windows with std::reverse and .base(). [1]https://github.com/llvm/llvm-project/pull/77665	2024-02-22 09:19:48 +00:00
Momchil Velikov	1a7166833d	[AArch64] Fix stack probing clobbering flags (#81879 ) Certain stack probing sequences might clobber flags, then we can't use a block as a prologue if the flags register is a live-in on entry to that block.	2024-02-21 13:58:04 +00:00
Caroline Concatto	48af281f7a	Revert "[AArch64] Restore Z-registers before P-registers (#79623 )" This reverts commit 3f0404aae7ed2f7138526e1bcd100a60dfe08227. std::reverse is breaking some builds	2024-02-20 18:13:33 +00:00
Caroline Concatto	7af70643ca	Revert "[AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326 )" Patch 3f0404aae7ed2 is breaking some debugs build so we cannot use the reverse here. This reverts commit 493f10106f7f1799eb67be95058b251e6a3bf0af.	2024-02-20 18:13:33 +00:00
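
Commit 6789442eb2 in the list above describes a stack size being passed through an 'unsigned' parameter, so that frames larger than 4GB were misjudged. The following is a minimal sketch of that class of bug, not the actual LLVM code: `combineNarrow`, `combineWide`, `BigFrame` and the 512-byte threshold are all hypothetical names and values chosen for illustration, and the sketch assumes a 32-bit `unsigned`.

```cpp
#include <cstdint>

// Hypothetical stand-ins for a predicate taking the stack bump as 'unsigned'
// versus 'uint64_t'. The 512-byte threshold is an assumption for illustration.
constexpr bool combineNarrow(unsigned StackBumpBytes) {
  return StackBumpBytes < 512;
}
constexpr bool combineWide(uint64_t StackBumpBytes) {
  return StackBumpBytes < 512;
}

static_assert(sizeof(unsigned) == 4, "sketch assumes a 32-bit unsigned");

// A frame slightly larger than 4 GiB.
constexpr uint64_t BigFrame = (uint64_t{1} << 32) + 16;

// Converting to 'unsigned' keeps only the low 32 bits (16 here), so the
// narrow predicate wrongly treats the frame as small. The cast spells out
// what an implicit conversion at the call site would do silently.
static_assert(combineNarrow(static_cast<unsigned>(BigFrame)),
              "truncated value looks small");
static_assert(!combineWide(BigFrame), "full-width value is rejected");
```

Both checks are evaluated at compile time, so the file compiling is itself the demonstration: the same frame size is accepted by the narrow signature and rejected by the wide one.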


379 Commits
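
Commit 21d177096f in the list above notes that a loop written with `||` only starts on the second block once the first has converged, while the refactored loop visits every block on each iteration. A small sketch of that difference follows. It is not the LLVM implementation: `Block`, `recompute`, `convergeShortCircuit` and `convergeAll` are hypothetical, and the blocks here converge independently of each other, which real live-in computation does not guarantee.

```cpp
#include <initializer_list>

// Hypothetical model of a basic block: it reports "changed" for a fixed
// number of rounds, and counts how often it was recomputed.
struct Block {
  int RoundsLeft;
  int Calls;
};

// Hypothetical stand-in for recomputeLiveIns: returns true while the
// block's state is still changing.
bool recompute(Block &B) {
  ++B.Calls;
  if (B.RoundsLeft > 0) {
    --B.RoundsLeft;
    return true;
  }
  return false;
}

// Short-circuit form: Loop is only recomputed once Exit reports no change.
void convergeShortCircuit(Block &Exit, Block &Loop) {
  bool AnyChange;
  do {
    AnyChange = recompute(Exit) || recompute(Loop);
  } while (AnyChange);
}

// Refactored form: every block is recomputed on every iteration.
void convergeAll(Block &Exit, Block &Loop) {
  bool AnyChange;
  do {
    AnyChange = false;
    for (Block *B : {&Exit, &Loop})
      if (recompute(*B))
        AnyChange = true;
  } while (AnyChange);
}
```

With two blocks that each need two rounds, the short-circuit form makes 8 calls in total and the per-iteration form makes 6. Other inputs can tip the balance the other way, which matches the commit's remark that the call count may go up or down.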