llvm-project

Author	SHA1	Message	Date
Matthew Devereau	51e3d2f73d	[AArch64][SME] Conditionally do smstart/smstop (#77113) This patch adds conditional enabling/disabling of streaming mode for functions which have both the aarch64_pstate_sm_compatible and aarch64_pstate_sm_body attributes. This combination allows callees to determine if switching streaming mode is required instead of relying on the caller.	2024-01-18 09:17:23 +00:00
Tomas Matheson	7bd17212ef	Re-land "[AArch64] Codegen support for FEAT_PAuthLR" (#75947) This reverts commit 9f0f5587426a4ff24b240018cf8bf3acc3c566ae. Fix expensive checks failure by properly marking register def for ADR.	2023-12-21 18:32:55 +00:00
Tomas Matheson	9f0f558742	Revert "[AArch64] Codegen support for FEAT_PAuthLR" This reverts commit 5992ce90b8c0fac06436c3c86621fbf6d5398ee5. Buildbot failures with expensive checks enabled.	2023-12-21 16:25:55 +00:00
Tomas Matheson	5992ce90b8	[AArch64] Codegen support for FEAT_PAuthLR - Adds a new +pc option to -mbranch-protection that will enable the use of PC as a diversifier in PAC branch protection code. - When +pauth-lr is enabled (-march=armv9.5a+pauth-lr) in combination with -mbranch-protection=pac-ret+pc, the new 9.5-a instructions (pacibsppc, retaasppc, etc) are used. Documentation for the relevant instructions can be found here: https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/ Co-authored-by: Lucas Prates <lucas.prates@arm.com>	2023-12-21 14:18:33 +00:00
Momchil Velikov	cc944f502f	[AArch64] Stack probing for function prologues (#66524) This adds code to AArch64 function prologues to protect against stack clash attacks by probing (writing to) the stack at regular enough intervals to ensure that the guard page cannot be skipped over. The patch depends on and maintains the following invariants: * Upon function entry the caller guarantees that it has probed the stack (e.g. performed a store) at some address [sp, #N], where `0 <= N <= 1024`. This invariant comes from a requirement for compatibility with GCC. * Any address range in the allocated stack, no smaller than stack-probe-size bytes, contains at least one probe. * At any time the stack pointer is above or in the guard page. * Probes are performed in decreasing address order. The stack-probe-size is a function attribute that can be set by a platform to correspond to the guard page size. By default, the stack probe size is 4KiB, which is a safe default as this is the smallest possible page size for AArch64. Linux uses a 64KiB guard for AArch64, so this can be overridden by the stack-probe-size function attribute. For small frames without a frame pointer (<= 240 bytes), no probes are needed. For larger frame sizes, LLVM always stores x29 to the stack. This serves as an implicit stack probe. Thus, while allocating stack objects the compiler assumes that the stack has been probed at [sp]. There are multiple probing sequences that can be emitted, depending on the size of the stack allocation: * A straight-line sequence of subtracts and stores, used when the allocation size is smaller than 5 guard pages. * A loop allocating and probing one page size per iteration, plus at most a single probe to deal with the remainder, used when the allocation size is larger but still known at compile time. * A loop which moves the SP down to the target value held in a register (or a loop, moving a scratch register to the target value held in SP), used when the allocation size is not known at compile-time, such as when allocating space for SVE values, or when over-aligning the stack. This is emitted in AArch64InstrInfo because it will also be used for dynamic allocas in a future patch. * A single probe where the amount of stack adjustment is unknown, but is known to be less than or equal to a page size. --------- Co-authored-by: Oliver Stannard <oliver.stannard@linaro.org>	2023-11-30 17:41:51 +00:00
Anatoly Trosinenko	1d2b558265	[AArch64][PAC] Check authenticated LR value during tail call When performing a tail call, check the value of LR register after authentication to prevent the callee from signing and spilling an untrusted value. This commit implements a few variants of check, more can be added later. If it is safe to assume that executable pages are always readable, LR can be checked just by dereferencing the LR value via LDR. As an alternative, LR can be checked as follows: ; lowered AUT* instruction ; <some variant of check that LR contains a valid address> b.cond break_block ret_block: ; lowered TCRETURN break_block: brk 0xc471 As the existing methods either break the compatibility with execute-only memory mappings or can degrade the performance, they are disabled by default and can be explicitly enabled with a command line option. Individual subtargets can opt-in to use one of the available methods by updating AArch64FrameLowering::getAuthenticatedLRCheckMethod(). Reviewed By: kristof.beyls Differential Revision: https://reviews.llvm.org/D156716	2023-10-11 17:38:17 +03:00
Sander de Smalen	9e9be99c97	[AArch64][SME] Disable remat of VL-dependent ops when function changes streaming mode. This is a way to prevent the register allocator from inserting instructions which behave differently for different runtime vector-lengths, inside a call-sequence which changes the streaming-SVE mode before/after the call. I've considered using BUNDLEs in Machine IR, but found that using this is not possible for a few reasons: * Most passes don't look inside BUNDLEs, but some passes would need to look inside these call-sequence bundles, for example the PrologEpilog pass (to remove the CALLSEQSTART/END), a PostRA pass to remove COPY instructions, or the AArch64PseudoExpand pass. * Within the streaming-mode-changing call sequence, one of the instructions is a CALLSEQEND. The corresponding CALLSEQBEGIN (AArch64::ADJCALLSTACKUP) is outside this sequence. This means we'd end up with a BUNDLE that has [SMSTART, COPY, BL, ADJCALLSTACKUP, COPY, SMSTOP]. The MachineVerifier doesn't accept this, and we also can't move the CALLSEQSTART into the call sequence. Maybe in the future we could model this differently by modelling the runtime vector-length as a value that's used by certain operations (similar to e.g. NZCV flags) and clobbered by SMSTART/SMSTOP, such that the register allocator can consider these as actual dependences and avoid rematerialization. For now we just want to address the immediate problem. Reviewed By: paulwalker-arm, aemerson Differential Revision: https://reviews.llvm.org/D159193	2023-09-01 12:13:27 +00:00
Matt Arsenault	69e75ae695	CodeGen: Don't lazily construct MachineFunctionInfo This fixes what I consider to be an API flaw I've tripped over multiple times. The point this is constructed isn't well defined, so depending on where this is first called, you can conclude different information based on the MachineFunction. For example, the AMDGPU implementation inspected the MachineFrameInfo on construction for the stack objects and if the frame has calls. This kind of worked in SelectionDAG which visited all allocas up front, but broke in GlobalISel which hasn't visited any of the IR when arguments are lowered. I've run into similar problems before with the MIR parser and trying to make use of other MachineFunction fields, so I think it's best to just categorically disallow dependency on the MachineFunction state in the constructor and to always construct this at the same time as the MachineFunction itself. A missing feature I still could use is a way to access a custom analysis pass on the IR here.	2022-12-21 10:49:32 -05:00
Matt Arsenault	588ecc11b8	AArch64: Stop storing MachineFunction in MachineFunctionInfo The constructor should not depend on the MachineFunction	2022-12-16 12:30:03 -05:00
Krzysztof Parzyszek	c589730ad5	[YAML] Convert Optional to std::optional	2022-12-06 12:49:32 -08:00
Fangrui Song	4b1b9e22b3	Remove unused #include "llvm/ADT/Optional.h"	2022-12-05 04:21:08 +00:00
Fangrui Song	b0df70403d	[Target] llvm::Optional => std::optional The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 22:43:14 +00:00
Sander de Smalen	a7a0b825bd	[AArch64] NFC: Remove unused parameter from allocateLazySaveBuffer	2022-11-09 15:36:00 +00:00
Kerry McLaughlin	f7f44f018f	[AArch64][SME] Set up a lazy-save/restore around calls. Setting up a lazy-save mechanism around calls is done during SelectionDAG because calls to intrinsics may be expanded into an actual function call (e.g. calls to @llvm.cos()), and maintaining an allowed-list in the SMEABI pass is not feasible. The approach for conditionally restoring the lazy-save based on the runtime value of TPIDR2_EL0 is similar to how we handle conditional smstart/smstop. We create a pseudo-node which gets expanded into a conditional branch and expands to a call to __arm_tpidr2_restore(%tpidr2_object_ptr). The lazy-save buffer and TPIDR2 block are only allocated once at the start of the function. For each call, the TPIDR2 block is initialised, and at the end of the call, a pseudo node (RestoreZA) is planted. Patch by Sander de Smalen. Differential Revision: https://reviews.llvm.org/D133900	2022-10-05 14:36:53 +01:00
Eli Friedman	5637ec0983	[ARM64EC 4/?] Add LLVM support for varargs calling convention. Part of patchset to add initial support for ARM64EC. The ARM64EC calling convention is the same as ARM64 for non-varargs functions, but for varargs, the convention is significantly different. Basically, only x0-x3 registers are used for passing arguments, and x4 and x5 describe the address/size of the arguments passed in memory. (See https://docs.microsoft.com/en-us/windows/uwp/porting/arm64ec-abi for more details; see https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention for the x64 calling convention rules, which this convention needs to match.) Note that this currently doesn't handle i128 arguments correctly; as noted in review, that's sort of complicated to handle, so I'm leaving it for a followup. Differential Revision: https://reviews.llvm.org/D125415	2022-09-05 13:05:48 -07:00
Matt Devereau	5166345f50	[SVE][AArch64] Refine hasSVEArgsOrReturn As described in aapcs64 (https://github.com/ARM-software/abi-aa/blob/2022Q1/aapcs64/aapcs64.rst#scalable-vector-registers) AAVPCS is used only when registers z0-z7 take an SVE argument. This fixes the case where floats occupy the lower bits of registers z0-z7 but SVE arguments in registers greater than z7 cause a function to use AAVPCS where it should use AAPCS. Moving SVE function deduction from AArch64RegisterInfo::hasSVEArgsOrReturn to AArch64TargetLowering::LowerFormalArguments where physical register lowering is more accurate fixes this. Differential Revision: https://reviews.llvm.org/D127209	2022-07-01 13:24:55 +00:00
Florian Mayer	0593ce5f0b	[MC] Add 'G' to augmentation string for MTE instrumented functions This was agreed on in https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html The thread proposed two options * add a character to augmentation string and handle in libunwind * use a separate personality function. It was determined that this is the simpler and better option. This is part of ARM's AArch64 ABI: https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id22 The next step after this is teaching libunwind to untag when this augmentation character is set. Reviewed By: MaskRay, eugenis Differential Revision: https://reviews.llvm.org/D127007	2022-06-08 12:36:32 -07:00
Matt Arsenault	cc5a1b3dd9	llvm-reduce: Add cloning of target MachineFunctionInfo MIR support is totally unusable for AMDGPU without this, since the set of reserved registers is set from fields here. Add a clone method to MachineFunctionInfo. This is a subtle variant of the copy constructor that is required if there are any MIR constructs that use pointers. Specifically, at minimum fields that reference MachineBasicBlocks or the MachineFunction need to be adjusted to the values in the new function.	2022-06-07 10:14:48 -04:00
Matt Arsenault	53f3f2bbb1	AArch64: Use Register	2022-04-19 21:07:04 -04:00
Momchil Velikov	d0ea42a7c1	[AArch64] Async unwind - function epilogues Reviewed By: MaskRay, chill Differential Revision: https://reviews.llvm.org/D112330	2022-04-12 16:50:50 +01:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Momchil Velikov	63c9aca12a	Revert "[AArch64] Async unwind - function epilogues" This reverts commit 74319d67943a4fbef36e81f54273549ce4962f84. It causes test failures that look like infinite loop in asan/hwasan unwinding.	2022-03-02 15:01:57 +00:00
Momchil Velikov	74319d6794	[AArch64] Async unwind - function epilogues Counterpart of https://reviews.llvm.org/D111411 this change makes the unwind information instruction precise in function epilogues. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D112330	2022-03-02 13:15:11 +00:00
Momchil Velikov	25e92920c9	[AArch64] Async unwind - helper functions to decide on CFI emission Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D112327	2022-02-24 18:16:50 +00:00
Tim Northover	82a0e808bb	IR/AArch64/X86: add "swifttailcc" calling convention. Swift's new concurrency features are going to require guaranteed tail calls so that they don't consume excessive amounts of stack space. This would normally mean "tailcc", but there are also Swift-specific ABI desires that don't naturally go along with "tailcc" so this adds another calling convention that's the combination of "swiftcc" and "tailcc". Support is added for AArch64 and X86 for now.	2021-05-17 10:48:34 +01:00
Tim Northover	ea0eec69f1	IR+AArch64: add a "swiftasync" argument attribute. This extends any frame record created in the function to include that parameter, passed in X22. The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001 in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect of this is that tools walking the stack should expect to see one of three values there: * 0b0000 => a normal, non-extended record with just [FP, LR] * 0b0001 => the extended record [X22, FP, LR] * 0b1111 => kernel space, and a non-extended record. All other values are currently reserved. If compiling for arm64e this context pointer is address-discriminated with the discriminator 0xc31a and the DB (process-specific) key. There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing front-ends access to this slot (and forcing its creation initialized to nullptr if necessary).	2021-05-14 11:43:58 +01:00
Bradley Smith	ea834c8365	Revert "[AArch64][SVE] Allow accesses to SVE stack objects to use frame pointer" This patch introduced codegen faults. An attempt to fix this was done in https://reviews.llvm.org/D97193, but ultimately it was decided to approach this differently. This reverts commit 42635856ed3c9a05957640f9deb50cf865c03825. Differential Revision: https://reviews.llvm.org/D98350	2021-03-11 13:32:35 +00:00
Bradley Smith	42635856ed	[AArch64][SVE] Allow accesses to SVE stack objects to use frame pointer The layout of the stack frame for SVE means that using the frame pointer rather than the stack pointer for an access to an SVE stack object removes the need for an additional add to jump over the non-SVE objects. Likewise the opposite is true for non-SVE stack objects. This patch allows for the former to be done by having HasFP return true in the presence of both SVE and non-SVE stack objects, and also fixes a minor issue whereby the latter would not be done for certain offsets.	2021-01-28 12:39:57 +00:00
Evgenii Stepanov	2f63e57fa5	[MTE] Pin the tagged base pointer to one of the stack slots. Summary: Pin the tagged base pointer to one of the stack slots, and (if necessary) rewrite tag offsets so that an object that occupies that slot has both address and tag offsets of 0. This allows ADDG instructions for that object to be eliminated and their uses replaced with the tagged base pointer itself. This optimization must be done in machine instructions and not in the IR instrumentation pass, because referring to a stack slot through an IRG pointer would confuse the stack coloring pass. The optimization makes a (pretty naive) attempt to find the slot that would benefit the most by counting the uses of stack slots in the function. Reviewers: ostannard, pcc Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72365	2020-10-15 12:50:16 -07:00
Momchil Velikov	a88c722e68	[AArch64] PAC/BTI code generation for LLVM generated functions PAC/BTI-related codegen in the AArch64 backend is controlled by a set of LLVM IR function attributes, added to the function by Clang, based on command-line options and GCC-style function attributes. However, functions generated in the LLVM middle end (for example, asan.module.ctor or __llvm_gcov_write_out) do not get any attributes and the backend incorrectly does not do any PAC/BTI code generation. This patch records the default state of PAC/BTI codegen in a set of LLVM IR module-level attributes, based on command-line options: * "sign-return-address", with non-zero value means generate code to sign return addresses (PAC-RET), zero value means disable PAC-RET. * "sign-return-address-all", with non-zero value means enable PAC-RET for all functions, zero value means enable PAC-RET only for functions which spill LR. * "sign-return-address-with-bkey", with non-zero value means use B-key for signing, zero value means use A-key. This set of attributes is always added for AArch64 targets (as opposed, for example, to interpreting a missing attribute as having a value 0) in order to be able to check for conflicts when combining module attributes during LTO. Module-level attributes are overridden by function level attributes. All the decision making about whether or not to generate PAC and/or BTI code is factored out into AArch64FunctionInfo, there shouldn't be any places left, other than AArch64FunctionInfo, which directly examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which is/will-be handled by a separate patch. Differential Revision: https://reviews.llvm.org/D85649	2020-09-25 11:47:14 +01:00
Tim Northover	2afe4becec	AArch64: make sure jump table entries can reach entire image This turns all jump table entries into deltas within the target function because in the small memory model all code & static data must be in a 4GB block somewhere in memory. When the entries were a delta between the table location and a basic block, the 32-bit signed entries are not enough to guarantee reachability. https://reviews.llvm.org/D87286	2020-09-18 09:50:40 +01:00
Simon Pilgrim	9f830e0af7	AArch64MachineFunctionInfo.h - remove unnecessary TargetFrameLowering.h include. NFCI.	2020-09-10 16:05:33 +01:00
Owen Anderson	5987da8764	Revert "Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain"" This reverts commit bc9a29b9ee6ade4894252b1470977142c32b4602. The reasoning that this patch was wrong was itself incorrect (see discussion on llvm-commits). This patch does seem to be exposing a latent SVE code generation bug on non-public tests, which should not block a correctness fix for public, non-SVE use cases.	2020-09-01 19:29:03 +00:00
Paul Walker	bc9a29b9ee	Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain" This reverts commit e9d9a612084b47fc4277523561d61e675370c854. This patch was previously reverted by 04879086b44348cad600a0a1ccbe1f7776cc3cf9 with the reapplication being done after breaking the assert used to ensure SP is always 16-byte aligned, which is a requirement of the AAPCS. For extra context the latest patch caused runtime failures when building with "-march=armv8-a+sve -mllvm -aarch64-sve-vector-bits-min=256".	2020-09-01 16:09:37 +01:00
Owen Anderson	e9d9a61208	Reapply D70800: Fix AArch64 AAPCS frame record chain Original Commit Message: After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register) may be placed in the middle of a stack frame if a function has both callee-saved general-purpose registers and floating point registers. This will break the stack unwinders that simply walk through the frame records (based on the guarantee from AAPCS64 "The Frame Pointer" section). This commit fixes the problem by adding the frame record offset. Patch By: logan Differential Revision: D70800	2020-08-27 17:29:41 +00:00
Martin Storsjö	04879086b4	Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain" This reverts commit 9936455204fd6ab72715cc9d67385ddc93e072ed. That commit caused failed assertions e.g. like this: $ cat alloca.c a; b() { float c; d(); a = __builtin_alloca(d); c = e(); f(a); return c; } $ clang -target aarch64-linux-gnu -c alloca.c -O2 clang: ../lib/Target/AArch64/AArch64InstrInfo.cpp:3446: void llvm::emitFrameOffset(llvm::MachineBasicBlock&, llvm::MachineBasicBlock::iterator, const llvm::DebugLoc&, unsigned int, unsigned int, llvm::StackOffset, const llvm::TargetInstrInfo, llvm::MachineInstr::MIFlag, bool, bool, bool): Assertion `(DestReg != AArch64::SP \|\| Bytes % 16 == 0) && "SP increment/decrement not 16-byte aligned"' failed.	2020-08-27 09:39:56 +03:00
Owen Anderson	9936455204	Reapply D70800: Fix AArch64 AAPCS frame record chain Original Commit Message: After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register) may be placed in the middle of a stack frame if a function has both callee-saved general-purpose registers and floating point registers. This will break the stack unwinders that simply walk through the frame records (based on the guarantee from AAPCS64 "The Frame Pointer" section). This commit fixes the problem by adding the frame record offset. Patch By: logan	2020-08-26 19:38:38 +00:00
Owen Anderson	9061eb8245	Revert "Fix frame pointer layout on AArch64 Linux." This broke stage2 of clang-cmake-aarch64-full. This reverts commit a0aed80b22d1b698b86e0c16109fdfd4d592756f.	2020-08-26 17:17:14 +00:00
Owen Anderson	a0aed80b22	Fix frame pointer layout on AArch64 Linux. When floating point callee-saved registers were used, the frame pointer would incorrectly point to the bottom of the CSR space (containing saved floating-point registers), rather than to the frame record. While all frame offsets were calculated consistently, resulting in working code, this prevented stack walkers from being able to traverse the frame list.	2020-08-26 16:09:49 +00:00
Andrew Litteken	1488bef8fc	[MachineOutliner] Annotation for outlined functions in AArch64 - Adding changes to support comments on outlined functions with outlining for the conditions through which it was outlined (e.g. Thunks, Tail calls) - Adapts the emitFunctionHeader to print out a comment next to the header if the target specifies it based on information in MachineFunctionInfo - Adds mir test for function annotation Differential Revision: https://reviews.llvm.org/D78062	2020-04-20 13:33:31 -07:00
Jessica Paquette	66037b84cf	MachineFunctionInfo for AArch64 in MIR Starting with hasRedZone adding MachineFunctionInfo to be put in the YAML for MIR files. Split out of: D78062 Based on implementation for MachineFunctionInfo for WebAssembly Differential Revision: https://reviews.llvm.org/D78173 Patch by Andrew Litteken! (AndrewLitteken)	2020-04-17 15:16:59 -07:00
Logan Chien	061a94e4e2	Revert "AArch64: Fix frame record chain" Breaks aosp-O3-polly-before-vectorizer-unprofitable with the following error message: void llvm::emitFrameOffset(llvm::MachineBasicBlock &, MachineBasicBlock::iterator, const llvm::DebugLoc &, unsigned int, unsigned int, llvm::StackOffset, const llvm::TargetInstrInfo , MachineInstr::MIFlag, bool, bool, bool ): Assertion `(DestReg != AArch64::SP \|\| Bytes % 16 == 0) && "SP increment/decrement not 16-byte aligned"' failed. This reverts commit d4e10e6adb1b629b3fc1b78f7e281fbcec392edb.	2019-12-14 13:58:40 -08:00
Logan Chien	d4e10e6adb	AArch64: Fix frame record chain The commit r369122 may keep LR and FP register (aka. frame record) in the middle of a frame, thus we must add the offsets to ensure the FP register always points to innermost frame record on the stack. According to AAPCS64[1], a conforming code shall construct a linked list of stack frames that can be traversed with frame records. This commit is also essential to frame-pointer-based stack unwinder (e.g. the stack unwinder in linux-perf-tools.) [1] https://github.com/ARM-software/software-standards/blob/master/abi/aapcs64/aapcs64.rst#the-frame-pointer Test: llvm-lit ${LLVM_SRC}/test/CodeGen/AArch64/framelayout-frame-record.ll Test: llvm-lit ${LLVM_SRC}/test/CodeGen/AArch64 Differential Revision: https://reviews.llvm.org/D70800	2019-12-14 10:23:20 -08:00
Kiran Chandramohan	965ed1e974	[AArch64] Fix issues with large arrays on stack Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496	2019-12-10 11:44:41 +00:00
Sander de Smalen	3367686b4d	[AArch64] Extend storeRegToStackSlot to spill SVE registers. This patch allows the register allocator to spill SVE registers to the stack. Reviewers: ostannard, efriedma, rengolin, cameron.mcinally Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D70082	2019-11-13 10:09:32 +00:00
Simon Pilgrim	0cc7c29a97	AArch64FunctionInfo - fix uninitialized variable warnings. NFCI.	2019-11-11 11:24:09 +00:00
Sander de Smalen	84a0c8e3ae	[AArch64][SVE] Spilling/filling of SVE callee-saves. Implement the spills/fills of callee-saved SVE registers using STR and LDR instructions. Also adds the `aarch64_sve_vector_pcs` attribute to specify the callee-saved registers to be used for functions that return SVE vectors or take SVE vectors as arguments. The callee-saved registers are vector registers z8-z23 and predicate registers p4-p15. The overal frame-layout with SVE will be as follows: +-------------+ \| stack args \| +-------------+ \| Callee Saves\| \| X29, X30 \| \|-------------\| <- FP \| SVE Callee \| < ////////////// \| saved regs \| < ////////////// \| z23 \| < ////////////// \| : \| < // SCALABLE // \| z8 \| < ////////////// \| p15 \| < /// STACK //// \| : \| < ////////////// \| p4 \| < //// AREA //// +-------------+ < ////////////// \| : \| < ////////////// \| SVE locals \| < ////////////// \| : \| < ////////////// +-------------+ \|/////////////\| alignment gap. \| : \| \| Stack objs \| \| : \| +-------------+ <- SP after call and frame-setup Reviewers: cameron.mcinally, efriedma, greened, thegameg, ostannard, rengolin Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D68996	2019-11-11 09:03:19 +00:00
Sander de Smalen	d6a7da80aa	Reland [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2) llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir failed with EXPENSIVE_CHECKS enabled, causing the patch to be reverted in rG2c496bb5309c972d59b11f05aee4782ddc087e71. This patch relands the patch with a proper fix to the live-debug-values-reg-copy.mir tests, by ensuring the MIR encodes the callee-saves correctly so that the CalleeSaved info is taken from MIR directly, rather than letting it be recalculated by the PEI pass. I've done this by running `llc -stop-before=prologepilog` on the LLVM IR as captured in the test files, adding the extra MOV instructions that were manually added in the original test file, then running `llc -run-pass=prologepilog` and finally re-added the comments for the MOV instructions.	2019-10-29 16:13:07 +00:00

82 Commits