336 Commits

Author SHA1 Message Date
Jay Foad
7a0e222a17 Revert "Convert many LivePhysRegs uses to LiveRegUnits (#83905)"
This reverts commit 2a13422b8bcee449405e3ebff957b4020805f91c.

It was causing test failures on the expensive check builders.
2024-03-07 08:20:26 +00:00
AtariDreams
2a13422b8b
Convert many LivePhysRegs uses to LiveRegUnits (#83905) 2024-03-06 10:38:14 +05:30
Sander de Smalen
1f99a45012 [AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326)
This patch removes the `-reverse-csr-restore-seq` option from
AArch64FrameLowering, since this is no longer used.

This patch was reverted because of a crash in PR#79623.
Merging it back as it was fixed in PR#82492.
2024-02-22 12:01:53 +00:00
CarolineConcatto
c5253aa136
[AArch64] Restore Z-registers before P-registers (#79623) (#82492)
This is needed by PR#77665[1] that uses a P-register while restoring
Z-registers.

The reverse for SVE register restore in the epilogue was added to
guarantee performance, but further work was done to improve SVE frame
restore and, besides that, the scheduler may also change the order of the
restore, undoing the reverse restore.

This also fixes the problem reported in (PR #79623) on Windows with
std::reverse and .base().

[1]https://github.com/llvm/llvm-project/pull/77665
2024-02-22 09:19:48 +00:00
Momchil Velikov
1a7166833d
[AArch64] Fix stack probing clobbering flags (#81879)
Certain stack probing sequences might clobber flags, so we can't use a
block as a prologue if the flags register is a live-in on entry to that
block.
2024-02-21 13:58:04 +00:00
Caroline Concatto
48af281f7a Revert "[AArch64] Restore Z-registers before P-registers (#79623)"
This reverts commit 3f0404aae7ed2f7138526e1bcd100a60dfe08227.

std::reverse is breaking some builds
2024-02-20 18:13:33 +00:00
Caroline Concatto
7af70643ca Revert "[AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326)"
Patch 3f0404aae7ed2 is breaking some debug builds, so we cannot use the reverse here.

This reverts commit 493f10106f7f1799eb67be95058b251e6a3bf0af.
2024-02-20 18:13:33 +00:00
Sander de Smalen
493f10106f
[AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326)
This patch removes the `-reverse-csr-restore-seq` option from
AArch64FrameLowering, since this is no longer used.
2024-02-20 15:08:06 +00:00
CarolineConcatto
3f0404aae7
[AArch64] Restore Z-registers before P-registers (#79623)
This is needed by PR#77665[1] that uses a P-register while restoring
Z-registers.

The reverse for SVE register restore in the epilogue was added to
guarantee performance, but further work was done to improve SVE frame
restore and, besides that, the scheduler may also change the order of the
restore, undoing the reverse restore.

[1]https://github.com/llvm/llvm-project/pull/77665
2024-02-19 13:39:24 +00:00
Momchil Velikov
658e4763a2
[AArch64] Fix wrong condition in canUseAsPrologue (#81878)
Inline stack probing code may need a scratch register, hence basic
blocks where such register is not available cannot be used as prologues.

Checking for an available scratch register was incorrectly skipped when
the function uses stack probing.
2024-02-19 10:40:21 +00:00
Hiroshi Yamauchi
692566a8b2
Fix an assert failure with a funclet in a swifttailcc function. (#78806)
The failure happens in the livedebugvalues pass.
2024-02-15 15:54:03 -08:00
Oskar Wirga
ff4636a4ab
Refactor recomputeLiveIns to converge on added MachineBasicBlocks (#79940)
This is a fix for the regression seen in
https://github.com/llvm/llvm-project/pull/79498

> Currently, the way that recomputeLiveIns works is that it will
recompute the livein registers for that MachineBasicBlock but it matters
what order you call recomputeLiveIns which can result in incorrect
register allocations down the line.

Now we do not recompute the entire CFG, but we do ensure that the newly
added MBBs reach convergence.
2024-01-30 19:33:04 -08:00
Nikita Popov
07a1925b8b Revert "Refactor recomputeLiveIns to operate on whole CFG (#79498)"
This reverts commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b.

Introduces a major compile-time regression.
2024-01-26 22:33:17 +01:00
Oskar Wirga
59bf60519f
Refactor recomputeLiveIns to operate on whole CFG (#79498)
Currently, the way that recomputeLiveIns works is that it will recompute
the livein registers for that MachineBasicBlock but it matters what
order you call recomputeLiveIns which can result in incorrect register
allocations down the line.

This PR fixes that by simply recomputing the liveins for the entire CFG
until convergence is achieved. This makes it harder to introduce subtle
bugs which alter liveness.
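
The "recompute until convergence" idea can be illustrated with a toy backward liveness fixed point. This is only a sketch under stated assumptions: `Block`, `recomputeAllLiveIns` and `exampleCFG` are hypothetical names, registers are modelled as bits in a mask, and none of this is the LLVM API or the code from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy basic block; registers are modelled as bits in a 32-bit mask.
struct Block {
  std::uint32_t Uses;     // registers read before being written in the block
  std::uint32_t Defs;     // registers written in the block
  std::vector<int> Succs; // indices of successor blocks
  std::uint32_t LiveIn;   // result: registers live on entry to the block
};

// Hypothetical helper (not the LLVM function): recompute live-ins for the
// whole CFG, sweeping repeatedly until no block's live-in set changes.
std::vector<Block> recomputeAllLiveIns(std::vector<Block> CFG) {
  for (Block &B : CFG)
    B.LiveIn = 0;

  bool Changed = true;
  while (Changed) {
    Changed = false;
    // Visit blocks in reverse index order; any order reaches the same
    // fixed point, which is why the result no longer depends on call order.
    for (std::size_t I = CFG.size(); I-- > 0;) {
      Block &B = CFG[I];
      std::uint32_t LiveOut = 0;
      for (int S : B.Succs) {
        assert(S >= 0 && static_cast<std::size_t>(S) < CFG.size());
        LiveOut |= CFG[S].LiveIn;
      }
      // Standard backward liveness equation.
      std::uint32_t NewIn = B.Uses | (LiveOut & ~B.Defs);
      if (NewIn != B.LiveIn) {
        B.LiveIn = NewIn;
        Changed = true;
      }
    }
  }
  return CFG;
}

// Example CFG: block 0 defines r1 and falls into block 1; block 1 reads r1
// and loops on itself or exits to block 2; block 2 reads r0.
std::vector<Block> exampleCFG() {
  return {
      {0x0, 0x2, {1}, 0},
      {0x2, 0x0, {1, 2}, 0},
      {0x1, 0x0, {}, 0},
  };
}
```

Calling `recomputeAllLiveIns(exampleCFG())` gives live-in masks that are the same regardless of the order in which blocks are visited, at the cost of sweeping the whole CFG.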
2024-01-26 11:25:36 -08:00
Mikael Holmen
90c326b198 [AArch64] Fix gcc warning about mix of enumeral and non-enumeral types [NFC]
Change the return type of
 findScratchNonCalleeSaveRegister
to Register instead of unsigned.

Every place the function is called we already put the returned value in a
Register variable or compare it with another Register.

This fixes some gcc warnings:
 ../lib/Target/AArch64/AArch64FrameLowering.cpp:744: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
   743 |     Register TargetReg = RealignmentPadding
       |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   744 |                              ? findScratchNonCalleeSaveRegister(&MBB)
       |                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   745 |                              : AArch64::SP;
       |
 ../lib/Target/AArch64/AArch64FrameLowering.cpp:803: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
   802 |     Register ScratchReg = RealignmentPadding
       |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   803 |                               ? findScratchNonCalleeSaveRegister(&MBB)
       |                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   804 |                               : AArch64::SP;
       |
2024-01-25 07:56:16 +01:00
Eli Friedman
a6065f0fa5
Arm64EC entry/exit thunks, consolidated. (#79067)
This combines the previously posted patches with some additional work
I've done to more closely match MSVC output.

Most of the important logic here is implemented in
AArch64Arm64ECCallLowering. The purpose of the
AArch64Arm64ECCallLowering is to take "normal" IR we'd generate for
other targets, and generate most of the Arm64EC-specific bits:
generating thunks, mangling symbols, generating aliases, and generating
the .hybmp$x table. This is all done late for a few reasons: to
consolidate the logic as much as possible, and to ensure the IR exposed
to optimization passes doesn't contain complex arm64ec-specific
constructs.

The other changes are supporting changes, to handle the new constructs
generated by that pass.

There's a global llvm.arm64ec.symbolmap representing the .hybmp$x
entries for the thunks. This gets handled directly by the AsmPrinter
because it needs symbol indexes that aren't available before that.

There are two new calling conventions used to represent calls to and
from thunks: ARM64EC_Thunk_X64 and ARM64EC_Thunk_Native. There are a few
changes to handle the associated exception-handling info,
SEH_SaveAnyRegQP and SEH_SaveAnyRegQPX.

I've intentionally left out handling for structs with small
non-power-of-two sizes, because that's easily separated out. The rest of
my current work is here. I squashed my current patches because they were
split in ways that didn't really make sense. Maybe I could split out
some bits, but it's hard to meaningfully test most of the parts
independently.

Thanks to @dpaoliello for extensive testing and suggestions.

(Originally posted as https://reviews.llvm.org/D157547 .)
2024-01-22 21:28:07 -08:00
Florian Hahn
58dcac3948
[AArch64] Check X16&X17 in prologue if the fn has an SwiftAsyncContext. (#73945)
StoreSwiftAsyncContext clobbers X16 & X17. Make sure they are available
in canUseAsPrologue, to avoid shrink wrapping moving the pseudo to a
place where X16 or X17 are live.
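
The condition described above can be sketched as a small predicate. This is a minimal sketch, not the actual `canUseAsPrologue` code: `regBit` and `canUseAsPrologueSketch` are hypothetical names, and the live-in set is modelled as a bit mask with one bit per X register.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: bit N of the mask stands for register XN.
constexpr std::uint32_t regBit(unsigned N) { return 1u << N; }

// Hypothetical predicate: a block may host the prologue only if the
// registers clobbered by the swift async context store are not live-in.
bool canUseAsPrologueSketch(std::uint32_t LiveIns, bool HasSwiftAsyncContext) {
  if (!HasSwiftAsyncContext)
    return true; // nothing extra is clobbered in this toy model
  const std::uint32_t Clobbered = regBit(16) | regBit(17);
  return (LiveIns & Clobbered) == 0;
}
```

For example, a block with X17 live on entry is rejected when the function has a swift async context, while the same block is accepted otherwise.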
2023-12-05 11:41:40 +00:00
Jie Fu
e9c6f3f5e7 [AArch64] Fix -Wunused-variable in AArch64FrameLowering.cpp (NFC)
llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:497:21:
error: unused variable 'MFI' [-Werror,-Wunused-variable]
  MachineFrameInfo &MFI = MF.getFrameInfo();
                    ^
1 error generated.
2023-12-02 18:32:20 +08:00
Momchil Velikov
b1806e6a1f
[AArch64] Stack probing for dynamic allocas in SelectionDAG (#66525)
Add support for probing for dynamic allocas (variable-size objects and
outgoing stack arguments).

Co-authored-by: Oliver Stannard <oliver.stannard@linaro.org>
2023-12-02 10:09:41 +00:00
Momchil Velikov
cc944f502f
[AArch64] Stack probing for function prologues (#66524)
This adds code to AArch64 function prologues to protect against stack
clash attacks by probing (writing to) the stack at regular enough
intervals to ensure that the guard page cannot be skipped over.

The patch depends on and maintains the following invariants:

* Upon function entry the caller guarantees that it has probed the stack
  (e.g. performed a store) at some address [sp, #N], where `0 <= N <=
  1024`. This invariant comes from a requirement for compatibility with
  GCC.
* Any address range in the allocated stack, no smaller than
  stack-probe-size bytes, contains at least one probe.
* At any time the stack pointer is above or in the guard page.
* Probes are performed in decreasing address order.

The stack-probe-size is a function attribute that can be set by a
platform to correspond to the guard page size.

By default, the stack probe size is 4KiB, which is a safe default as
this is the smallest possible page size for AArch64. Linux uses a 64KiB
guard for AArch64, so this can be overridden by the stack-probe-size
function attribute.

For small frames without a frame pointer (<= 240 bytes), no probes are
needed.

For larger frame sizes, LLVM always stores x29 to the stack. This serves
as an implicit stack probe. Thus, while allocating stack objects the
compiler assumes that the stack has been probed at [sp].

There are multiple probing sequences that can be emitted, depending on
the size of the stack allocation:

* A straight-line sequence of subtracts and stores, used when the
  allocation size is smaller than 5 guard pages.
* A loop allocating and probing one page size per iteration, plus at most
  a single probe to deal with the remainder, used when the allocation size
  is larger but still known at compile time.
* A loop which moves the SP down to the target value held in a register
  (or a loop, moving a scratch register to the target value held in SP),
  used when the allocation size is not known at compile-time, such as when
  allocating space for SVE values, or when over-aligning the stack. This is
  emitted in AArch64InstrInfo because it will also be used for dynamic
  allocas in a future patch.
* A single probe where the amount of stack adjustment is unknown, but is
  known to be less than or equal to a page size.
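
The choice between the first two sequences (sizes known at compile time) can be sketched as follows. This is an illustration only: `ProbeSequence`, `ProbePlan` and `planStaticProbes` are hypothetical names, the guard page size is assumed to equal the stack-probe-size, and the small-frame and dynamic-size cases described above are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// The two sequences available when the allocation size is a constant.
enum class ProbeSequence { StraightLine, Loop };

struct ProbePlan {
  ProbeSequence Sequence;
  std::uint64_t FullPages; // number of probe-size steps to allocate and probe
  std::uint64_t Remainder; // leftover bytes after the full steps
};

// Hypothetical helper: pick a sequence for a compile-time-known allocation.
// The 4096 default mirrors the 4KiB default probe size mentioned above.
ProbePlan planStaticProbes(std::uint64_t AllocSize,
                           std::uint64_t ProbeSize = 4096) {
  assert(ProbeSize > 0 && "probe size must be non-zero");
  ProbePlan Plan;
  // Smaller than 5 guard pages: straight-line subtracts and stores.
  // Otherwise: a loop, one page per iteration, plus the remainder.
  Plan.Sequence = AllocSize < 5 * ProbeSize ? ProbeSequence::StraightLine
                                            : ProbeSequence::Loop;
  Plan.FullPages = AllocSize / ProbeSize;
  Plan.Remainder = AllocSize % ProbeSize;
  return Plan;
}
```

With the default probe size, a 16384-byte allocation falls in the straight-line case, while 20480 bytes (exactly 5 pages) already selects the loop. Passing 65536 as the second argument models a 64KiB guard.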

---------

Co-authored-by: Oliver Stannard <oliver.stannard@linaro.org>
2023-11-30 17:41:51 +00:00
David Green
4d80122598
[AArch64] Teach areMemAccessesTriviallyDisjoint about scalable widths. (#73655)
The base change here is to change getMemOperandWithOffsetWidth to return
a TypeSize Width, which in turn allows areMemAccessesTriviallyDisjoint
to reason about trivially disjoint widths.
2023-11-30 16:54:28 +00:00
Jie Fu
7cf26e0c6d [AArch64] Fix -Wunused-function of getLivePhysRegsUpTo in AArch64FrameLowering.cpp (NFC)
/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1428:13: error: unused function 'getLivePhysRegsUpTo' [-Werror,-Wunused-function]
static void getLivePhysRegsUpTo(MachineInstr &MI, const TargetRegisterInfo &TRI,
            ^
1 error generated.
2023-11-23 09:06:48 +08:00
David Blaikie
1cd682f26b Sink variable into #ifndef NDEBUG where it is used
(addresses -Wunused-variable warning)
2023-11-22 19:09:14 +00:00
Florian Hahn
a842430c20
[AArch64] Add check that prologue insertion doesn't clobber live regs. (#71826)
This patch extends AArch64FrameLowering::emitPrologue to check if the
inserted prologue clobbers live registers.

It updates `llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir`
with an extra load to make x9 live before the store, preserving the
original test.

It uses the original
`llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir` as
`llvm/test/CodeGen/AArch64/emit-prologue-clobber-verification.mir`,
because there x9 is marked as live on entry, but used as scratch reg as
it is not callee saved.

The new assertion catches a mis-compile in
`store-swift-async-context-clobber-live-reg.ll` on
https://github.com/apple/llvm-project/tree/next
2023-11-22 16:49:33 +00:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize,
and LHS and RHS are both objects of class TypeSize. So this is
evaluating
if the pointer to the function Scalable == the pointer to the function
Scalable,
which is always true because LHS and RHS have the same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`,
so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
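
The name clash can be reproduced in isolation. The sketch below uses hypothetical stand-in classes (`Quantity`, `OldSize`) rather than the real TypeSize code, and assumes the same shape as described above: a base class with a `Scalable` data member that is hidden by a static factory function of the same name in the derived class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the base class holding the flag.
class Quantity {
protected:
  std::uint64_t Value;
  bool Scalable; // the data member the assert meant to compare
  Quantity(std::uint64_t V, bool S) : Value(V), Scalable(S) {}

public:
  bool isScalable() const { return Scalable; }
};

// Hypothetical stand-in for the pre-fix derived class: the static factory
// `Scalable` hides the base-class data member of the same name.
class OldSize : public Quantity {
  OldSize(std::uint64_t V, bool S) : Quantity(V, S) {}

public:
  static OldSize Fixed(std::uint64_t V) { return OldSize(V, false); }
  static OldSize Scalable(std::uint64_t V) { return OldSize(V, true); }
};

// Buggy check: both operands name the static function OldSize::Scalable,
// so this compares a function address with itself and is always true.
bool buggySameKind(const OldSize &L, const OldSize &R) {
  return L.Scalable == R.Scalable;
}

// Intended check: compare the per-object flag through the accessor.
bool fixedSameKind(const OldSize &L, const OldSize &R) {
  return L.isScalable() == R.isScalable();
}
```

Here `buggySameKind(OldSize::Fixed(4), OldSize::Scalable(4))` is true even though the kinds differ, while `fixedSameKind` reports the mismatch. Some compilers may warn about the tautological comparison. Renaming the factories, as the patch does, removes the hiding altogether.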
2023-11-22 08:52:53 +00:00
Momchil Velikov
dedf2c6bb5
[AArch64] Refactor allocation of locals and stack realignment (#72028)
Factor out some stack allocation in a separate function. This patch
splits out the generic portion of a larger refactoring done as a part of
stack clash protection support.

The patch is almost, but not quite NFC. The only difference should
be that where we have adjacent allocation of stack space
for local SVE objects and non-local SVE objects the order
of `sub sp, ...` and `addvl sp, ...` instructions is reversed, because now
it's done with a single call to `emitFrameOffset`, which happens to
add/subtract the fixed part before the scalable part, e.g.

    addvl sp, sp, #-2
    sub sp, sp, #16, lsl #12
    sub sp, sp, #16

becomes

    sub sp, sp, #16, lsl #12
    sub sp, sp, #16
    addvl sp, sp, #-2
2023-11-15 09:27:01 +00:00
Karthika Devi C
6726c99f88
[AArch64] Fix tryMergeAdjacentSTG function in PrologEpilog pass (#68873)
The tryMergeAdjacentSTG function tries to merge multiple
stg/st2g/stg_loop instructions. It doesn't verify the liveness of the NZCV
flags before moving STGloop around, which also alters the NZCV flags. This was
not an issue before patch 5e612bc, as these stack tag stores did not
alter the NZCV flags. After that change, this merge function leads to
miscompilation because of the control flow change in the instructions. Added a
check to see if the first instruction after the insert point reads or
writes the NZCV flags, and to check their live-out state. This check happens
after the merge list is filled, just before the merge, and bails out if necessary.
2023-11-14 14:43:33 -08:00
Jay Foad
d5f3b3b3b1
[RegScavenger] Simplify state tracking for backwards scavenging (#71202)
Track the live register state immediately before, instead of after,
MBBI. This makes it simple to track the state at the start or end of a
basic block without a separate (and poorly named) Tracking flag.

This changes the API of the backward(MachineBasicBlock::iterator I)
method, which now recedes to the state just before, instead of just
after, *I. Some clients are simplified by this change.

There is one small functional change shown in the lit tests where
multiple spilled registers all need to be reloaded before the same
instruction. The reloads will now be inserted in the opposite order.
This should not affect correctness.
2023-11-08 09:49:07 +00:00
Zhaoxuan Jiang
041a786c78
[AArch64] Fix pairing different types of registers when computing CSRs. (#66642)
If a function has an odd number of registers of the same type to save, and the
calling convention also requires an odd number of CSRs of that type, an FP
register would be accidentally marked as saved when producePairRegisters
returns true.

This patch also fixes the AArch64LowerHomogeneousPrologEpilog pass not
handling AArch64::NoRegister; this pass must be fixed along
with the register pairing so that I can write a test for it.
2023-10-16 23:34:04 -07:00
Anatoly Trosinenko
1d2b558265 [AArch64][PAC] Check authenticated LR value during tail call
When performing a tail call, check the value of LR register after
authentication to prevent the callee from signing and spilling an
untrusted value. This commit implements a few variants of check,
more can be added later.

If it is safe to assume that executable pages are always readable,
LR can be checked just by dereferencing the LR value via LDR.

As an alternative, LR can be checked as follows:

    ; lowered AUT* instruction
    ; <some variant of check that LR contains a valid address>
    b.cond break_block
  ret_block:
    ; lowered TCRETURN
  break_block:
    brk 0xc471

As the existing methods either break the compatibility with execute-only
memory mappings or can degrade the performance, they are disabled by
default and can be explicitly enabled with a command line option.

Individual subtargets can opt-in to use one of the available methods
by updating AArch64FrameLowering::getAuthenticatedLRCheckMethod().

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D156716
2023-10-11 17:38:17 +03:00
Hiroshi Yamauchi
0ecd8846ae
[AArch64][Win] Emit SEH instructions for the swift async context-related instructions in the prologue and the epilogue. (#66967)
This fixes an error from checkARM64Instructions() in MCWin64EH.cpp.
2023-09-28 09:43:39 -07:00
Zhaoxuan Jiang
baf3903218
[AArch64] Bail out of HomogeneousPrologEpilog for functions with swif… (#67417)
…tasync argument

swiftasync introduces a number of frame adjustments which are
incompatible with the current implementation of the HomogeneousPrologEpilog
pass.
2023-09-26 08:42:01 -07:00
Anatoly Trosinenko
eb02ee44d3 [AArch64] Move PAuth codegen down the machine pipeline
To simplify handling PAuth in the machine outliner, introduce a
separate AArch64PointerAuth pass that is executed after both
Prologue/Epilogue Inserter and Machine Outliner passes.

After moving to AArch64PointerAuth, signLR and authenticateLR are
not used outside of their class anymore, so make them private and
simplify accordingly.

The new pass is added via AArch64PassConfig::addPostBBSections(),
so that it can change the code size before branch relaxation occurs.
AArch64BranchTargets is placed there too, so it can take into account
any PACI(A|B)SP instructions and not excessively add BTIs at the start
of functions.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D159357
2023-09-22 14:49:14 +03:00
Bill Wendling
9e41c284e0
[NFC][CodeGen] Create method to clear registers (#66958)
Place the architecture-specific logic to clear registers in a single
place and call it via a TargetInstrInfo method.

This will allow one to add instructions to clear registers holding the
stack protector guard value before return, but do it in
non-architecture-specific code.
2023-09-21 15:57:35 -07:00
Sander de Smalen
702c3f56d3 [SME] Don't scavenge a spillslot in callee-save area in presence of streaming-mode changes.
If no frame-pointer is available and the compiler has scavenged a
spill-slot in the callee-save area, the compiler may be forced to emit an
'addvl' inside the streaming-mode-changing call sequence when it needs to
fill (reload) an FP register being passed to the call.

We can avoid this entirely by disabling stack-slot scavenging when there
are streaming-mode-changing call-sequences in the function.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D159196
2023-09-04 10:14:44 +00:00
Hiroshi Yamauchi
8942d3047c [AArch64][WinCFI] Handle cases where no SEH opcodes in the prologue
but there are some in the epilogue.

Make a decision whether or not to have a startepilogue/endepilogue
based on whether we actually insert SEH opcodes in the epilogue,
rather than whether we had SEH opcodes in the prologue or not.

This fixes an assert failure when there are no SEH opcodes in the
prologue but there are SEH opcodes in the epilogue (for example, when
there is no stack frame but there are stack arguments) which was not
covered in https://reviews.llvm.org/D88641.

Assertion failed: HasWinCFI == MF.hasWinCFI(), file C:\Users\hiroshi\llvm-project\llvm\lib\Target\AArch64\AArch64FrameLowering.cpp, line 1988

Differential Revision: https://reviews.llvm.org/D159238
2023-08-31 12:43:26 -07:00
Martin Storsjö
cd09089549 [AArch64] Fix a couple comment typos. NFC. 2023-08-19 00:28:19 +03:00
Anatoly Trosinenko
81300f75f4 [AArch64][PAC] Remove the duplication of LR sign/auth implementations
In the machine outliner implementation for AArch64, `signOutlinedFunction()`
reimplements signing the LR value in prologue and authenticating it in
epilogue of the outlined function. This patch factors out `signLR()` and
`authenticateLR()` functions from AArch64FrameLowering code and reuses
them in `signOutlinedFunction()`.

The `mergeOutliningCandidateAttributes()` outliner callback is
introduced as well to further unify signing and authentication of the LR
value.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D157320
2023-08-11 14:39:18 +03:00
Oliver Stannard
f2e7285b03 [AArch64][PtrAuth] Fix unwind state for tail calls
When generating unwind tables for code which uses return-address
signing, we need to toggle the RA_SIGN_STATE DWARF register around any
tail-calls, because these require the return address to be authenticated
before the call, and could throw an exception. This is done using the
.cfi_negate_ra_state directive before the call, and .cfi_restore_state
at the start of the next basic block.

However, since D153098, the .cfi_restore_state isn't being inserted,
because the CFIFixup pass isn't being run. This re-enables that pass
when return-address signing is enabled.

Reviewed By: ikudrin, MaskRay

Differential Revision: https://reviews.llvm.org/D156428
2023-08-03 11:45:51 +01:00
Hiroshi Yamauchi
a90228b911 [AArch64][Windows] Fix the slot offset of the swift async context register.
This fixes a code gen issue where saving the swift async context
register (x22) accidentally overwrites the saved value of another
callee-saved register, corrupting its value and causing a crash.

Differential Revision: https://reviews.llvm.org/D156391
2023-07-27 12:32:43 -07:00
Martin Storsjö
20b7584455 Reland [AArch64] Fix an immediate out of range for large realignments on Windows
Also add a missing FrameSetup flag on the existing add instruction.

This fixes https://github.com/llvm/llvm-project/issues/63701.

Since the previous iteration, change ADDXrr to ADDXrx64, which
works with this use of SP.

Differential Revision: https://reviews.llvm.org/D155447
2023-07-19 11:19:04 +03:00
Martin Storsjö
793a349e6f Revert "[AArch64] Fix an immediate out of range for large realignments on Windows"
This reverts commit b1d0bc0f4395c69097bc11b6ba8f821f621272a9.

Builds with expensive checks show that 'sp' isn't a valid register
in ADDXrr - an object file built without expensive checks enabled
disassembles as "add x15, xzr, x16", instead of the intended
"add x15, sp, x16".
2023-07-18 18:21:23 +03:00
Martin Storsjö
b1d0bc0f43 [AArch64] Fix an immediate out of range for large realignments on Windows
Also add a missing FrameSetup flag on the existing add instruction.

This fixes https://github.com/llvm/llvm-project/issues/63701.

Differential Revision: https://reviews.llvm.org/D155447
2023-07-18 15:56:36 +03:00
Igor Kudrin
6e54fccede [AArch64] Emit fewer CFI instructions for synchronous unwind tables
The instruction-precise, or asynchronous, unwind tables usually take up
much more space than the synchronous ones. If a user is concerned about
the load size of the program and does not need the features provided
with the asynchronous tables, the compiler should be able to generate
the more compact variant.

This patch changes the generation of CFI instructions for these cases so
that they all come in one chunk in the prolog; it emits only one
`.cfi_def_cfa*` instruction followed by `.cfi_offset` ones after all
stack adjustments and register spills, and avoids generating CFI
instructions in the epilog(s) as well as any other exceeding CFI
instructions like `.cfi_remember_state` and `.cfi_restore_state`.
Effectively, it reverses the effects of D111411 and D114545 on functions
with the `uwtable(sync)` attribute. As a side effect, it also restores
the behavior on functions that have neither `uwtable` nor `nounwind`
attributes.

Differential Revision: https://reviews.llvm.org/D153098
2023-07-01 16:31:09 -07:00
Daniel Kiss
d75e70d7ae [AArch64] Add preserve_all calling convention.
Clang accepts preserve_all for AArch64, while it is missing from the backend.

Fixes #58145

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D135652
2023-04-28 14:55:38 +02:00
Kazu Hirata
1ca0cb717a [llvm] Replace None with std::nullopt in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-04-25 23:53:32 -07:00
Hsiangkai Wang
0847cc06a6 [NFC][AArch64] Use 'i' to encode the offset form of load/store.
STG, STZG, ST2G and STZ2G are exceptions in that they append 'Offset' to name the
offset form of load/store instructions. All other load/store
instructions use 'i' as the suffix. Unless there is a special reason to
do otherwise, we should make the naming consistent.

Differential Revision: https://reviews.llvm.org/D141819
2023-03-06 12:34:19 +00:00
Tim Northover
2002c82278 AArch64: count callee stack we use when estimating scavenging requirements. 2023-02-16 09:59:27 +00:00
Florian Mayer
a4ab294bc0 [MTE stack] fix incorrect offset for st2g
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D143544
2023-02-09 10:06:32 -08:00
Evgenii Stepanov
bd3ee371e9 Revert "[AArch64][v8.3A] Avoid inserting implicit landing pads (PACI*SP)"
The Linux kernel sets SCTLR_EL1.BT0 and BT1 to 1 unconditionally, which
makes PACIASP equivalent to BTI C + PACIA LR,SP.

Use the shorter instruction sequence by default.

I'm not aware of anyone who needs the opposite. They are welcome to
revert to the current behavior under a subtarget feature or an
environment check.

This reverts commit 571c8c5263a79293aaadae07b11feb36726eaf53.

Differential Revision: https://reviews.llvm.org/D141978
2023-01-19 14:09:22 -08:00