284 Commits

Author SHA1 Message Date
Daniil Kovalev
eb4ac6400b
Revert "[PAC][CodeGen][ELF][AArch64] Support signed GOT" (#102434)
Reverts llvm/llvm-project#96164

See buildbot failure
https://lab.llvm.org/buildbot/#/builders/153/builds/5329
2024-08-08 10:58:45 +03:00
Daniil Kovalev
9dae7fcc92
[PAC][CodeGen][ELF][AArch64] Support signed GOT (#96164)
Depends on #96158 and #96159

Support the following relocations and assembly operators:

- `R_AARCH64_AUTH_ADR_GOT_PAGE` (`:got_auth:` for `adrp`)
- `R_AARCH64_AUTH_LD64_GOT_LO12_NC` (`:got_auth_lo12:` for `ldr`)
- `R_AARCH64_AUTH_GOT_ADD_LO12_NC` (`:got_auth_lo12:` for `add`)

`LOADgotAUTH` pseudo-instruction is introduced which is later expanded
to actual instruction sequence like the following.

```
adrp x16, :got_auth:sym
add x16, x16, :got_auth_lo12:sym
ldr x0, [x16]
autia x0, x16
```

If a resign is requested, like below, `LOADgotPAC` pseudo is used, and
GOT load is lowered similarly to `LOADgotAUTH`.

```
@var = global i32 0
define ptr @resign_globalvar() {
  ret ptr ptrauth (ptr @var, i32 3, i64 43)
}
```

Both SelectionDAG and GlobalISel are suppported. For FastISel, we fall
back to SelectionDAG.

Tests starting with 'ptrauth-' have corresponding variants w/o this
prefix.

See also specification
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#appendix-signed-got
2024-08-07 11:53:00 +03:00
Alexis Engelke
da0e66e64c
[CodeGen][NFC] Add wrapper method for MBBMap (#101893)
This is a preparation for changing the data structure of MBBMap.
2024-08-04 18:34:26 +02:00
Ahmed Bougacha
b8721fa0af
[AArch64][PAC] Sign block addresses used in indirectbr. (#97647)
Enabled in clang using:
    -fptrauth-indirect-gotos

and at the IR level using function attribute:
    "ptrauth-indirect-gotos"

Signing uses IA and a per-function integer discriminator. The
discriminator isn't ABI-visible, and is currently:
    ptrauth_string_discriminator("<function_name> blockaddress")

A sufficiently sophisticated frontend could benefit from per-indirectbr
discrimination, which would need additional machinery, such as allowing
"ptrauth" bundles on indirectbr. For our purposes, the simple scheme
above is sufficient.

This approach doesn't support subtracting label addresses and using
the result as offsets, because each label address is signed.
Pointer arithmetic on signed pointers corrupts the signature bits,
and because label address expressions aren't typed beyond void*,
we can't do anything reliably intelligent on the arithmetic exprs.
Not signing addresses when used to form offsets would allow
easily hijacking control flow by overwriting the offset.

This diagnoses the basic cases (`&&lbl2 - &&lbl1`) in the frontend,
while we evaluate either alternative implementations (e.g., lowering
blockaddress to a bb number, and indirectbr to a checked jump-table),
or better diagnostics (both at the frontend level and on unencodable
IR constants).
2024-07-22 21:24:39 -07:00
Joseph Huber
615b7eeaa9 Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.

I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
2024-07-20 09:29:31 -05:00
NAKAMURA Takumi
740161a9b9 Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69.
(llvmorg-19-init-17714-gc05126bdfc3b)
See #99610
2024-07-20 12:36:57 +09:00
Joseph Huber
c05126bdfc
[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevent internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts-out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.

This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step, but does not want the overhead of
uncalled functions. (This adds like a second to the link time trivially)
2024-07-16 06:22:09 -05:00
Nikita Popov
4169338e75
[IR] Don't include Module.h in Analysis.h (NFC) (#97023)
Replace it with a forward declaration instead. Analysis.h is pulled in
by all passes, but not all passes need to access the module.
2024-06-28 14:30:47 +02:00
Farzon Lotfi
2f0308ed02
[arm64] Add tan intrinsic lowering (#94545)
This change is an implementation of
https://github.com/llvm/llvm-project/issues/87367's investigation on
supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This PR is just for Tan.

Now that x86 tan backend landed:
https://github.com/llvm/llvm-project/pull/90503 we can add other
backends since the shared pieces are in tree now.

Changes:
- `llvm/include/llvm/Analysis/VecFuncs.def` - vectorization of tan for
arm64 backends.
- `llvm/lib/Target/AArch64/AArch64FastISel.cpp` - Add tan to the libcall
table
- `llvm/lib/Target/AArch64/AArch64ISelLowering.cpp` - Add tan expansion
for f128, f16, and vector\neon operations
- `llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp` define
`G_FTAN` as a legal arm64 instruction

resolves #94755
2024-06-07 09:42:06 -04:00
David Majnemer
9e759f3523 [AArch64] Fix fptoi/itofp for bf16
There were a number of issues that needed to be addressed:
- i64 to bf16 did not correctly round
- strict rounding needed to yield a chain
- fastisel did not have logic to bail on bf16
2024-03-06 06:17:39 +00:00
Fangrui Song
201572e34b
[AArch64] Implement -fno-plt for SelectionDAG/GlobalISel
Clang sets the nonlazybind attribute for certain ObjC features. The
AArch64 SelectionDAG implementation for non-intrinsic calls
(46e36f0953aabb5e5cd00ed8d296d60f9f71b424) is behind a cl option.

GCC implements -fno-plt for a few ELF targets. In Clang, -fno-plt also
sets the nonlazybind attribute. For SelectionDAG, make the cl option not
affect ELF so that non-intrinsic calls to a dso_preemptable function use
GOT. Adjust AArch64TargetLowering::LowerCall to handle intrinsic calls.

For FastISel, change `fastLowerCall` to bail out when a call is due to
-fno-plt.

For GlobalISel, handle non-intrinsic calls in CallLowering::lowerCall
and intrinsic calls in AArch64CallLowering::lowerCall (where the
target-independent CallLowering::lowerCall is not called).
The GlobalISel test in `call-rv-marker.ll` is therefore updated.

Note: the current -fno-plt -fpic implementation does not use GOT for a
preemptable function.

Link: #78275

Pull Request: https://github.com/llvm/llvm-project/pull/78890
2024-03-05 13:55:29 -08:00
Sander de Smalen
41427b0e8e
[AArch64] Disable FastISel/GlobalISel for ZT0 state (#82768)
For __arm_new("zt0") we need to have special setup code in the prologue.
For calls that don't preserve zt0, we need to emit code preserve ZT0
around the call.
This is only emitted by SelectionDAG ISel at the moment.
2024-02-28 10:42:16 +00:00
Nico Weber
184ca39529
[llvm] Move CodeGenTypes library to its own directory (#79444)
Finally addresses https://reviews.llvm.org/D148769#4311232 :)

No behavior change.
2024-01-25 12:01:31 -05:00
Eli Friedman
a6065f0fa5
Arm64EC entry/exit thunks, consolidated. (#79067)
This combines the previously posted patches with some additional work
I've done to more closely match MSVC output.

Most of the important logic here is implemented in
AArch64Arm64ECCallLowering. The purpose of the
AArch64Arm64ECCallLowering is to take "normal" IR we'd generate for
other targets, and generate most of the Arm64EC-specific bits:
generating thunks, mangling symbols, generating aliases, and generating
the .hybmp$x table. This is all done late for a few reasons: to
consolidate the logic as much as possible, and to ensure the IR exposed
to optimization passes doesn't contain complex arm64ec-specific
constructs.

The other changes are supporting changes, to handle the new constructs
generated by that pass.

There's a global llvm.arm64ec.symbolmap representing the .hybmp$x
entries for the thunks. This gets handled directly by the AsmPrinter
because it needs symbol indexes that aren't available before that.

There are two new calling conventions used to represent calls to and
from thunks: ARM64EC_Thunk_X64 and ARM64EC_Thunk_Native. There are a few
changes to handle the associated exception-handling info,
SEH_SaveAnyRegQP and SEH_SaveAnyRegQPX.

I've intentionally left out handling for structs with small
non-power-of-two sizes, because that's easily separated out. The rest of
my current work is here. I squashed my current patches because they were
split in ways that didn't really make sense. Maybe I could split out
some bits, but it's hard to meaningfully test most of the parts
independently.

Thanks to @dpaoliello for extensive testing and suggestions.

(Originally posted as https://reviews.llvm.org/D157547 .)
2024-01-22 21:28:07 -08:00
Jannik Silvanus
7954c57124
[IR] Fix GEP offset computations for vector GEPs (#75448)
Vectors are always bit-packed and don't respect the elements' alignment
requirements. This is different from arrays. This means offsets of
vector GEPs need to be computed differently than offsets of array GEPs.

This PR fixes many places that rely on an incorrect pattern
that always relies on `DL.getTypeAllocSize(GTI.getIndexedType())`.
We replace these by usages of  `GTI.getSequentialElementStride(DL)`, 
which is a new helper function added in this PR.

This changes behavior for GEPs into vectors with element types for which
the (bit) size and alloc size is different. This includes two cases:

* Types with a bit size that is not a multiple of a byte, e.g. i1.
GEPs into such vectors are questionable to begin with, as some elements
  are not even addressable.
* Overaligned types, e.g. i16 with 32-bit alignment.

Existing tests are unaffected, but a miscompilation of a new test is fixed.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-01-04 10:08:21 +01:00
brendaso1
923f6ac018
[FastISel][AArch64] Compare Instruction Miscompilation Fix (#75993)
When shl is folded in compare instruction, a miscompilation occurs when
the CMP instruction is also sign-extended. For the following IR:

  %op3 = shl i8 %op2, 3
  %tmp3 = icmp eq i8 %tmp2, %op3

It used to generate

   cmp w8, w9, sxtb #3

which means sign extend w9, shift left by 3, and then compare with the
value in w8. However, the original intention of the IR would require
`%op2` to first shift left before extending the operands in the
comparison operation . Moreover, if sign extension is used instead of
zero extension, the sample test would miscompile. This PR creates a fix
for the issue, more specifically to not fold the left shift into the CMP
instruction, and to create a zero-extended value rather than a
sign-extended value.
2024-01-03 13:49:05 +08:00
Juergen Ributzka
6d1d7be133
Obsolete WebKit Calling Convention (#71567)
The WebKit Calling Convention was created specifically for the WebKit
FTL. FTL
doesn't use LLVM anymore and therefore this calling convention is
obsolete.

This commit removes the WebKit CC, its associated tests, and
documentation.
2023-11-09 09:08:41 -08:00
David Truby
cde9f9df79
[AArch64] Fix x18 being used by nest ptrs with MSVC ABI (#68008)
This patch fixes an issue where x18 is used for passing pointer
parameters with
the nest attribute on Windows on AArch64. The x18 register is reserved
in the
WoA ABI so can't be used for this purpose. This is fixed by introducing
a new
Win64PCS calling convention that differs from the standard AAPCS only by
not
using x18 for nest parameters.
2023-10-10 14:30:57 +01:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
David Green
13c2514df3 [AArch64] Disable GlobalISel/FastISel for more SME functions
The patch D136361 disabled GlobalISel and FastISel for some SME functions, as
the saving and restoring of SM is not yet handled. There were several tests
added for fp128 fadd, which will be expanded to a libcall, that only happened
to work by accident and did not handle other cases such as f32/f64 frem
libcalls.

This extends the cases where GlobalISel / FastISel is disabled for functions
with SME attributes, under the assumption that it is difficult to tell what
will become a libcall reliably, and so should fall back for all function until
GlobalISel and/or FastISel can handle them.

Differential Revision: https://reviews.llvm.org/D158490
2023-08-22 18:06:27 +01:00
Sergei Barannikov
01a7967447 [CodeGen] Replace CCState's getNextStackOffset with getStackSize (NFC)
The term "next stack offset" is misleading because the next argument is
not necessarily allocated at this offset due to alignment constrains.
It also does not make much sense when allocating arguments at negative
offsets (introduced in a follow-up patch), because the returned offset
would be past the end of the next argument.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D149566
2023-05-17 21:51:45 +03:00
NAKAMURA Takumi
c1221251fb Restore CodeGen/MachineValueType.h from Support
This is rework of;

  - rG13e77db2df94 (r328395; MVT)

Since `LowLevelType.h` has been restored to `CodeGen`, `MachinveValueType.h`
can be restored as well.

Depends on D148767

Differential Revision: https://reviews.llvm.org/D149024
2023-05-03 00:13:20 +09:00
Alexis Engelke
ab21beaccc [AArch64][FastISel] Handle CRC32 intrinsics
With a similar reason as D148023; some applications make heavy use of
the CRC32 intrinsic (e.g., as part of a hash function) and therefore
benefit from avoiding frequent SelectionDAG fallbacks. In our
application, we get a 2% compile-time improvement.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D148917
2023-04-28 11:29:23 +02:00
Alexis Engelke
7751a91465 [AArch64][FastISel] Handle call with multiple return regs
The code closely follows the X86 back-end. Applications that make heavy
use of {i64, i64} returns to use two registers strongly benefit from the
reduced number of SelectionDAG fallbacks.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D148346
2023-04-27 11:59:33 +02:00
Kazu Hirata
f8f3db2756 Use APInt::count{l,r}_{zero,one} (NFC) 2023-02-19 22:04:47 -08:00
Guillaume Chatelet
135f23d67b Deprecate MemIntrinsicBase::getDestAlignment() and MemTransferBase::getSourceAlignment()
Differential Revision: https://reviews.llvm.org/D141840
2023-01-16 14:22:03 +00:00
Leonard Chan
d629db535f Reland "[llvm] Teach FastISel for AArch64 about tagged globals"
This reverts commit aacf17aa0ca8a67efc0ad2d4cfd90e551b5d6a7f.

Fixed by using the right register class for the movk.
2022-12-05 23:27:36 +00:00
Leonard Chan
aacf17aa0c Revert "[llvm] Teach FastISel for AArch64 about tagged globals"
This reverts commit 7358c29a42714eb8d7d7bcdb58688d20430689e4.

This broke an upstream builder:
https://lab.llvm.org/buildbot/#/builders/16/builds/39356
2022-12-05 22:45:04 +00:00
Leonard Chan
7358c29a42 [llvm] Teach FastISel for AArch64 about tagged globals
This addresses https://github.com/llvm/llvm-project/issues/57750. For
some globals, the tag wasn't propagated correctly because the necessary
movk wasn't emitted sometimes.

Differential Revision: https://reviews.llvm.org/D138615
2022-12-05 22:16:55 +00:00
Sander de Smalen
c85bd250e5 Reland "[AArch64][SME] Disable GlobalISel/FastISel for SME functions."
It turns that the issue was unrelated to the code-changes, but only triggered
by one of the tests. The SMEABI pass incorrectly marked the CFG as preserved,
even though it modified the CFG.

This reverts commit 8bcf5df3043a906c7124b70b59eda925eddd7319.
2022-11-10 10:44:54 +00:00
Sander de Smalen
8bcf5df304 Revert "[AArch64][SME] Disable GlobalISel/FastISel for SME functions."
Reverting the patch due to a buildbot failure.

This reverts commit e1e260cc64bd900d5f3f88187c60cb02d3a805f5.
2022-11-09 21:14:25 +00:00
Sander de Smalen
e1e260cc64 [AArch64][SME] Disable GlobalISel/FastISel for SME functions.
This patch ensures that GlobalISel and FastISel fall back to regular DAG ISel when:
* A function requires streaming-mode to be enabled at the start/end of the function.
  This happens when the function has no streaming interface, but does have a streaming body.
* A function requires a lazy-save to be committed at the start of the function.
  This happens if the function has the `aarch64_pstate_za_new` attribute.
* A call to a function requires a change in streaming-mode.
* A call to a function requires a lazy-save buffer to be set up.

Patch by @CarolineConcatto

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D136361
2022-11-09 16:10:53 +00:00
Marco Elver
0ba8886af5 [FastISel] Propagate PCSections metadata to MachineInstr
Propagate PC sections metadata to MachineInstr when FastISel is doing
instruction selection.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130884
2022-09-07 11:36:01 +02:00
Sami Tolvanen
cff5bef948 KCFI sanitizer
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

```
.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...
```

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.

Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.

Relands 67504c95494ff05be2a613129110c9bcf17f6c13 with a fix for
32-bit builds.

Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay

Differential Revision: https://reviews.llvm.org/D119296
2022-08-24 22:41:38 +00:00
Sami Tolvanen
a79060e275 Revert "KCFI sanitizer"
This reverts commit 67504c95494ff05be2a613129110c9bcf17f6c13 as using
PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.
2022-08-24 19:30:13 +00:00
Sami Tolvanen
67504c9549 KCFI sanitizer
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

```
.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...
```

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.

Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.

Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay

Differential Revision: https://reviews.llvm.org/D119296
2022-08-24 18:52:42 +00:00
Kazu Hirata
109df7f9a4 [llvm] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-13 12:55:42 -07:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Zongwei Lan
ad73ce318e [Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
2022-05-26 11:22:41 -07:00
David Spickett
c3b98194df Reland "[llvm][AArch64] Insert "bti j" after call to setjmp"
This reverts commit edb7ba714acba1d18a20d9f4986d2e38aee1d109.

This changes BLR_BTI to take variable_ops meaning that we can accept
a register or a label. The pattern still expects one argument so we'll
never get more than one. Then later we can check the type of the operand
to choose BL or BLR to emit.

(this is what BLR_RVMARKER does but I missed this detail of it first time around)

Also require NoSLSBLRMitigation which I missed in the first version.
2022-03-23 11:43:43 +00:00
David Spickett
edb7ba714a Revert "[llvm][AArch64] Insert "bti j" after call to setjmp"
This reverts commit eb5ecbbcbb6ce38e29237ab5d17156fcb2e96e74
due to failures on buildbots with expensive checks enabled.
2022-03-23 10:43:20 +00:00
David Spickett
eb5ecbbcbb [llvm][AArch64] Insert "bti j" after call to setjmp
Some implementations of setjmp will end with a br instead of a ret.
This means that the next instruction after a call to setjmp must be
a "bti j" (j for jump) to make this work when branch target identification
is enabled.

The BTI extension was added in armv8.5-a but the bti instruction is in the
hint space. This means we can emit it for any architecture version as long
as branch target enforcement flags are passed.

The starting point for the hint number is 32 then call adds 2, jump adds 4.
Hence "hint #36" for a "bti j" (and "hint #34" for the "bti c" you see
at the start of functions).

The existing Arm command line option -mno-bti-at-return-twice has been
applied to AArch64 as well.

Support is added to SelectionDAG Isel and GlobalIsel. FastIsel will
defer to SelectionDAG.

Based on the change done for M profile Arm in https://reviews.llvm.org/D112427

Fixes #48888

Reviewed By: danielkiss

Differential Revision: https://reviews.llvm.org/D121707
2022-03-23 09:51:02 +00:00
Jim Lin
d6b0734837 [NFC] Use Register instead of unsigned 2022-01-19 20:17:04 +08:00
Simon Pilgrim
71e39e3f18 [ADT] Add APInt::isNegatedPowerOf2() helper
Inspired by D111968, provide a isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos.....

Differential Revision: https://reviews.llvm.org/D111998
2021-10-19 14:38:21 +01:00
Kazu Hirata
c1e32b3fc0 [Target] Migrate from getNumArgOperands to arg_size (NFC)
Note that getNumArgOperands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-10-02 12:06:29 -07:00
Kazu Hirata
f631173d80 [llvm] Migrate from arg_operands to args (NFC)
Note that arg_operands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-09-30 08:51:21 -07:00
Eli Friedman
6c04b7dd4f [AArch64] Optimize overflow checks for [s|u]mul.with.overflow.i32.
Saves one instruction for signed, uses a cheaper instruction for
unsigned.

Differential Revision: https://reviews.llvm.org/D105770
2021-07-12 15:30:42 -07:00
Tim Northover
ea0eec69f1 IR+AArch64: add a "swiftasync" argument attribute.
This extends any frame record created in the function to include that
parameter, passed in X22.

The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:

  * 0b0000 => a normal, non-extended record with just [FP, LR]
  * 0b0001 => the extended record [X22, FP, LR]
  * 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.

If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
2021-05-14 11:43:58 +01:00
Fangrui Song
7b0756a51a [AArch64] Fix some coding standard issues related to namespace llvm
https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions
2021-05-05 15:27:16 -07:00
Philip Reames
4824d876f0 Revert "Allow invokable sub-classes of IntrinsicInst"
This reverts commit d87b9b81ccb95217181ce75515c6c68bbb408ca4.

Post commit review raised concerns, reverting while discussion happens.
2021-04-20 15:38:38 -07:00