12216 Commits

Author SHA1 Message Date
Florian Hahn
b1a5ee1feb
[ARM] Check all terms in emitPopInst when clearing Restored for LR. (#75527)
emitPopInst checks a single function exit MBB. If other paths also exit
the function and any of there terminators uses LR implicitly, it is not
save to clear the Restored bit.

Check all terminators for the function before clearing Restored.

This fixes a mis-compile in outlined-fn-may-clobber-lr-in-caller.ll
where the machine-outliner previously introduced BLs that clobbered LR
which in turn is used by the tail call return.

Alternative to #73553
2023-12-20 16:56:15 +01:00
Serge Pavlov
2f81788067
[ARM][FPEnv] Lowering of fpmode intrinsics (#74054)
LLVM intrinsics `get_fpmode`, `set_fpmode` and `reset_fpmode` operate
control modes, the bits of FP environment that affect FP operations. On
ARM these bits are in FPSCR together with the status bits. The
implementation of these intrinsics produces code close to that of
functions `fegetmode` and `fesetmode` from GLIBC.

Pull request: https://github.com/llvm/llvm-project/pull/74054
2023-12-18 18:57:36 +07:00
ostannard
4888218d03
[ARM] Do not emit unwind tables when saving LR around outlined call (#69611)
In some cases, the machine outliner needs to preserve LR across an
outlined call by pushing it onto the stack. Previously, this also
generated unwind table instructions, which is incorrect because EHABI
unwind tables cannot represent different stack frames a different points
in the function, so the extra unwind info applied to the entire
function.

The outliner code already avoided generating CFI instructions, but EHABI
unwind data is generated later from the actual instructions, so we need
to avoid using the FrameSetup and FrameDestroy flags to prevent unwind
data being generated.
2023-12-14 14:46:13 +00:00
Kazu Hirata
586ecdf205
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-11 21:01:36 -08:00
Jonathan Thackray
f576cbe44e
[AArch64] Correctly mark Neoverse N2 as an Armv9.0a core (#75055)
Neoverse N2 was incorrectly marked as an Armv8.5a core. This has been
changed to an Armv9.0a core. However, crypto options are not enabled
by default for Armv9 cores, so -mcpu=neoverse-n2+crypto is required
to enable crypto for this core.

Neoverse N2 Technical Reference Manual:
   https://developer.arm.com/documentation/102099/0003/
2023-12-11 18:52:25 +00:00
Kazu Hirata
d57a26a714 [ARM] Include bit.h instead of MathExtras.h (NFC) 2023-12-10 11:58:52 -08:00
Kazu Hirata
b85f1f9b18 [Target] Include bitset (NFC)
These files are relying on the transitive include of <bitset> from
GIMatchTableExecutor.h, which doesn't actually use std::bitset.
2023-12-09 18:34:57 -08:00
Jonathan Thackray
8758e648da
[ARM][AArch32] Add support for AArch32 Cortex-M52 CPU (#74822)
Cortex-M52 is an Armv8.1 AArch32 CPU.

Technical specifications available at:
  https://developer.arm.com/processors/cortex-m52
2023-12-08 15:04:08 +00:00
Kazu Hirata
286ef12b47 [Target] Remove unnecessary includes (NFC) 2023-12-07 21:03:56 -08:00
Craig Topper
e87f33d9ce
[RISCV][MC] Pass MCSubtargetInfo down to shouldForceRelocation and evaluateTargetFixup. (#73721)
Instead of using the STI stored in RISCVAsmBackend, try to get it from
the MCFragment.

This addresses the issue raised here
https://discourse.llvm.org/t/possible-problem-related-to-subtarget-usage/75283
2023-12-07 13:17:58 -08:00
Kazu Hirata
124b4ab85a [Target] Stop including bitset (NFC)
Identified with clangd.
2023-12-05 20:48:04 -08:00
Craig Topper
e888e83fb6
[ARM][AArch64] Use SelectionDAG::SplitScalar to simplify some code. (#74411)
We know we're splitting a type in half to two legal values. Instead of
using shift and truncate that need to be legalized, we can use two
ISD::EXTRACT_ELEMENTs.

Spotted while reviewing #67918 for RISC-V which copied this code.
2023-12-05 07:51:54 -08:00
Ramkumar Ramachandra
e8dbe945f3
TargetInstrInfo, TargetSchedule: fix non-NFC parts of 9468de4 (#74338)
Follow up on a post-commit review of 9468de4 (TargetInstrInfo: make
getOperandLatency return optional (NFC)) by Bjorn Pettersson to fix a
couple of things that are not NFC:

- std::optional<T>::operator<= returns true if the first operand is a
std::nullopt and second operand is T. Fix a couple of places where we
assumed it would return false.
- In TargetSchedule, computeInstrCost could take another codepath,
returning InstrLatency instead of DefaultDefLatency. Fix one instance
not accounting for this behavior.
2023-12-05 08:18:17 +00:00
Momchil Velikov
e3a97dffee
[Verifier] Check function attributes related to branch protection (NFC) (#70565) 2023-12-04 16:16:55 +00:00
Simon Pilgrim
83e01ea1a5 Fix MSVC signed/unsigned mismatch warning. NFC. 2023-12-04 11:11:33 +00:00
Simon Pilgrim
5c672d87ea Fix MSVC signed/unsigned mismatch warning. NFC. 2023-12-04 10:48:33 +00:00
Kazu Hirata
92c2529ccd [llvm] Stop including vector (NFC)
Identified with clangd.
2023-12-03 22:32:21 -08:00
Kazu Hirata
f6d6809d78 [llvm] Stop including array (NFC)
Identified with clangd.
2023-12-02 00:52:25 -08:00
Ramkumar Ramachandra
d222fa4521
TargetInstrInfo: squelch a signedness warning on MSVC (#74078)
Follow up on 9468de4 (TargetInstrInfo: make getOperandLatency return
optional (NFC)) to squelch a signedness warning on MSVC, reported by
Simon Pilgrim.
2023-12-01 16:08:41 +00:00
Eleanor Bonnici
6e3b2cb46e
[llvm][MC][ARM][Assembly] Emit relocations for ADRs and big-endian targets (#73834)
Follow-up on https://github.com/llvm/llvm-project/pull/72873/
    
When ADR/LDR instructions reference a label in a different section, the
offset is not known until link time, however, the assembler assumes it
    can resolve them in some cases.
    
    The previous patch addressed the issue for most LDR instructions,
    focusing on little-endian targets.
    
This patch addresses the remaining work for ADRs and big-endian targets.
2023-12-01 13:54:04 +00:00
Ramkumar Ramachandra
9468de48fc
TargetInstrInfo: make getOperandLatency return optional (NFC) (#73769)
getOperandLatency has the following behavior: it returns -1 as a special
value, negative numbers other than -1 on some target-specific overrides,
or a valid non-negative latency. This behavior can be surprising, as
some callers do arithmetic on these negative values. Change the
interface of getOperandLatency to return a std::optional<unsigned> to
prevent surprises in callers. While at it, change the interface of
getInstrLatency to return unsigned instead of int.

This change was inspired by a refactoring in
TargetSchedModel::computeOperandLatency.
2023-12-01 11:29:19 +00:00
Eleanor Bonnici
bbc5d9fe42
[llvm][MC][ARM][Assembly] Emit relocations for LDRs (#72873)
It's possible (though inadvisable) to use LDR and refer to labels in
different
sections. In the Arm state, the assembler resolves the LDR instruction
without
emitting a relocation. That's incorrect because the assembler cannot
make any
assumptions about the relative position of the sections and the compiler
output
is therefore wrong.

This patch ensures relocations are generated for all `LDR <Rt...>,
label`
instructions in the Arm state (little endian). This is not necessary
when the
label is in the same section but the relocation is now generated
regardless.
Instructions that now generate relocations have been removed from the
pcrel-global.s test.

Fortunately, LLD already implements the generated relocations and can
fix LDR
instructions when the symbol is in a different section, or report an
error if
the offset is too large for the immediate field in the particular LDR's
encoding.

The patch to address this problem for big endian targets will follow, as
well
as a fix for ADR that exhibits a similar behavior.
2023-11-25 12:36:00 +00:00
Nikita Popov
c1e3a94105 [TargetLowering] Don't include ComplexDeinterleavingPass.h (NFC)
TargetLowering.h shouldn't include any passes and thus pull in
the entire pass infrastructure. Replace the include with forward
declarations.
2023-11-24 12:13:38 +01:00
simpal01
74cdb8e6f8
[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features (#71545)
The floating-point and MVE features together specify the MVE
functionality that is supported on the Cortex-M85 processor. But the FPU
extension for the underlying architecture(armv8.1-m.main) is FPV5 which
does not include MVE-F. So Compiler's -S output and `-save-temps=obj`
loses MVE feature which leads to assembler error. What happening here is
.fpu directive overrides any previously set features by .cpu directive.
Since the the corresponding .fpu generated (.fpu fpv5-d16) does not
include MVE-F, it overrides those features even though it is supported
and set by the .cpu directive. Looks like .fpu is supposed to do this.

In this case, there should be an .arch_extension directive re-enabling
the relevant extensions after .fpu if the goal is to keep these
extensions enabled. GCC also does the same.

So this patch enables the MVE features by emitting the below arch
extension:
  .fpu fpv5-d16
  .arch_extension mve.fp

---------

Co-authored-by: Simi Pallipurath <simi.pallipurath.com>
2023-11-22 09:16:58 +00:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize,
and LHS and RHS are both objects of class TypeSize. So this is
evaluating
if the pointer to the function Scalable == the pointer to the function
Scalable,
which is always true because LHS and RHS have the same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`,
so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
2023-11-22 08:52:53 +00:00
Simon Pilgrim
cfee7152d4 [DAG] clang-format createBranchMacroFusionDAGMutation calls. NFC.
Reduces diff in #72227
2023-11-20 12:13:09 +00:00
Serge Pavlov
a2e1de1934 [ARM][FPEnv] Lowering of fpenv intrinsics
The change implements lowering of `get_fpenv`, `set_fpenv` and
`reset_fpenv`.

Differential Revision: https://reviews.llvm.org/D81843
2023-11-20 15:08:25 +07:00
Alex Bradbury
5b3eb1bc22
[ARM][X86][NFC] Use lambda to avoid duplicate switches in areLoadsFromSameBasePtr (#72376)
Both the Arm and X86 implementations of areLoadsFromSameBasePtr use a
switch over the machine opcode, and repeat the same logic for both
SDNode operands. We can avoid the duplicated logic (especially lengthy
in the X86 case) by just using a lambda. This could obviously be a
candidate for moving out to a separate helper function if there were
other users, but I've made the minimal change in this patch.
2023-11-15 12:35:35 +00:00
Kazu Hirata
01702c3f7f [llvm] Stop including llvm/ADT/SmallSet.h (NFC)
Identified with clangd.
2023-11-11 12:32:15 -08:00
Serge Pavlov
5b0f703918 Revert "[ARM][FPEnv] Lowering of fpenv intrinsics"
This reverts commit d62f040418bd167d1ddd2b79c640a90c0c2ea353.
Some cuda buildbots start failing.
2023-11-10 16:24:51 +07:00
Serge Pavlov
d62f040418 [ARM][FPEnv] Lowering of fpenv intrinsics
The change implements lowering of `get_fpenv`, `set_fpenv` and
`reset_fpenv`.

Differential Revision: https://reviews.llvm.org/D81843
2023-11-10 16:06:33 +07:00
Craig Topper
c4821073cd
[GISel] Make target's PartMapping, ValueMapping, and BankIDToCopyMapIdx arrays const. (#71079)
AMDGPU arrays were already const.
2023-11-09 17:03:56 -08:00
Paulo Matos
7b9d73c2f9
[NFC] Remove Type::getInt8PtrTy (#71029)
Replace this with PointerType::getUnqual().
Followup to the opaque pointer transition. Fixes an in-code TODO item.
2023-11-07 17:26:26 +01:00
David Green
fee2953f23
[ARM] Fix for undef elements from demanded elements (#70504)
I think this is right, that the undef bits should be the undef bits from
the passthrough (operand 0), with the top/bottom lanes cleared, as they
come from the second arg (operand 1). We don't yet attempt to look for
undef elements in the second operand, but this should fix the bug with
all elements being marked as undef and the instruction being optimized
away.
2023-11-02 14:28:40 +00:00
Fangrui Song
5888dee7d0
[ARM,ELF] Fix access to dso_preemptable __stack_chk_guard with static relocation model (#70014)
The ELF code from https://reviews.llvm.org/D112811 emits LDRLIT_ga_pcrel
when `TM.isPositionIndependent()` but uses a different condition
`Subtarget.isGVIndirectSymbol(GV)` (aka dso_preemptable on ELF targets).
This would cause incorrect access for dso_preemptable
`__stack_chk_guard` with the static relocation model.

Regarding whether `__stack_chk_guard` gets the dso_local specifier,
https://reviews.llvm.org/D150841 switched to
`M.getDirectAccessExternalData()` (implied by "PIC Level") instead of
`TM.getRelocationModel() == Reloc::Static`.

The result is that when non-zero "PIC Level" is used with static
relocation model (e.g. -fPIE/-fPIC LTO compiles with -no-pie linking),
`__stack_chk_guard` accesses are incorrect.
```
        ldr     r0, .LCPI0_0
        ldr     r0, [r0]
        ldr     r0, [r0]   // incorrectly dereferences __stack_chk_guard
...
.LCPI0_0:
        .long   __stack_chk_guard
```

To fix this, for dso_preemptable `__stack_chk_guard`, emit a GOT PIC
code sequence like for -fpic using `LDRLIT_ga_pcrel`:
```
        ldr     r0, .LCPI0_0
.LPC0_0:
        add     r0, pc, r0
        ldr     r0, [r0]
        ldr     r0, [r0]
...
LCPI0_0:
.Ltmp0:
        .long   __stack_chk_guard(GOT_PREL)-((.LPC0_0+8)-.Ltmp0)
```

Technically, `LDRLIT_ga_abs` with `R_ARM_GOT_ABS` could be used, but
`R_ARM_GOT_ABS` does not have GNU or integrated assembler support.
(Note, `.LCPI0_0: .long __stack_chk_guard@GOT` produces an
`R_ARM_GOT_BREL`, which is not desired).

This patch fixes #6499 while not changing behavior for the following
configurations:
```
run arm.linux.nopic --target=arm-linux-gnueabi -fno-pic
run arm.linux.pie --target=arm-linux-gnueabi -fpie
run arm.linux.pic --target=arm-linux-gnueabi -fpic
run armv6.darwin.nopic --target=armv6-apple-darwin -fno-pic
run armv6.darwin.dynamicnopic --target=armv6-apple-darwin -mdynamic-no-pic
run armv6.darwin.pic --target=armv6-apple-darwin -fpic
run armv7.darwin.nopic --target=armv7-apple-darwin -mcpu=cortex-a8 -fno-pic
run armv7.darwin.dynamicnopic --target=armv7-apple-darwin -mcpu=cortex-a8 -mdynamic-no-pic
run armv7.darwin.pic --target=armv7-apple-darwin -mcpu=cortex-a8 -fpic
run arm64.darwin.pic --target=arm64-apple-darwin
```
2023-10-31 15:37:26 -07:00
David Green
75b3c3d267
[ARM] Disable UpperBound loop unrolling for MVE tail predicated loops. (#69709)
For MVE tail predicated loops, better code can be generated by keeping
the loop whole than to unroll to an upper bound, which requires the
expansion of active lane masks that can be difficult to generate good
code for. This patch disables UpperBound unrolling when we find a
active_lane_mask in the loop.
2023-10-31 09:51:30 +00:00
Fangrui Song
8e247b8f47 Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC 2023-10-27 00:30:41 -07:00
Craig Topper
2f4328e697
[GISel] Make assignValueToReg take CCValAssign by const reference. (#70086)
This was previously passed by value. It used to be passed by non-const
reference, but it was changed to value in D110610. I'm not sure why.
2023-10-24 15:47:04 -07:00
Craig Topper
9f592cbc18
[GISel] Pass MPO and VA to assignValueToAddress by const reference. NFC (#69810)
Previously they were passed by non-const reference. No in tree target
modifies the values.

This makes it possible to call assignValueToAddress from
assignCustomValue without a const_cast. For example in this patch
https://github.com/llvm/llvm-project/pull/69138.
2023-10-24 09:58:22 -07:00
David Green
8a701024f3 [ARM] Lower i1 concat via MVETRUNC
The MVETRUNC operation can perform the same truncate of two vectors, without
requiring lane inserts/extracts from every vector lane. This moves the concat
i1 lowering to use it for v8i1 and v16i1 result types, trading a bit of extra
stack space for less instructions.
2023-10-18 19:40:11 +01:00
David Green
c060757bcc [ARM] Correct v2i1 concat extract types.
For two v2i1 concat into a v4i1, we cannot extract each i64 element as an i32.
This casts to a v4i32 instead and extracts the correct vector lanes.
2023-10-18 13:40:38 +01:00
Kazu Hirata
9bcc094d37 [llvm] Use llvm::erase_if (NFC) 2023-10-12 22:59:25 -07:00
Kazu Hirata
4a0ccfa865 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-12 21:21:45 -07:00
ostannard
b98b567c25
[ARM] Correctly handle .inst in IT and VPT blocks (#68902)
Advance the IT and VPT block state when parsing the .inst directive, so
that it is possible to use them to emit conditional instructions. If we
don't do this, then a later instruction inside or just after the block
will have a mis-matched condition, so be incorrectly reported as an
error.
2023-10-12 17:03:01 +01:00
Kazu Hirata
b8885926f8 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an enum.
This patch replaces llvm::support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-10 22:54:51 -07:00
Kazu Hirata
a9d5056862 Use llvm::endianness (NFC)
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form.  This patch replaces
support::endianness with llvm::endianness.
2023-10-10 21:54:15 -07:00
Kazu Hirata
d7b18d5083 Use llvm::endianness{,::little,::native} (NFC)
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form.  This patch replaces
llvm::support::endianness with llvm::endianness.
2023-10-09 00:54:47 -07:00
Alexey Bataev
e22818d5c9 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-10-05 06:17:07 -07:00
Arthur Eubanks
07389535a7 Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit b186f1f68be11630355afb0c08b80374a6d31782.

Causes crashes, see https://reviews.llvm.org/D158449.
2023-10-04 14:37:16 -07:00
Alexey Bataev
b186f1f68b [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-10-04 07:53:30 -07:00