364 Commits

Author SHA1 Message Date
David Green
f7ab364f5e
[ARM] Add basic NPM support for LoadStoreOptimizer (#184139)
This is similar to #184090 for ARM, porting the LoadStoreOptimizer to
the new pass manager. The time there are both a pre-ra and post-ra
variant that are ported.
2026-03-08 19:28:27 +00:00
Jonathan Thackray
cf21ea9c99
[ARM] Fix more typos (NFC)
Fix more typos in the AArch64 codebase using the
https://github.com/crate-ci/typos Rust package.

commit-id:33a1bb8d

Reviewers: davemgreen

Reviewed By: davemgreen

Pull Request: https://github.com/llvm/llvm-project/pull/183087
2026-03-06 22:27:35 +00:00
David Green
e0fa4952fd [ARM] Format ARMLoadStoreOptimizer Pass classes. NFC
These will need modifications to support the NPM, pre-format the classes to
reduce the needed differences.
2026-03-02 14:21:12 +00:00
Matt Arsenault
55422e804b
CodeGen: Remove TRI argument from getRegClass (#158225)
TargetInstrInfo now directly holds a reference to TargetRegisterInfo
and does not need TRI passed in anywhere.
2025-11-10 15:43:55 -08:00
Matt Arsenault
7289f2cd0c
CodeGen: Remove MachineFunction argument from getRegClass (#158188)
This is a low level utility to parse the MCInstrInfo and should
not depend on the state of the function.
2025-09-12 19:22:02 +09:00
Rahul Joshi
52c2e45c11
[NFC][CodeGen] Adopt MachineFunctionProperties convenience accessors (#141101) 2025-05-23 08:30:29 -07:00
Kazu Hirata
e555ccaa4d
[llvm] Call *Map::erase directly (NFC) (#135545) 2025-04-13 12:04:40 -07:00
Craig Topper
2a48995a03 [ARM] Pass ArrayRef by value instead of const reference. NFC 2025-03-13 23:07:45 -07:00
Kazu Hirata
9571cc2b28
[ARM] Remove unused includes (NFC) (#115995)
Identified with misc-include-cleaner.
2024-11-12 23:15:21 -08:00
Kazu Hirata
eef6c0926e
[ARM] Avoid repeated hash lookups (NFC) (#111935) 2024-10-11 08:58:06 -07:00
Kazu Hirata
126ed16525 [ARM] Fix formatting (NFC)
I'm about to post a PR in this area.
2024-10-10 20:30:04 -07:00
Kazu Hirata
9e535743a4
[ARM] Avoid repeated hash lookups (NFC) (#109569) 2024-09-22 07:50:45 -07:00
Kazu Hirata
a5d89d5048
[Target] Use llvm::replace (NFC) (#105942) 2024-08-24 10:02:01 -07:00
Kazu Hirata
3e47f6ba4a Rapply "[Target] Use range-based for loops (NFC) (#98844)"
This iteration drops hunks where the loop body adds more elements.
2024-07-17 19:39:04 -07:00
Kazu Hirata
515618e245 Revert "[Target] Use range-based for loops (NFC) (#98844)"
This reverts commit 3614f65a7ba9d925010e3316a1d93bcebc632178.

fixupImmediateBr seems to resize ImmBranches.
2024-07-15 20:39:49 -07:00
Kazu Hirata
3614f65a7b
[Target] Use range-based for loops (NFC) (#98844) 2024-07-15 17:23:11 -07:00
paperchalice
837dc542b1
[CodeGen][NewPM] Split MachineDominatorTree into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
2024-06-11 21:27:14 +08:00
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and  `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI  parameters, as shown in issue #82411.

Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`,  `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
AtariDreams
7c21495fee
Reapply "Convert many LivePhysRegs uses to LiveRegUnits" (#84338)
This only converts the instances where all that is needed is to change
the variable type name.

Basically, anything that involves a function that LiveRegUnits does not
directly have was skipped to play it safe.

Reverts
7a0e222a17
2024-03-08 19:05:00 +05:30
Jay Foad
7a0e222a17 Revert "Convert many LivePhysRegs uses to LiveRegUnits (#83905)"
This reverts commit 2a13422b8bcee449405e3ebff957b4020805f91c.

It was causing test failures on the expensive check builders.
2024-03-07 08:20:26 +00:00
AtariDreams
2a13422b8b
Convert many LivePhysRegs uses to LiveRegUnits (#83905) 2024-03-06 10:38:14 +05:30
ostannard
749384c08e
[ARM] Update IsRestored for LR based on all returns (#82745)
PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based
on all of the return instructions in the function, not just one.
However, there is also code in ARMLoadStoreOptimizer which changes
return instructions, but it set IsRestored based on the one instruction
it changed, not the whole function.

The fix is to factor out the code added in #75527, and also call it from
ARMLoadStoreOptimizer if it made a change to return instructions.

Fixes #80287.
2024-02-26 12:23:25 +00:00
Kazu Hirata
af8d050286 [Target] Use range-based for loops (NFC) 2023-12-24 23:09:55 -08:00
Kazu Hirata
9bcc094d37 [llvm] Use llvm::erase_if (NFC) 2023-10-12 22:59:25 -07:00
Maurice Heumann
a1cdb323e2 [ARM] Adjust strd/ldrd codegen alignment requirements
In change https://reviews.llvm.org/D152790, it was discovered that the
alignment requirement calculation for LDRD/STRD codegen was suboptimal
and the calculation for volatile loads and stores was adjusted.

This change here adopts the calculation for the remaining non-volatile
occurances.

Recommitting after undefined behavior fix in D155093.

Differential Revision: https://reviews.llvm.org/D153800
2023-07-14 12:54:18 -07:00
David Spickett
ab3bb86d44 Revert "[ARM] Adjust strd/ldrd codegen alignment requirements"
This reverts commit 92a9c30c61da7f973d55cd84fade424159b9cac9.

This has caused a test failure in the 2nd stage of Linaro's
Arm 32 bit buildbots.

LLVM::simplified-template-names.s

            7: error: Simplified template DW_AT_name could not be reconstituted:
check:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            8:  original: f3<unsigned char, (unsigned char)'\x00'>
check:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            9:  reconstituted: f3<unsigned char, (unsigned char)'\x7f'>
check:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I suspect a load/store is slightly off.
2023-07-03 14:05:49 +00:00
Maurice Heumann
92a9c30c61 [ARM] Adjust strd/ldrd codegen alignment requirements
In change https://reviews.llvm.org/D152790, it was discovered that the
alignment requirement calculation for LDRD/STRD codegen was suboptimal
and the calculation for volatile loads and stores was adjusted.

This change here adopts the calculation for the remaining non-volatile
occurances.

Differential Revision: https://reviews.llvm.org/D153800
2023-07-02 14:25:25 -07:00
NAKAMURA Takumi
00f8bbf07d Fix a warning in D149762 [-Wunused-variable] 2023-05-04 12:11:05 +09:00
Shubham Sandeep Rastogi
ee58f49a78 Change if() continue; to an assert if a DBG_VALUE or DBG_VALUE_LIST returns a null DILocalVariable
A  DBG_VALUE or DBG_VALUE_LIST must always return a non-null
DILocalVariable, the ARMLoadStoreOptimizer code that move’s DBG_VALUE
and DBG_VALUE_LIST instructions if their corresponding loads have been
moved, currently just continues if it finds a DBG_VALUE or
DBG_VALUE_LIST with a null DILocalVariable, change that to an assert.

Differential revision: https://reviews.llvm.org/D149762
2023-05-03 14:19:20 -07:00
Shubham Sandeep Rastogi
a971bc38ce Move DBG_VALUE's that depend on loads to after a
load if the load is moved due to the pre register allocation ld/st
optimization pass

The issue here is that there can be a scenario where debug information
is lost because of the pre register allocation load store optimization
pass, where a load who's result describes the debug infomation for a
local variable gets moved below the load and that causes the debug
information for that load to get lost.

Example:

Before the Pre Register Allocation Load Store Pass
inst_a
%2 = ld ...
inst_b
DBG_VALUE %2, "x", ...
%3 = ld ...

After the Pass:
inst_a
inst_b
DBG_VALUE %2, "x", ...
%2 = ld ...
%3 = ld ...

The load has now been moved to after the DBG_VAL that uses its result
and the debug info for "x" has been lost. What we want is:

inst_a
inst_b
%2 = ld ...
DBG_VALUE %2, "x", ...
%3 = ld ...

Which is what this patch addresses

Differential Revision: https://reviews.llvm.org/D145168
2023-04-24 16:10:54 -07:00
Shubham Sandeep Rastogi
9bc5e8c87e Revert "Move DBG_VALUE's that depend on loads to after a"
This reverts commit 0aaf634152f25a805563d552e72d89e8202d84f2.

Reverted this because of build failure https://lab.llvm.org/buildbot#builders/245/builds/7035

/home/tcwg-buildbot/worker/clang-armv8-quick/llvm/llvm/test/DebugInfo/Generic/incorrect-variable-debugloc1.ll:28:12: error: DWARF23: expected string not found in input
; DWARF23: DW_OP_lit13{{$}}
          ^
<stdin>:1:1: note: scanning from here
-: file format elf32-littlearm
^
<stdin>:19:20: note: possible intended match here
DW_AT_frame_base (DW_OP_reg13 SP)
                  ^
2023-04-12 12:45:53 -07:00
Shubham Sandeep Rastogi
0aaf634152 Move DBG_VALUE's that depend on loads to after a
load if the load is moved due to the pre register allocation ld/st
optimization pass

The issue here is that there can be a scenario where debug information
is lost because of the pre register allocation load store optimization
pass, where a load who's result describes the debug infomation for a
local variable gets moved below the load and that causes the debug
information for that load to get lost.

Example:

Before the Pre Register Allocation Load Store Pass
inst_a
%2 = ld ...
inst_b
DBG_VALUE %2, "x", ...
%3 = ld ...

After the Pass:
inst_a
inst_b
DBG_VALUE %2, "x", ...
%2 = ld ...
%3 = ld ...

The load has now been moved to after the DBG_VAL that uses its result
and the debug info for "x" has been lost. What we want is:

inst_a
inst_b
%2 = ld ...
DBG_VALUE %2, "x", ...
%3 = ld ...

Which is what this patch addresses

Differential Revision: https://reviews.llvm.org/D145168
2023-04-12 12:10:58 -07:00
David Green
76f60931e2 [ARM] Allow distributing postinc with PHI uses
Although this doesn't usually come up, we can have uses of the
BaseAccess of a distributed postinc being a PHI. This doesn't need the
usual dominance check as we will dominate along the phi edge, allowing
us to still create a postinc load/store.

Differential Revision: https://reviews.llvm.org/D127676
2022-06-20 10:08:21 +01:00
Zongwei Lan
ad73ce318e [Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
2022-05-26 11:22:41 -07:00
serge-sans-paille
989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
serge-sans-paille
ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Nico Weber
a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille
7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Kazu Hirata
483499670e [Target] Use llvm::reverse (NFC) 2021-12-12 08:34:24 -08:00
Ties Stuij
63eb7ff47d [ARM] Implement PAC return address signing mechanism for PACBTI-M
This patch implements PAC return address signing for armv8-m. This patch roughly
accomplishes the following things:

- PAC and AUT instructions are generated.
- They're part of the stack frame setup, so that shrink-wrapping can move them
inwards to cover only part of a function
- The auth code generated by PAC is saved across subroutine calls so that AUT
can find it again to check
- PAC is emitted before stacking registers (so that the SP it signs is the one
on function entry).
- The new pseudo-register ra_auth_code is mentioned in the DWARF frame data
- With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT
validates the corresponding value of SP
- Emit correct unwind information when PAC is replaced by PACBTI
- Handle tail calls correctly

Some notes:

We make the assembler accept the `.save {ra_auth_code}` directive that is
emitted by the compiler when it saves a register that contains a
return address authentication code.

For EHABI we need to have the `FrameSetup` flag on the instruction and
handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit
`.save {ra_auth_code}`, instead of `.save {r12}`.

For PACBTI-M, the instruction which computes return address PAC should use SP
value before adjustment for the argument registers save are (used for variadic
functions and when a parameter is is split between stack and register), but at
the same it should be after the instruction that saves FPCXT when compiling a
CMSE entry function.

This patch moves the varargs SP adjustment after the FPCXT save (they are never
enabled at the same time), so in a following patch handling of the `PAC`
instruction can be placed between them.

Epilogue emission code adjusted in a similar manner.

PACBTI-M code generation should not emit any instructions for architectures
v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases
is handled separately by a future ticket.

note on tail calls:

If the called function has four arguments that occupy registers `r0`-`r3`, the
only option for holding the function pointer itself is `r12`, but this register
is used to keep the PAC during function/prologue epilogue and clobbers the
function pointer.

When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to
keep six values - the four function arguments, the function pointer and the PAC,
which is obviously impossible.

One option would be to authenticate the return address before all callee-saved
registers are restored, so we have a scratch register to temporarily keep the
value of `r12`. The issue with this approach is that it violates a fundamental
invariant that PAC is computed using CFA as a modifier. It would also mean using
separate instructions to pop `lr` and the rest of the callee-saved registers,
which would offset the advantages of doing a tail call.

Instead, this patch disables indirect tail calls when the called function take
four or more arguments and the return address sign and authentication is enabled
for the caller function, conservatively assuming the caller function would spill
LR.

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Momchil Velikov
- Ties Stuij

Reviewed By: danielkiss

Differential Revision: https://reviews.llvm.org/D112429
2021-12-07 10:15:19 +00:00
David Green
b8f1ccb0ac [ARM] Introduce i8neg and i8pos addressing modes
Some instructions with i8 immediate ranges can only hold negative values
(like t2LDRHi8), only hold positive values (like t2STRT) or hold +/-
depending on the U bit (like the pre/post inc instructions. e.g
t2LDRH_POST). This patch splits the AddrModeT2_i8 into AddrModeT2_i8,
AddrModeT2_i8pos and AddrModeT2_i8neg to make this clear.

This allows us to get the offset ranges of t2LDRHi8 correct in the
load/store optimizer, fixing issues where we could end up creating
instructions with positive offsets (which may then be encoded as ldrht).

Differential Revision: https://reviews.llvm.org/D114638
2021-12-02 17:10:26 +00:00
Kazu Hirata
387927bbaf [Target] Use range-based for loops (NFC) 2021-11-26 21:21:17 -08:00
Kazu Hirata
562356d6e3 [Target] Use range-based for loops (NFC) 2021-11-26 08:23:01 -08:00
Kazu Hirata
d45cb1d7ea [llvm] Use range-based for loops (NFC) 2021-11-23 08:54:48 -08:00
Kazu Hirata
84b07c9b3a [llvm] Use pop_back_val (NFC) 2021-09-19 13:44:23 -07:00
Kazu Hirata
c9fca53af1 [CodeGen, Target] Use pred_empty and succ_empty (NFC) 2021-09-10 11:11:31 -07:00
David Green
9cb8f4d1ad [ARM] Add a tail-predication loop predicate register
The semantics of tail predication loops means that the value of LR as an
instruction is executed determines the predicate. In other words:

mov r3, #3
DLSTP lr, r3        // Start tail predication, lr==3
VADD.s32 q0, q1, q2 // Lanes 0,1 and 2 are updated in q0.
mov lr, #1
VADD.s32 q0, q1, q2 // Only first lane is updated.

This means that the value of lr cannot be spilled and re-used in tail
predication regions without potentially altering the behaviour of the
program. More lanes than required could be stored, for example, and in
the case of a gather those lanes might not have been setup, leading to
alignment exceptions.

This patch adds a new lr predicate operand to MVE instructions in order
to keep a reference to the lr that they use as a tail predicate. It will
usually hold the zeroreg meaning not predicated, being set to the LR phi
value in the MVETPAndVPTOptimisationsPass. This will prevent it from
being spilled anywhere that it needs to be used.

A lot of tests needed updating.

Differential Revision: https://reviews.llvm.org/D107638
2021-09-02 13:42:58 +01:00
David Green
c140ff493e [ARM] Change a couple of instances of LiveRegs.contains to !LiveRegs.available
This changes a couple of calls to LiveRegs.contains to
!LiveRegs.available, one in Thumb1FrameLoweringInfo (which modifies a
test to look more correct to me, given r7 should be the frame pointer so
is not available), and another in the ARMLoadStoreOptimizer, that I
don't have a test for, it was just found by inspection.

Differential Revision: https://reviews.llvm.org/D107454
2021-08-10 09:53:26 +01:00
David Green
010f8e3057 [ARM] Ensure correct regclass in distributing postinc
The register class required for some MVE loads/stores is more
constrained than the register we use when creating postinc. Make sure we
constrain the register class to keep the code correct.
2021-07-26 14:26:38 +01:00
David Green
03892a27d6 [ARM] Expand the range of allowed post-incs in load/store optimizer
Currently the load/store optimizer will only fold in increments of the
same size as the load/store. This patch expands that to any legal
immediate for the post-inc instruction.

This is a recommit of 3b34b06fc5908b with correctness fixes and extra
tests.

Differential Revision: https://reviews.llvm.org/D95885
2021-02-24 08:46:15 +00:00