395 Commits

Author SHA1 Message Date
Kazu Hirata
07eb7b7692
[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
2025-08-18 07:01:29 -07:00
David Green
42e7796920 [ARM] Add a comment about fixupImmediateBr updaing ImmBranches. NFC
To prevent people from modernizing the loop, add a comment that
fixupImmediateBr can append to ImmBranches.
2025-07-01 15:01:25 +01:00
Qinkun Bao
8943036ec3 Fix UAF in ARMConstantIslandPass.
Revoke the change in https://github.com/llvm/llvm-project/pull/146198
2025-06-29 19:45:38 +00:00
Qinkun Bao
2248cdfa74
[Arm] Fix UAF in ARMConstantIslandPass (#146232)
https://github.com/llvm/llvm-project/pull/146198 changes
```
    for (unsigned i = 0, e = ImmBranches.size(); i != e; ++i)
      BRChange |= fixupImmediateBr(ImmBranches[i]);
```
to
```
    for (ImmBranch &Br : ImmBranches)
      BRChange |= fixupImmediateBr(Br);
```
Unfortunately, they are not NFC and cause the buildbot error. e.g.,
https://lab.llvm.org/buildbot/#/builders/24/builds/9943
https://lab.llvm.org/buildbot/#/builders/169/builds/12570
Use make_early_inc_range to fix the issue
2025-06-29 00:20:29 -04:00
Kazu Hirata
094a7087b8
[Target] Use range-based for loops (NFC) (#146198) 2025-06-27 22:07:58 -07:00
Simon Tatham
1d5bf04030
[ARM] Remove unused class member in ARMConstantIslandPass (#141093)
The map variable `BlockJumpTableRefCount` was added in commit
f5f28d5b0ce76af8f6944774aa73bad9e328b020 to track whether a basic block
was the target of any jump table entries. This was used in the function
`fixupBTI` to insert and remove BTIs after jump tables had been
modified.

Commit 3b742242a53ed0c2a2e1b6bb2352cace43c22030 removed `fixupBTI` on
the grounds that the work was now being done elsewhere. That left
`BlockJumpTableRefCount` still being created, but now nothing is using
it. So we can garbage-collect that variable and all the code that
populates it.
2025-05-27 08:34:50 +01:00
Rahul Joshi
52c2e45c11
[NFC][CodeGen] Adopt MachineFunctionProperties convenience accessors (#141101) 2025-05-23 08:30:29 -07:00
Oliver Hunt
76ba29bfd8
[NFC] Address bit-field storage sizes to ensure ideal packing (#139825)
The MS bit-field packing ABI depends on the storage size of the type of
being placed in the bit-field. This PR addresses a number of cases in
llvm where the storage type has lead to suboptimal packing.
2025-05-16 00:02:58 -07:00
Pengxuan Zheng
36acaa0be5 Revert "[ARM][ConstantIslands] Correct MinNoSplitDisp calculation (#114590)"
This reverts commit e48916f615e0ad2b994b2b785d4fe1b8a98bc322.
2025-04-10 13:56:52 -07:00
pzhengqc
e48916f615
[ARM][ConstantIslands] Correct MinNoSplitDisp calculation (#114590)
MinNoSplitDisp was first introduced in D16890 to handle cases where the
ConstantIslands pass fails to converge in the presence of big basic
blocks. However, the computation of the variable seems to be wrong as it
currently computes the offset immediately following UserBB. In other
words, it represents the distance from the beginning of the function to
the end of UserBB. The distance from the beginning of the function does
not seem to be a good indicator of how big the basic block is unless the
basic block is close to the beginning of the function. I think
MinNoSplitDisp should compute the distance between UserOffset and the
end of UserBB instead.
2024-12-14 10:14:50 -08:00
Kazu Hirata
9571cc2b28
[ARM] Remove unused includes (NFC) (#115995)
Identified with misc-include-cleaner.
2024-11-12 23:15:21 -08:00
Alexis Engelke
d871b2e0d0
[CodeGen] Use optimized domtree for MachineFunction (#102107)
The dominator tree gained an optimization to use block numbers instead
of a DenseMap to store blocks. Given that machine basic blocks already
have numbers, expose these via appropriate GraphTraits. For debugging,
block number epochs are added to MachineFunction -- this greatly helps
in finding uses of block numbers after RenumberBlocks().

In a few cases where dominator trees are preserved across renumberings,
the dominator tree is updated to use the new numbers.
2024-08-06 13:46:19 +02:00
Kazu Hirata
515618e245 Revert "[Target] Use range-based for loops (NFC) (#98844)"
This reverts commit 3614f65a7ba9d925010e3316a1d93bcebc632178.

fixupImmediateBr seems to resize ImmBranches.
2024-07-15 20:39:49 -07:00
Kazu Hirata
3614f65a7b
[Target] Use range-based for loops (NFC) (#98844) 2024-07-15 17:23:11 -07:00
Kazu Hirata
5e22a53698
[Target] Use range-based for loops (NFC) (#98705) 2024-07-13 17:40:51 -07:00
Kazu Hirata
ddaa93b095
[llvm] Use std::make_unique (NFC) (#97165)
This patch is based on clang-tidy's modernize-make-unique but limited
to those cases where type names are mentioned twice like
std::unique_ptr<Type>(new Type()), which is a bit mouthful.
2024-06-29 11:50:41 -07:00
paperchalice
837dc542b1
[CodeGen][NewPM] Split MachineDominatorTree into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
2024-06-11 21:27:14 +08:00
Nikita Popov
9378d74e56 Revert "[ARM][NFC] Use addLiveIns method instead of manually adding live-ins (#87560)"
This reverts commit 6e14583c53c8b1950e502a7fa282d7e00ad2df4a.

PR merged without review.
2024-05-27 08:30:35 +02:00
AtariDreams
6e14583c53
[ARM][NFC] Use addLiveIns method instead of manually adding live-ins (#87560)
Do this instead of reimplementing addLiveIns which does the exact same thing.
2024-05-26 19:29:29 -04:00
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and  `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI  parameters, as shown in issue #82411.

Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`,  `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
John Brawn
4fb0e0114f [ARM] Generate out-of-line jump tables for XO without 32-bit branch
When we only have a 16-bit pc-relative branch instruction we generate
a table of address for a jump table. Currently this is placed inline,
but this won't work with execute-only memory. In this case generate
the jump table out-of-line.

Differential Revision: https://reviews.llvm.org/D153774
2023-06-28 13:30:39 +01:00
Kazu Hirata
4241d890ae [Target] Use range-based for loops (NFC) 2023-04-15 14:14:56 -07:00
Jirui Wu
3b742242a5 [ARM] Remove a redundant function fixupBTI
Since the redundant BTI instructions emitted by jump tables are now
removed in the ARMBranchTargets pass, the fixupBTI function is not needed
in the ARMConstantIslandPass. Some related tests are removed as well.

The relevant patch that removes the redundant BTI instructions:
https://reviews.llvm.org/D144470

Differential Revision: https://reviews.llvm.org/D145048
2023-03-01 15:01:38 +00:00
Jirui Wu
bb0403ae2e [ARM] Remove redundant BTI instructions for table jumps
A BTI instruction was previously inserted at the beginning of each block
that has its address stored in a jump table. Jump tables only emit
indirect jumps in ARM or Thumb1 modes. However, PACBTI is not supported
in these modes. As a result, BTI instructions emitted by jump tables are
redundant. Removing redundant BTI instructions improves the code size
and prevents potential gadgets.

Differential Revision: https://reviews.llvm.org/D144470
2023-02-24 10:32:30 +00:00
Tim Northover
c4ce967e34 ARM: skip debug instructions when matching jump-table patterns.
When working out whether we can see a compressible jump-table pattern during
ConstantIslands, we were stopping when we saw a debug instruction. Instead it's
better to keep iterating backwards to the first real instruction.

https://reviews.llvm.org/D142019
2023-02-10 12:27:59 +00:00
Tim Northover
6e520fcf45 Revert "ARM: skip debug instructions when matching jump-table patterns."
This reverts commit ce4fcea59e1d5829b4355b6401d7265be23f617a.

I committed it accidentally.
2023-01-26 13:26:10 +00:00
Tim Northover
ce4fcea59e ARM: skip debug instructions when matching jump-table patterns.
When working out whether we can see a compressible jump-table pattern during
ConstantIslands, we were stopping when we saw a debug instruction. Instead it's
better to keep iterating backwards to the first real instruction.
2023-01-26 13:00:36 +00:00
Vitaly Buka
6c52736e02 Revert "[llvm] Use range-based for loops (NFC)"
range-based loop should not be used here, as
fixupImmediateBr push_backs into the container.

http://lab.llvm.org/buildbot/#/builders/168
http://lab.llvm.org/buildbot/#/builders/74
http://lab.llvm.org/buildbot/#/builders/5
http://lab.llvm.org/buildbot/#/builders/239
http://lab.llvm.org/buildbot/#/builders/237
http://lab.llvm.org/buildbot/#/builders/236

This reverts commit fedc59734a44ef7b62c5f389b0cdffd02264b2a9.
2022-09-04 15:28:53 -07:00
Kazu Hirata
fedc59734a [llvm] Use range-based for loops (NFC) 2022-09-03 11:17:40 -07:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Zongwei Lan
ad73ce318e [Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
2022-05-26 11:22:41 -07:00
Kazu Hirata
c5cf7d910e [ARM] Use range-based for loops (NFC) 2021-12-20 23:06:47 -08:00
Kazu Hirata
de90490060 Revert "[ARM] Use range-based for loops (NFC)"
This reverts commit 93d79cac2ede436e1e3e91b5aff702914cdfbca7.

This patch seems to break
llvm/test/CodeGen/ARM/constant-islands-cfg.mir under asan.
2021-12-20 10:51:36 -08:00
Kazu Hirata
93d79cac2e [ARM] Use range-based for loops (NFC) 2021-12-20 00:04:53 -08:00
Ties Stuij
f5f28d5b0c [ARM] Implement BTI placement pass for PACBTI-M
This patch implements a new MachineFunction in the ARM backend for
placing BTI instructions. It is similar to the existing AArch64
aarch64-branch-targets pass.

BTI instructions are inserted into basic blocks that:
- Have their address taken
- Are the entry block of a function, if the function has external
  linkage or has its address taken
- Are mentioned in jump tables
- Are exception/cleanup landing pads

Each BTI instructions is placed in the beginning of a BB after the
so-called meta instructions (e.g. exception handler labels).

Each outlining candidate and the outlined function need to be in agreement about
whether BTI placement is enabled or not. If branch target enforcement is
disabled for a function, the outliner should not covertly enable it by emitting
a call to an outlined function, which begins with BTI.

The cost mode of the outliner is adjusted to account for the extra BTI
instructions in the outlined function.

The ARM Constant Islands pass will maintain the count of the jump tables, which
reference a block. A `BTI` instruction is removed from a block only if the
reference count reaches zero.

PAC instructions in entry blocks are replaced with PACBTI instructions (tests
for this case will be added in a later patch because the compiler currently does
not generate PAC instructions).

The ARM Constant Island pass is adjusted to handle BTI
instructions correctly.

Functions with static linkage that don't have their address taken can
still be called indirectly by linker-generated veneers and thus their
entry points need be marked with BTI or PACBTI.

The changes are tested using "LLVM IR -> assembly" tests, jump tables
also have a MIR test. Unfortunately it is not possible add MIR tests
for exception handling and computed gotos because of MIR parser
limitations.

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Mikhail Maltsev
- Momchil Velikov
- Ties Stuij

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D112426
2021-12-01 12:54:05 +00:00
Kazu Hirata
c73fc74ce0 [llvm] Use range-based for loops (NFC) 2021-11-28 10:04:54 -08:00
Kazu Hirata
d45cb1d7ea [llvm] Use range-based for loops (NFC) 2021-11-23 08:54:48 -08:00
David Green
bb2d23dcd4 [ARM] Improve detection of fallthough when aligning blocks
We align non-fallthrough branches under Cortex-M at O3 to lead to fewer
instruction fetches. This improves that for the block after a LE or
LETP. These blocks will still have terminating branches until the
LowOverheadLoops pass is run (as they are not handled by analyzeBranch,
the branch is not removed until later), so canFallThrough will return
false. These extra branches will eventually be removed, leaving a
fallthrough, so treat them as such and don't add unnecessary alignments.

Differential Revision: https://reviews.llvm.org/D107810
2021-09-27 11:21:21 +01:00
David Green
ab280cbaa3 [ARM] Ensure undef is propagated to CBZ/CBNZ flags
In some rare circumstances we can be using an undef register for a
compare. When folded into a CBZ/CBNZ the undef flags are lost, leading
to machine verifier problems. This propagates the existing flags to the
new instruction.
2021-03-03 08:02:58 +00:00
David Green
d6ba8ecb60 [ARM] Add handling of t2LDRSB/t2LDRSH in Constant Island Pass
These constant pool loads should be treated similarly to t2LDRB/t2LDRH,
acting on the same offset ranges. Add handling and a simple test.
2021-03-02 08:46:07 +00:00
Kazu Hirata
f890fd5f91 [llvm] Use llvm::is_sorted (NFC) 2021-01-27 23:25:39 -08:00
David Green
1454724215 [ARM] Align blocks that are not fallthough targets
If the previous block in a function does not fallthough, adding nop's to
align it will never be executed. This means we can freely (except for
codesize) align more branches. This happens in constantislandspass (as
it cannot happen later) and only happens at aggressive optimization
levels as it does increase codesize.

Differential Revision: https://reviews.llvm.org/D94394
2021-01-16 22:19:35 +00:00
Kazu Hirata
2082b10d10 [llvm] Use *::empty (NFC) 2021-01-16 09:40:55 -08:00
QingShan Zhang
2962f1149c [NFC] Add the getSizeInBytes() interface for MachineConstantPoolValue
Current implementation assumes that, each MachineConstantPoolValue takes
up sizeof(MachineConstantPoolValue::Ty) bytes. For PowerPC, we want to
lump all the constants with the same type as one MachineConstantPoolValue
to save the cost that calculate the TOC entry for each const. So, we need
to extend the MachineConstantPoolValue that break this assumption.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D89108
2021-01-05 03:22:45 +00:00
Kristof Beyls
320fd3314e [ARM] Implement harden-sls-retbr for Thumb mode
The only non-trivial consideration in this patch is that the formation
of TBB/TBH instructions, which is done in the constant island pass, does
not understand the speculation barriers inserted by the SLSHardening
pass. As such, when harden-sls-retbr is enabled for a function, the
formation of TBB/TBH instructions in the constant island pass is
disabled.

Differential Revision: https://reviews.llvm.org/D92396
2020-12-19 12:32:47 +00:00
Kristof Beyls
195f44278c [ARM] Implement harden-sls-retbr for ARM mode
Some processors may speculatively execute the instructions immediately
following indirect control flow, such as returns, indirect jumps and
indirect function calls.

To avoid a potential miss-speculatively executed gadget after these
instructions leaking secrets through side channels, this pass places a
speculation barrier immediately after every indirect control flow where
control flow doesn't return to the next instruction, such as returns and
indirect jumps, but not indirect function calls.

Hardening of indirect function calls will be done in a later,
independent patch.

This patch is implementing the same functionality as the AArch64 counter
part implemented in https://reviews.llvm.org/D81400.
For AArch64, returns and indirect jumps only occur on RET and BR
instructions and hence the function attribute to control the hardening
is called "harden-sls-retbr" there. On AArch32, there is a much wider
variety of instructions that can trigger an indirect unconditional
control flow change.  I've decided to stick with the name
"harden-sls-retbr" as introduced for the corresponding AArch64
mitigation.

This patch implements this for ARM mode. A future patch will extend this
to also support Thumb mode.

The inserted barriers are never on the correct, architectural execution
path, and therefore performance overhead of this is expected to be low.
To ensure these barriers are never on an architecturally executed path,
when the harden-sls-retbr function attribute is present, indirect
control flow is never conditionalized/predicated.

On targets that implement that Armv8.0-SB Speculation Barrier extension,
a single SB instruction is emitted that acts as a speculation barrier.
On other targets, a DSB SYS followed by a ISB is emitted to act as a
speculation barrier.

These speculation barriers are implemented as pseudo instructions to
avoid later passes to analyze them and potentially remove them.

The mitigation is off by default and can be enabled by the
harden-sls-retbr subtarget feature.

Differential Revision: https://reviews.llvm.org/D92395
2020-12-19 11:42:39 +00:00
Simon Wallis
4946802c5f [ARM] Fix so immediates and pc relative checks
Treating an SoImm offset as a multiple of 4 between -1020 and 1020
mis-handles the second of a pair of 16-bit constants where the offset is a multiple of 2 but not a multiple of 4,
leading to an LLVM ERROR: out of range pc-relative fixup value

For 32-bit and larger (64-bit) constants, continue to treat an SoImm offset as a multiple of 4 between -1020 and 1020.
For smaller (16-bit) constants, treat an SoImm offset as a multiple of 1 between -255 and 255.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86949
2020-09-14 08:52:59 +01:00
Simon Wallis
94e4e37d55 [Thumb] set code alignment for 16-bit load from constant pool
Summary:
[Thumb] set code alignment for 16-bit load from constant pool

LLVM miscompiles this code when compiling for a target with v8.2-A FP16 and the Thumb ISA at -O0:

extern void bar(__fp16 P5);

int main() {
  __fp16 P5 = 1.96875;
  bar(P5);
}

The code section containing main has 2 byte alignment.
It needs to have 4 byte alignment,
because the load literal instruction has an offset from the
load address with the low 2 bits zeroed.

I do not include a test case in this check-in.
llc and llvm-mc do not exhibit this bug. They do not set code section alignment
in the same manner as clang.

Reviewers: dnsampaio

Reviewed By: dnsampaio

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84169
2020-07-22 10:12:41 +01:00
James Y Knight
1978309db1 MachineBasicBlock::updateTerminator now requires an explicit layout successor.
Previously, it tried to infer the correct destination block from the
successor list, but this is a rather tricky propspect, given the
existence of successors that occur mid-block, such as invoke, and
potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in
particular would be problematic, because its successor blocks are not
distinct from "normal" successors, as EHPads are.)

Instead, require the caller to pass in the expected fallthrough
successor explicitly. In most callers, the correct block is
immediately clear. But, in MachineBlockPlacement, we do need to record
the original ordering, before starting to reorder blocks.

Unfortunately, the goal of decoupling the behavior of end-of-block
jumps from the successor list has not been fully accomplished in this
patch, as there is currently no other way to determine whether a block
is intended to fall-through, or end as unreachable. Further work is
needed there.

Differential Revision: https://reviews.llvm.org/D79605
2020-06-06 22:30:51 -04:00