153 Commits

Author SHA1 Message Date
Kazu Hirata
599005686a
[llvm] Use *Set::insert_range (NFC) (#132325)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
2025-03-20 22:24:06 -07:00
Alex Bradbury
dffbc030e7
[MachineCopyPropagation] Recognise and delete no-op moves produced after forwarded uses (#129889)
This change removes 189 static instances of no-op reg-reg moves (i.e.
where src == dest) across llvm-test-suite when compiled for RISC-V
rv64gc and with SPEC included.
2025-03-10 13:22:59 +00:00
Jinsong Ji
5d2e2847e0
MachineCopyPropagation: Do not remove copies preserved by regmask (#125868)
llvm/llvm-project@9e436c2daa tries to handle register masks and
sub-registers, it avoids clobbering RegUnit presreved by regmask. But it
then introduces invalid pointer issues.

We delete the copies without invalidate all the use in the CopyInfo, so
we dereferenced invalid pointers in next interation, causing asserts.

Fixes: #126107

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-02-10 12:26:33 -05:00
Kazu Hirata
1c497c4837
[CodeGen] Avoid repeated hash lookups (NFC) (#126343) 2025-02-08 00:48:01 -08:00
Akshat Oke
8fdd982668
[NewPM] MachineCopyPropagation: Remove dead ID (#125665)
Fix for #125202 (4313345f2eeeb1e2ea7127a056ec4e1aaaa7fefb)
2025-02-04 16:04:14 +05:30
Akshat Oke
4313345f2e
[CodeGen][NewPM] Port MachineCopyPropagation to NPM (#125202) 2025-02-04 15:45:03 +05:30
Kazu Hirata
bc1e699d9f
[CodeGen] Avoid repeated hash lookups (NFC) (#123557) 2025-01-20 10:13:08 -08:00
Oliver Stannard
9e436c2daa
[MachineCP] Correctly handle register masks and sub-registers (#122734)
When passing an instruction with a register mask, the machine copy
propagation pass was dropping the information about some copy
instructions which define a register which is preserved by the mask,
because that register overlaps a register which is partially clobbered
by it. This resulted in a miscompilation for AArch64, because this
caused a live copy to be considered dead.

The fix is to clobber register masks by finding the set of reg units
which is preserved by the mask, and clobbering all units not in that
set.

This is based on #122472, and fixes the compile time performance
regressions which were caused by that.
2025-01-16 09:39:27 +00:00
Nikita Popov
d6f7f2a5fa Revert "[MachineCP] Correctly handle register masks and sub-registers (#122472)"
This reverts commit e2a071ece58790f8dd4886e998033cab82e906fb.

This causes a large compile-time regression.
2025-01-13 14:33:35 +01:00
Oliver Stannard
e2a071ece5
[MachineCP] Correctly handle register masks and sub-registers (#122472)
When passing an instruction with a register mask, the machine copy
propagation pass was dropping the information about some copy
instructions which define a register which is preserved by the mask,
because that register overlaps a register which is partially clobbered
by it. This resulted in a miscompilation for AArch64, because this
caused a live copy to be considered dead.

The fix is to clobber register masks by finding the set of reg units
which is preserved by the mask, and clobbering all units not in that
set.
2025-01-13 09:55:08 +00:00
Vladimir Radosavljevic
401d123a1f
[MCP] Optimize copies when src is used during backward propagation (#111130)
Before this patch, redundant COPY couldn't be removed for the following
case:
```
  $R0 = OP ...
  ... // Read of %R0
  $R1 = COPY killed $R0
```
This patch adds support for tracking the users of the source register
during backward propagation, so that we can remove the redundant COPY in
the above case and optimize it to:
```
  $R1 = OP ...
  ... // Replace all uses of %R0 with $R1
```
2024-10-23 13:37:02 +02:00
Vladimir Radosavljevic
dabb0ddbd7
[MCP] Skip invalidating def constant regs during forward propagation (#111129)
Before this patch, redundant COPY couldn't be removed for the following
case:
```
  %reg1 = COPY %const-reg
  ... // There is a def of %const-reg
  %reg2 = COPY killed %reg1
```
where this can be optimized to:
```
  ... // There is a def of %const-reg
  %reg2 = COPY %const-reg
```

This patch allows for such optimization by not invalidating defined
constant registers. This is safe, as architectures like AArch64 and
RISCV replace a dead definition of a GPR with a zero constant register
for certain instructions.
2024-10-10 18:05:42 +04:00
Kazu Hirata
b9d85b1263
[CodeGen] Use DenseMap::operator[] (NFC) (#108489)
Once we modernize CopyInfo with default member initializations,

  Copies.insert({Unit, ...})

becomes equivalent to:

  Copies.try_emplace(Unit)

which we can simplify further down to Copies[Unit].
2024-09-13 10:04:33 -07:00
Piyou Chen
2def1c4458
[RISCV][MCP] Remove redundant move from tail duplication (#89865)
Tail duplication will generate the redundant move before return. It is
because the MachineCopyPropogation can't recognize COPY after post-RA
pseudoExpand.

This patch make MachineCopyPropogation recognize `%0 = ADDI %1, 0` as
COPY
2024-08-28 08:32:54 +08:00
Kai Luo
e274d5f6ac
[MCP] Use MCRegUnit as the key type of CopyTracker::Copies map. NFC. (#98277)
`CopyTracker` is in fact tracking at RegUnit level, not MCRegister.
2024-07-11 08:43:13 +08:00
David Green
b5db2e1969 [MCP] Remove unused TII argument. NFC
Last used in e35fbf5c04f4719db8ff7c7a993cbf96bb706903.
2024-05-30 15:01:02 +01:00
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and  `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI  parameters, as shown in issue #82411.

Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`,  `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
Craig Topper
23d45e55ed
[MCP] Remove dead copies from basic blocks with successors. (#86973)
Previously we wouldn't remove dead copies from basic blocks with
successors. The comment said we didn't want to trust the live-in lists.
The comment is very old so I'm not sure if that's still a concern today.

This patch checks the live-in lists and removes copies from
MaybeDeadCopies if they are referenced by any live-ins in any
successors. We only do this if the tracksLiveness property is set. If
that property is not set, we retain the old behavior.
2024-03-28 14:43:49 -07:00
Craig Topper
f90813543b
[MCP] Use MachineInstr::all_defs instead of MachineInstr::defs in hasOverlappingMultipleDef. (#86889)
defs does not return the defs for inline assembly. We need to use
all_defs to find them.

Fixes #86880.
2024-03-28 08:37:19 -07:00
Kazu Hirata
8ed1291d96
[MachineCopyPropagation] Make a SmallVector larger (NFC) (#79106)
This patch makes a SmallVector slightly larger.  We encounter quite a
few instructions with 3 or 4 defs but very few beyond that on X86.

This saves 0.39% of heap allocations during the compilation of a large
preprocessed file, namely X86ISelLowering.cpp, for the X86 target.
2024-01-23 09:27:18 -08:00
Vettel
dc1fadef23
[MCP] Enhance MCP copy Instruction removal for special case(reapply) (#74239)
Machine Copy Propagation Pass may lose some opportunities to further
remove the redundant copy instructions during the ForwardCopyPropagateBlock
procedure. When we Clobber a "Def" register, we also need to remove the record 
from the copy maps that indicates "Src" defined "Def" to ensure the correct semantics
of the ClobberRegister function.  This patch reapplies #70778 and addresses the corner 
case bug  #73512 specific to the AMDGPU backend. Additionally, it refines the criteria 
for removing empty records from the copy maps, thereby enhancing overall safety.

For more information, please see the C++ test case generated code in 
"vector.body" after the MCP Pass: https://gcc.godbolt.org/z/nK4oMaWv5.
2023-12-26 16:22:42 +08:00
Bjorn Pettersson
30afb21547 Revert "[MCP] Enhance MCP copy Instruction removal for special case (#70778)"
This reverts commit cae46f6210293ba4d3568eb21b935d438934290d.

Reverted due to miscompiles.
See https://github.com/llvm/llvm-project/issues/73512
2023-11-27 19:39:40 +01:00
Vettel
cae46f6210
[MCP] Enhance MCP copy Instruction removal for special case (#70778)
Machine Copy Propagation Pass may lose some opportunities to further
remove the redundant copy instructions during the ForwardCopyPropagateBlock
procedure. When we Clobber a "Def" register, we also need to remove the record 
from the copy maps that indicates "Src" defined "Def" to ensure the correct semantics
of the ClobberRegister function.

For more information, please see the C++ test case generated code in 
"vector.body" after the MCP Pass: https://gcc.godbolt.org/z/nK4oMaWv5.
2023-11-22 23:57:42 +08:00
Kazu Hirata
ce8c22856e Use llvm::drop_begin and llvm::drop_end (NFC) 2023-09-22 17:29:10 -07:00
Jeffrey Byrnes
f76ffc1f40 [MCP] Invalidate copy for super register in copy source
We must also track the super sources of a copy, otherwise we introduce a sort of subtle bug.

Consider:

1.  DEF r0:r1
2.  USE r1
3.  r6:r9 = COPY r10:r13
4.  r14:15 = COPY r0:r1
5.  USE r6
6.. r1:4 = COPY r6:9

BackwardCopyPropagateBlock processes the instructions from bottom up. After processing 6., we will have propagatable copy for r1-r4 and r6-r9. After 5., we invalidate and erase the propagatble copy for r1-r4 and r6 but not for r7-r9.

The issue is that when processing 3., data structures still say we have valid copies for dest regs r7-r9 (from 6.). The corresponding defs for these registers in 6. are r1:r4, which we mark as registers to invalidate. When invalidating, we find the copy that corresponds to r1 is 4. (this was added when processing 4.), and we say that r1 now maps to unpropagatable copies. Thus, when we process 2., we do not have a valid copy, but when we process 1. we do -- because the mapped copy for subregister r0 was never invalidated.

The net result is to propagate the copy from 4. to 1., and replace DEF r0:r1 with DEF r14:r15. Then, we have a use before def in 2.

The main issue is that we have an inconsitent state between which def regs and which src regs are valid. When processing 5., we mark all the defs in 6. as invalid, but only the subreg use as invalid. Either we must only invalidate the individual subreg for both uses and defs, or the super register for both.

Differential Revision: https://reviews.llvm.org//D157564

Change-Id: I99d5e0b1a0d735e8ea3bd7d137b6464690aa9486
2023-08-11 09:01:18 -07:00
pvanhout
c59f9eada1 [MCP] Optimize copies from undef
Revert D152502 and instead optimize away copy from undefs, but clear the undef flag on the original copy.
Apparently, not optimizing the COPY can cause performance issues in some cases.

Fixes SWDEV-405813, SWDEV-405899

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153838
2023-06-29 15:12:27 +02:00
Jay Foad
da7892f729 [MC] Use regunits instead of MCRegUnitIterator. NFC.
Differential Revision: https://reviews.llvm.org/D153122
2023-06-16 12:21:32 +01:00
Sergei Barannikov
aa2d0fbc30 [MC] Add MCRegisterInfo::regunits for iteration over register units
Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D152098
2023-06-16 05:39:50 +03:00
pvanhout
df1782c2a2 [MCP] Do not remove redundant copy for COPY from undef
I don't think we can safely remove the second COPY as redundant in such cases.
The first COPY (which has undef src) may be lowered to a KILL instruction instead, resulting in no COPY being emitted at all.

Testcase is X86 so it's in the same place as other testcases for this function, but this was initially spotted on AMDGPU with the following:
```
 renamable $vgpr24 = PRED_COPY undef renamable $vgpr25, implicit $exec
 renamable $vgpr24 = PRED_COPY killed renamable $vgpr25, implicit $exec
```
The second COPY waas removed as redundant, and the first one was lowered to a KILL (= removed too), causing $vgpr24 to not have $vgpr25's value.

Fixes SWDEV-401507

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D152502
2023-06-09 14:23:57 +02:00
Akshay Khadse
43b38696aa Fix uninitialized class members
Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148692
2023-04-20 11:18:34 +08:00
Akshay Khadse
8bf7f86d79 Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148303
2023-04-17 16:32:46 +08:00
Sergei Barannikov
1f5e9a3502 [MCP] Do not try forward non-existent sub-register of a copy
In this example:
```
$d14 = COPY killed $d18
$s0 = MI $s28
```

$s28 is a sub-register of $d14. However, $d18 does not have
sub-registers and thus cannot be forwarded. Previously, this resulted
in $noreg being substituted in place of the use of $s28, which later
led to an assertion failure.

Fixes https://github.com/llvm/llvm-project/issues/60908, a regression
that was introduced in D141747.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D146930
2023-03-30 06:11:00 +03:00
Craig Topper
e35fbf5c04 [MachineCopyPropagation] Pass DestSourcePair to isBackwardPropagatableCopy. NFC
Instead of calling isCopyInstr again, just pass the DestSourcePair
from the isCopyInstr call from the caller.
2023-03-25 17:20:52 -07:00
Kai Luo
96aaebd12e [MachineCopyPropagation] Eliminate spillage copies that might be caused by eviction chain
Remove spill-reload like copy chains. For example
```
r0 = COPY r1
r1 = COPY r2
r2 = COPY r3
r3 = COPY r4
<def-use r4>
r4 = COPY r3
r3 = COPY r2
r2 = COPY r1
r1 = COPY r0
```
will be folded into
```
r0 = COPY r1
r1 = COPY r4
<def-use r4>
r4 = COPY r1
r1 = COPY r0
```

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D122118
2023-02-08 03:34:25 +00:00
Owen Anderson
1a0ec9140c Resolve a FIXME in MachineCopyPropagation by allowig propagation to subregister uses.
Reviewed By: barannikov88

Differential Revision: https://reviews.llvm.org/D141747
2023-01-25 23:11:46 -06:00
Fangrui Song
b0df70403d [Target] llvm::Optional => std::optional
The updated functions are mostly internal with a few exceptions (virtual functions in
TargetInstrInfo.h, TargetRegisterInfo.h).
To minimize changes to LLVMCodeGen, GlobalISel files are skipped.

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 22:43:14 +00:00
Kazu Hirata
998960ee1f [CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:08 -08:00
Han-Kuan Chen
e29133629b [MachineCopyPropagation][RISCV] Fix D125335 accidentally change control flow.
D125335 makes regsOverlap skip following control flow, which is not entended
in the original code.

Differential Revision: https://reviews.llvm.org/D128039
2022-06-17 21:40:08 -07:00
Adrian Tong
7c13ae6490 Give option to use isCopyInstr to determine which MI is
treated as Copy instruction in MCP.

This is then used in AArch64 to remove copy instructions after taildup
ran in machine block placement

Differential Revision: https://reviews.llvm.org/D125335
2022-05-26 18:43:16 +00:00
Jay Foad
1bb3a9c642 [MachineCopyPropagation] More robust isForwardableRegClassCopy
Change the implementation of isForwardableRegClassCopy so that it
does not rely on getMinimalPhysRegClass. Instead, iterate over all
classes looking for any that satisfy a required property.

NFCI on current upstream targets, but this copes better with
downstream AMDGPU changes where some new smaller classes have been
introduced, which was breaking regclass equality tests in the old
code like:
    if (UseDstRC != CrossCopyRC && CopyDstRC == CrossCopyRC)

Differential Revision: https://reviews.llvm.org/D121903
2022-03-21 16:41:01 +00:00
serge-sans-paille
989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Nico Weber
a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille
7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Kazu Hirata
ca2f53897a [CodeGen] Use range-based for loops (NFC) 2021-12-04 08:48:05 -08:00
Kazu Hirata
1a605f395f [CodeGen] Use make_early_inc_range (NFC) 2021-10-31 07:57:36 -07:00
Carl Ritson
b5d6ad20e1 [MachineCopyPropagation] Handle propagation of undef copies
When propagating undefined copies the undef flag must also be
propagated.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D111219
2021-10-07 20:34:27 +09:00
Vang Thao
549f6a819a [MachineCopyPropagation] Check CrossCopyRegClass for cross-class copys
On some AMDGPU subtargets, copying to and from AGPR registers using another
AGPR register is not possible. A intermediate VGPR register is needed for AGPR
to AGPR copy. This is an issue when machine copy propagation forwards a
COPY $agpr, replacing a COPY $vgpr which results in $agpr = COPY $agpr. It is
removing a cross class copy that may have been optimized by previous passes and
potentially creating an unoptimized cross class copy later on.

To avoid this issue, check CrossCopyRegClass if a different register class will
be needed for the copy. If so then avoid forwarding the copy when the
destination does not match the desired register class and if the original copy
already matches the desired register class.

Issue seen while attempting to optimize another AGPR to AGPR issue:

Live-ins: $agpr0
$vgpr0 = COPY $agpr0
$agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0
$agpr2 = COPY $vgpr0
$agpr3 = COPY $vgpr0
$agpr4 = COPY $vgpr0

After machine-cp:

$vgpr0 = COPY $agpr0
$agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0
$agpr2 = COPY $agpr0
$agpr3 = COPY $agpr0
$agpr4 = COPY $agpr0

Machine-cp propagated COPY $agpr0 to replace $vgpr0 creating 3 AGPR to AGPR
copys. Later this creates a cross-register copy from AGPR->VGPR->AGPR for each
copy when the prior VGPR->AGPR copy was already optimal.

Reviewed By: lkail, rampitec

Differential Revision: https://reviews.llvm.org/D108011
2021-08-24 21:22:36 -07:00
Alexandru Octavian Butiu
e90c6f5596 [MachineCopyPropagation] Fix differences in code gen when compiling with -g
Fixes bugs [[ https://bugs.llvm.org/show_bug.cgi?id=50580 | 50580 ]] and [[ https://bugs.llvm.org/show_bug.cgi?id=49446 | 49446  ]]

When compiling with -g "DBG_VALUE <reg>"  instructions are added in the MIR, if such a instruction is inserted between instructions that use <reg> then MachineCopyPropagation invalidates <reg> , this causes some copies  to not be propagated and causes differences in code generation (ex bugs 50580 and 49446 ).  DBG_VALUE instructions should be ignored  since they don't actually modify the register.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D104394
2021-07-02 19:27:06 +08:00
Stephen Tozer
fdb055f4f1 Reapply "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST"
Previous crashes caused by this patch were the result of machine
subregisters being incorrectly handled in updateDbgUsersToReg; this has
been fixed by using RegUnits to determine overlapping registers, instead
of using the register values directly.

Differential Revision: https://reviews.llvm.org/D101523

This reverts commit 7ca26c5fa2df253878cab22e1e2f0d6f1b481218.
2021-05-12 10:19:57 +01:00
Arthur Eubanks
7ca26c5fa2 Revert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST"
This reverts commit 0791f968fee259e5c34523167bd58179b8b081c2.

Causing crashes: https://crbug.com/1206764
2021-05-07 12:05:16 -07:00