Reversed copy is the following chain from FromReg to ToReg:
%Tmp1 = copy %Tmp2;
%FromReg = copy %Tmp1;
%ToReg = add %FromReg ...
%Tmp2 = copy %ToReg;
TwoAddressInstruction pass tries to preserve or create (by commuting)
such chains. Otherwise, there will be unavoidable copy.
This patch considers operations with tied operand a copy-like ones (in
terms of reversed copy chain), e.g.:
%Tmp1 = copy %Tmp2;
%Tmp1 = fma %Tmp1 (tied-def 0), ...
%FromReg = copy %Tmp1;
%ToReg = add %FromReg ...
%Tmp2 = copy %ToReg;
This helps at least for RISC-V RVV where two forms of FMA exists:
1. vfmadd.vv vd, vs1, vs2, where vd is a dst and src multiplicand: vd[i]
= +(vs1[i] * vd[i]) + vs2[i]
2. vfmacc.vv vd, vs1, vs2, where vd is a dst and src addend: vd[i] =
+(vs1[i] * vs2[i]) + vd[i]
For the example above, vfmacc is a more preferable form of FMA than
vfmadd. However, without this patch twoaddress instruction pass does not
see the difference between these two instructions and might commute
vfmacc to vfmadd for the example above. This will lead to the extra
copy.
The patch also changes the default value of dataflow-edge-limit. This
option controls the depth of reverved copy search algorithm. With the
introduction of tied operands analysis I consider such a change
reasonable since the old default value = 3 effectively stops the
analysis earlier than any chain is found.
Currently, the compiler doesn't create a COPY for undef operands while
lowering REG_SEQUENCE, and only if LIS information is available, it
propagates the undef flag to the subreg uses. So, if LIS isn't
available, we can end up with some uses without def of those lanes.
Now, we check which lanes are used in a single scan of
use_nodbg_operands() per REG_SEQ, and perform the skip of the COPY only
if LIS is avaible (as undef will be propagated later) or if there are no
uses for that lane.
There is still a scan of the use list, but now it's only one per REG_SEQ
and I think it's necessary, as there is no guarantee to have LIS or
other analysis pass information at this stage.
This is a proposal fix for issue:
https://github.com/llvm/llvm-project/issues/175596
---------
Co-authored-by: Carl Ritson <critson@perlfu.co.uk>
This Change makes `RegState` into an enum class, with bitwise operators.
It also:
- Updates declarations of flag variables/arguments/returns from
`unsigned` to `RegState`.
- Updates empty RegState initializers from 0 to `{}`.
If this is causing problems in downstream code:
- Adopt the `RegState getXXXRegState(bool)` functions instead of using a
ternary operator such as `bool ? RegState::XXX : 0`.
- Adopt the `bool hasRegState(RegState, RegState)` function instead of
using a bitwise check of the flags.
- Remove pass initialization calls from pass constructors.
- For some passes, add the initialization to `initializeCodeGen` or
`initializeGlobalISel`.
- Remove redundant initializations from llc and X86 target for some
passes.
In the New PM, `SlotIndexesAnalysis` should only be preserved when
`LiveIntervals` was cached and available, as `SlotIndexes` are only
maintained when `LiveIntervals` analysis is available.
This fixes potential stale `SlotIndexes` issues when running with NPM
where `LiveIntervals` analysis wasn't requested by prior passes.
When `eliminateRegSequence()` is called, the pass modifies the
`MachineFunction` but `MadeChange` was not being set to true.
This causes the pass to incorrectly return `PreservedAnalyses::all()`
even though changes were made.
This is in preparation for future changes in AMDGPU that will make more
substantial use of bundles pre-RA. For now, simply test this with
degenerate (single-instruction) bundles.
If the instruction with tied operands is a BUNDLE instruction and we
handle it by replacing an operand, then we need to update the
corresponding internal operands as well. Otherwise, the resulting MIR is
invalid.
The test case is degenerate in the sense that the bundle only contains a
single instruction, but it is sufficient to exercise this issue.
- Add `LiveIntervalsAnalysis`.
- Add `LiveIntervalsPrinterPass`.
- Use `LiveIntervalsWrapperPass` in legacy pass manager.
- Use `std::unique_ptr` instead of raw pointer for `LICalc`, so
destructor and default move constructor can handle it correctly.
This would be the last analysis required by `PHIElimination`.
Fixes#82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.
Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
If part of a register (lowered from REG_SEQUENCE) is undefined then we
should propagate undef flags to uses of those lanes. This is only
performed when live intervals are present as it requires live intervals
to correctly match uses to defs, and the primary goal is to allow
precise computation of subrange intervals.
Force live interval recomputation for a register if its definition is
narrowed to become partial. The live interval repair process cannot
otherwise detect these changes.
C++20 comes with std::erase to erase a value from std::vector. This
patch renames llvm::erase_value to llvm::erase for consistency with
C++20.
We could make llvm::erase more similar to std::erase by having it
return the number of elements removed, but I'm not doing that for now
because nobody seems to care about that in our code base.
Since there are only 50 occurrences of erase_value in our code base,
this patch replaces all of them with llvm::erase and deprecates
llvm::erase_value.
Teach the LiveIntervals path in isPlainlyKilled to handle physical
registers, to get equivalent functionality with the LiveVariables path.
Test this by adding -early-live-intervals RUN lines to a handful of
tests that would fail without this.
Calling isPlainlyKilled instead of directly checking for a kill flag
should make processTiedPairs behave the same with LiveIntervals
(i.e. when compiling with -early-live-intervals) as it does with
LiveVariables.
Update LiveIntervals after rewriting:
%reg = INSERT_SUBREG undef %reg, %subreg, subidx
to:
undef %reg:subidx = COPY %subreg
D113044 implemented this for the non-undef case.
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148303
Define and use a MachineOperand overload of isPlainlyKilled. This
improves codegen in a couple of tests because it catches the case where
MO does not kill Reg but another operand of the same instruction does.
Differential Revision: https://reviews.llvm.org/D147167
This is a helper function to very slightly simplify many calls to
MachineInstruction::getOperandNo.
Differential Revision: https://reviews.llvm.org/D143250
This transformation could've triggered a verifier assert if RegA and RegB
were of different reg classes. Fix this by constraining as the comment
for replaceRegWith suggests.
Differential Revision: https://reviews.llvm.org/D140672
D129213 improves verification of LiveVariables, and caused
CodeGen/X86/statepoint-cmp-sunk-past-statepoint.ll to fail with:
*** Bad machine code: LiveVariables: Block should not be in AliveBlocks ***
after Two-Address instruction pass.
Fix it by clearing AliveBlocks for a register which is no longer used.
Differential Revision: https://reviews.llvm.org/D136445
CodeGenPrepare pass can sink pointer comparison across statepoint
to the point of use (see comment in IR/SafepointIRVerifier.cpp)
Due to specifics of statepoints, it is still legal to have tied
def and use rewritten to the same register in TwoAddress pass.
However, properly updating LiveIntervals and LiveVariables becomes
complicated. For simplicity, let's fall back to generic handling of
tied registers when we detect such case.
TODO: This fixes functional (assertion) failure. Ideally we should
try to recompute new live range/liveness in place.
Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D132255
D124631 added special processing for STATEPOINT instructions.
It appears that assertion added there is too strong. We can get two
tied operands with the same register tied to different defs. If we
hit such case, do not process it in statepoint-specific code and
delegate it to common case.
STATEPOINT is a special pseudo instruction which represent Moving GC semantic to LLVM.
Every tied def/use VReg pair in STATEPOINT represent same physical register which can
'magically' change during call wrapped by statepoint.
(By construction, tied use operand is not live across STATEPOINT).
This means that when converting into two-address form, there is not need to insert COPY
instruction before stateppoint, what TwoAddressInstruction pass does for 'regular'
instructions.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D124631
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
Currently we create register mappings for registers used only once in current
MBB. For registers with multiple uses, when all the uses are in the current MBB,
we can also create mappings for them similarly according to the last use.
For example
%reg101 = ...
= ... reg101
%reg103 = ADD %reg101, %reg102
We can create mapping between %reg101 and %reg103.
Differential Revision: https://reviews.llvm.org/D113193