This revert may not be strictly needed, but it avoids conflicts with #117307.
When #117307 was reverted without also reverting this commit, the
regenerated init-undef.mir became empty.
This reverts commit be15fd5085680cc5ed9ec4f4f2258b504cdd55db.
When enabling subreg liveness tracking for AArch64, this pass fails
because it tries to get the register class for the artificial subreg
`sub_32_hi` of a 64-bit GPR. It tries to create an INIT_UNDEF
instruction for the top 32 bits of the 64-bit GPR, which are not
directly addressable, so getSubRegisterClass() returns a nullptr,
crashing this pass.
It should instead just avoid trying to create the INIT_UNDEF
instruction.
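
The guard described above can be sketched with a small standalone model. This is only an illustration, not the actual patch: `ToyRegClass`, `ToySubRegIdx`, `toyGetSubRegisterClass` and `collectInitUndefClasses` are hypothetical names, and the lookup merely imitates a query that returns a null pointer for a sub-register index that is not directly addressable.

```cpp
#include <vector>

// Toy stand-ins for illustration only; these are not LLVM types.
struct ToyRegClass {
  const char *Name;
};

enum ToySubRegIdx { sub_32, sub_32_hi };

static const ToyRegClass GPR32{"GPR32"};

// Hypothetical lookup: the artificial high half has no register class,
// so the query yields nullptr for it.
const ToyRegClass *toyGetSubRegisterClass(ToySubRegIdx Idx) {
  return Idx == sub_32 ? &GPR32 : nullptr;
}

// Collect the classes for which an INIT_UNDEF-like pseudo would be created,
// skipping lanes without a class instead of dereferencing a null pointer.
std::vector<const ToyRegClass *>
collectInitUndefClasses(const std::vector<ToySubRegIdx> &UndefLanes) {
  std::vector<const ToyRegClass *> Classes;
  for (ToySubRegIdx Idx : UndefLanes) {
    const ToyRegClass *RC = toyGetSubRegisterClass(Idx);
    if (!RC)
      continue; // Not directly addressable: do not create the pseudo.
    Classes.push_back(RC);
  }
  return Classes;
}
```

In this model, a request covering both `sub_32_hi` and `sub_32` yields a single entry for `GPR32`; the unaddressable lane is skipped.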
InitUndef should also handle early-clobber / undef conflicts in inline
asm operands. Do this by iterating over all_defs() instead of defs().
The newly added ARM test was generating an "unpredictable STXP instruction,
status is also a source" error prior to this change.
Fixes https://github.com/llvm/llvm-project/issues/106380.
The InitUndef pass works around a register allocation issue, where undef
operands can be allocated to the same register as early-clobber result
operands. This may lead to ISA constraint violations, where certain
input and output registers are not allowed to overlap.
Originally this pass was implemented for RISCV, and then extended to ARM
in #77770. I've since removed the target-specific parts of the pass in
#106744 and #107885. This PR reduces the pass to use a single
requiresDisjointEarlyClobberAndUndef() target hook and enables it by
default. The hook is disabled for AMDGPU, because overlapping
early-clobber and undef operands are known to be safe for that target,
and we get significant codegen diffs otherwise.
The motivating case is the one in arm64-ldxr-stxr.ll, where we were
previously incorrectly allocating a stxp input and output to the same
register.
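
The constraint at stake can be made concrete with a small standalone check. This is a rough sketch under stated assumptions rather than code from the pass: `ToyOperand` and `hasEarlyClobberUndefOverlap` are hypothetical names, and physical registers are modelled as plain integers.

```cpp
#include <vector>

// Toy operand model for illustration only; this is not LLVM's MachineOperand.
struct ToyOperand {
  unsigned PhysReg;    // Register assigned by the allocator.
  bool IsDef;          // True for result operands.
  bool IsEarlyClobber; // Result is written before the inputs are read.
  bool IsUndef;        // Input whose value is undefined.
};

// Returns true if an undef input shares a register with an early-clobber
// result, which is the overlap that InitUndef is meant to prevent.
bool hasEarlyClobberUndefOverlap(const std::vector<ToyOperand> &Ops) {
  for (const ToyOperand &Def : Ops) {
    if (!Def.IsDef || !Def.IsEarlyClobber)
      continue;
    for (const ToyOperand &Use : Ops) {
      if (!Use.IsDef && Use.IsUndef && Use.PhysReg == Def.PhysReg)
        return true;
    }
  }
  return false;
}
```

An instruction whose early-clobber result and undef input both land in register 9 is flagged, while the same instruction with the input in register 8 is not.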
Multiple invocations of the pass could interfere with each other,
preventing some undefs from being initialised.
I found it very difficult to create a unit test for this due to it being
dependent on particular allocations of a previous function. However, the
bug can be observed at https://godbolt.org/z/7xnMo41Gv, where the
illegal instruction `vnsrl.wi v9, v8, 0` is created.
InitUndef currently always computes DeadLaneDetector, but only actually
uses it if subreg liveness is enabled for the target. Make the
calculation optional to avoid an unnecessary compile-time impact for
targets that don't enable subreg liveness.
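
The idea of making the analysis optional can be sketched as follows. This is a minimal standalone model, not the actual change: `ToyLaneInfo`, `computeLaneInfo`, `maybeComputeLaneInfo` and the `NumLaneAnalyses` counter are hypothetical, and the counter exists only to make the saved work observable.

```cpp
#include <memory>

// Toy stand-in for an expensive analysis result; not LLVM's DeadLaneDetector.
struct ToyLaneInfo {
  bool Computed = true;
};

// Counts how often the (pretend) expensive analysis runs.
static int NumLaneAnalyses = 0;

std::unique_ptr<ToyLaneInfo> computeLaneInfo() {
  ++NumLaneAnalyses;
  return std::make_unique<ToyLaneInfo>();
}

// Only run the analysis when its result will be used; callers must handle
// a null result when subreg liveness is disabled.
std::unique_ptr<ToyLaneInfo> maybeComputeLaneInfo(bool SubRegLivenessEnabled) {
  if (!SubRegLivenessEnabled)
    return nullptr;
  return computeLaneInfo();
}
```

With the flag off, the counter stays at zero and the caller receives a null pointer; with it on, the analysis runs exactly once per call.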
The InitUndef pass currently uses the getLargestSuperClass() hook (which
is only used by that pass) to choose the register to initialize. This was done
to reduce the number of undef init pseudos needed, e.g. so that the vrnov0
regclass would use the same pseudo as v0. After #106744 we use a single
generic pseudo, so this is no longer necessary.
The InitUndef pass currently uses target-specific pseudo instructions,
with one pseudo per register class.
Instead, add a generic pseudo instruction, which can be used by all
targets and register classes.
These would implicitly cast the register to `unsigned`. Switch most of
them to use printReg, which gives more readable output. Change some
others to use Register::id() so we can eventually remove the implicit
cast to `unsigned`.
This removes the uses of target flags to disable subreg liveness,
relying on the `-enable-subreg-liveness` flag instead. The
`-enable-subreg-liveness` flag has been changed to take precedence over
the subtarget if set, and one use of `Subtarget->enableSubRegLiveness()`
has been changed to `MRI->subRegLivenessEnabled()` to make sure the
option properly applies.
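
The precedence rule can be illustrated with a small standalone function. This is a sketch under assumptions, not the real option handling: `ToyFlagState` and `resolveSubRegLiveness` are hypothetical names that model a command-line flag which may be unset, forced off, or forced on.

```cpp
// Toy model of a tri-state command-line flag; not LLVM's cl::opt machinery.
enum class ToyFlagState { Unset, ForceOff, ForceOn };

// The flag wins whenever it was given explicitly; otherwise the
// subtarget's default decides.
bool resolveSubRegLiveness(ToyFlagState Flag, bool SubtargetDefault) {
  switch (Flag) {
  case ToyFlagState::ForceOn:
    return true;
  case ToyFlagState::ForceOff:
    return false;
  case ToyFlagState::Unset:
    break;
  }
  return SubtargetDefault;
}
```

The point of routing every query through one resolved value is that a direct query of the subtarget would bypass the flag, which is why the remaining subtarget query was replaced.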
When using Greedy Register Allocation, there are times where
early-clobber values are ignored and assigned the same register. This
is illegal behaviour for these instructions. To work around this, Pseudo
instructions are used for early-clobber registers, which gives them a
definition and allows Greedy to assign them to a different register. The
result conforms to the ARM Architecture Reference Manual and matches the
defined behaviour.
This patch takes the existing RISC-V patch and makes it target
independent, then adds support for the ARM Architecture. Doing this will
ensure early-clobber constraints are followed when using the ARM
Architecture. Making the pass target independent will also open up the
possibility of adding support for other architectures in the future.