This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Restrict the 32-bit form of an instruction of integer as too many test cases
will be clobber as the register number updated.
From %reg = INSERT_SUBREG %reg, %subreg, subidx
To %reg:subidx = SUBREG_TO_REG 0, %subreg, subidx
Try to prefix the redundant mov instruction at D132325 as the SUBREG_TO_REG should not generate code.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D132939
The ToBeRemoved is used to remove any MachineInstructions that are no
longer needed, making sure we don't invalidate the iterator that is
currently in use by erasing the instruction straight away. This makes
issues for keeping the code in SSA from though, where subsequent
transforms that require SSA form may have been broken by previous
peepholes.
If, instead, we use make_early_inc_range the iteration issue shouldn't
be present, so long as we do not remove the subsequent instruction in
the peephole optimizations. That way the code between transforms is kept
in SSA form, meaning hopefully less things that can go wrong.
Differential Revision: https://reviews.llvm.org/D127296
isDef32 would attempt to make a guess at which SelectionDag nodes were
32bit sources, and use the nature of 32bit AArch64 instructions
implicitly zeroing the upper register half to not emit zext that were
expected to already be zero. This was a bit fragile though, needing to
guess at the correct opcodes that do not become 32bit defs later in
ISel.
This patch removed isDef32, relying on the AArch64MIPeephole optimizer
to remove redundant SUBREG_TO_REG nodes. A part of
SelectArithExtendedRegister was left with the same logic as a heuristic
to prevent some regressions from it picking less optimal sequences.
The AArch64MIPeepholeOpt pass also needs to be taught that a COPY from a
FPR will become a FMOVSWr, which it lowers immediately to make sure that
remains true through register allocation.
Fixes#55833
Differential Revision: https://reviews.llvm.org/D127154
Implements ADDS/SUBS 24-bit immediate optimization using the
MIPeepholeOpt pass. This follows the pattern:
Optimize ([adds|subs] r, imm) -> ([ADDS|SUBS] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
Optimize ([adds|subs] r, imm) -> ([SUBS|ADDS] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
The SplitAndOpcFunc type had to change the return type to an Opcode pair so that
the first add/sub is the regular instruction and the second is the flag setting
instruction. This required updating the code in the AND case.
Testing:
I ran a two stage bootstrap with this code.
Using the second stage compiler, I verified that the negation of an ADDS to SUBS
or vice versa is a valid optimization. Example V == -0x111111.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D118663
This reverts commit af45d0fd94b21620b61c8c4900b81486fd85aeb7.
This causes assertions failures when compiling the Linux kernel. See
https://reviews.llvm.org/D118663 for a reduced reproducer.
Implements ADDS/SUBS 24-bit immediate optimization using the
MIPeepholeOpt pass. This follows the pattern:
Optimize ([adds|subs] r, imm) -> ([ADDS|SUBS] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
Optimize ([adds|subs] r, imm) -> ([SUBS|ADDS] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
The SplitAndOpcFunc type had to change the return type to an Opcode pair so that
the first add/sub is the regular instruction and the second is the flag setting
instruction. This required updating the code in the AND case.
Testing:
I ran a two stage bootstrap with this code.
Using the second stage compiler, I verified that the negation of an ADDS to SUBS
or vice versa is a valid optimization. Example V == -0x111111.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D118663
This fixes a bug where (SUBREG_TO_REG 0 (MOVi32imm <negative-number>) sub_32)
would generate invalid code since the top 32-bits were not zeroed when inspecting the
immediate value. A new test was added for this case.
Change to abstract shared behavior in MIPeepholeOpt. Both
visitAND and visitADDSUB attempt to split an RR instruction with an immediate
operand into two RI instructions with the immediate split.
The differing behavior lies in how the immediate is split into two pieces and
how the new instructions are built. The rest of the behavior (adding new VRegs,
checking for the MOVImm, constraining reg classes, removing old intructions)
are shared between the operations.
The new helper function splitTwoPartImm implements the shared behavior and
delegates differing behavior to two function objects passed by the caller.
One function object splits the immediate into two values and returns the
opcode to use if it is a valid split. The other function object builds
the new instructions.
I felt this abstraction would help since I believe it will help reduce the
code repetition when adding new instructions of the pattern, such as
SUBS for this conditional optimization.
Tested it locally by running check all with compiler-rt, mlir, clang-tools-extra,
flang, llvm, and clang enabled.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D118000
Fixes the build issue with D111034, whose goal was to optimize
add/sub with long immediates.
Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
The change which fixed the build issue in D111034 was the use of new virtual
registers so that SSA form is maintained until deleting MI.
Differential Revision: https://reviews.llvm.org/D117429
Fixes the build issue with D111034, whose goal was to optimize
add/sub with long immediates.
Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
The change which fixed the build issue in D111034 was the use of new virtual
registers so that SSA form is maintained until deleting MI.
Differential Revision: https://reviews.llvm.org/D117429
%3:gpr32 = ORRWrs $wzr, %2, 0
%4:gpr64 = SUBREG_TO_REG 0, %3, %subreg.sub_32
If AArch64's 32-bit form of instruction defines the source operand of ORRWrs,
we can remove the ORRWrs because the upper 32 bits of the source operand are
set to zero.
Differential Revision: https://reviews.llvm.org/D110841
Create new virtual register for the definition of new AND instruction and
replace old register by the new one to keep SSA form.
Differential Revision: https://reviews.llvm.org/D109963
This reverts the revert commit f85d8a5bed95cc17a452b6b63b9866fbf181d94d
with bug fixes.
Original message:
MOVi32imm + ANDWrr ==> ANDWri + ANDWri
MOVi64imm + ANDXrr ==> ANDXri + ANDXri
The mov pseudo instruction could be expanded to multiple mov instructions later.
In this case, try to split the constant operand of mov instruction into two
bitmask immediates. It makes only two AND instructions intead of multiple
mov + and instructions.
Added a peephole optimization pass on MIR level to implement it.
Differential Revision: https://reviews.llvm.org/D109963
MOVi32imm + ANDWrr ==> ANDWri + ANDWri
MOVi64imm + ANDXrr ==> ANDXri + ANDXri
The mov pseudo instruction could be expanded to multiple mov instructions later.
In this case, try to split the constant operand of mov instruction into two
bitmask immediates. It makes only two AND instructions intead of multiple
mov + and instructions.
Added a peephole optimization pass on MIR level to implement it.
Differential Revision: https://reviews.llvm.org/D109963