21 Commits

Author SHA1 Message Date
Micah Weston
c69af70f02 [AArch64] Adds SUBS and ADDS instructions to the MIPeepholeOpt.
Implements ADDS/SUBS 24-bit immediate optimization using the
MIPeepholeOpt pass. This follows the pattern:

Optimize ([adds|subs] r, imm) -> ([ADDS|SUBS] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Optimize ([adds|subs] r, imm) -> ([SUBS|ADDS] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

The SplitAndOpcFunc type had to change the return type to an Opcode pair so that
the first add/sub is the regular instruction and the second is the flag setting
instruction. This required updating the code in the AND case.

Testing:

I ran a two stage bootstrap with this code.
Using the second stage compiler, I verified that the negation of an ADDS to SUBS
or vice versa is a valid optimization. Example V == -0x111111.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D118663
2022-02-19 15:35:53 +00:00
Nathan Chancellor
22eb1dae3f
Revert "[AArch64] Adds SUBS and ADDS instructions to the MIPeepholeOpt."
This reverts commit af45d0fd94b21620b61c8c4900b81486fd85aeb7.

This causes assertions failures when compiling the Linux kernel. See
https://reviews.llvm.org/D118663 for a reduced reproducer.
2022-02-13 10:40:23 -07:00
Micah Weston
af45d0fd94 [AArch64] Adds SUBS and ADDS instructions to the MIPeepholeOpt.
Implements ADDS/SUBS 24-bit immediate optimization using the
MIPeepholeOpt pass. This follows the pattern:

Optimize ([adds|subs] r, imm) -> ([ADDS|SUBS] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Optimize ([adds|subs] r, imm) -> ([SUBS|ADDS] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

The SplitAndOpcFunc type had to change the return type to an Opcode pair so that
the first add/sub is the regular instruction and the second is the flag setting
instruction. This required updating the code in the AND case.

Testing:

I ran a two stage bootstrap with this code.
Using the second stage compiler, I verified that the negation of an ADDS to SUBS
or vice versa is a valid optimization. Example V == -0x111111.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D118663
2022-02-12 03:13:14 +00:00
Micah Weston
f65651cc8a [AArch64] Fixes ADD/SUB opt bug and abstracts shared behavior in MIPeepholeOpt for ADD, SUB, and AND.
This fixes a bug where (SUBREG_TO_REG 0 (MOVi32imm <negative-number>) sub_32)
would generate invalid code since the top 32-bits were not zeroed when inspecting the
immediate value. A new test was added for this case.

Change to abstract shared behavior in MIPeepholeOpt. Both
visitAND and visitADDSUB attempt to split an RR instruction with an immediate
operand into two RI instructions with the immediate split.

The differing behavior lies in how the immediate is split into two pieces and
how the new instructions are built. The rest of the behavior (adding new VRegs,
checking for the MOVImm, constraining reg classes, removing old intructions)
are shared between the operations.

The new helper function splitTwoPartImm implements the shared behavior and
delegates differing behavior to two function objects passed by the caller.
One function object splits the immediate into two values and returns the
opcode to use if it is a valid split. The other function object builds
the new instructions.

I felt this abstraction would help since I believe it will help reduce the
code repetition when adding new instructions of the pattern, such as
SUBS for this conditional optimization.

Tested it locally by running check all with compiler-rt, mlir, clang-tools-extra,
flang, llvm, and clang enabled.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D118000
2022-01-26 04:22:27 +00:00
Micah Weston
93deac2e2b [AArch64] Optimize add/sub with immediate through MIPeepholeOpt
Fixes the build issue with D111034, whose goal was to optimize
add/sub with long immediates.

Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

The change which fixed the build issue in D111034 was the use of new virtual
registers so that SSA form is maintained until deleting MI.

Differential Revision: https://reviews.llvm.org/D117429
2022-01-22 12:39:22 +00:00
Florian Hahn
62476c7c14
Revert "[AArch64] Revive optimize add/sub with immediate through MIPeepholeOpt"
This reverts commit e6698f09929a134bf0f46d9347142b86d8f636a2.

This commit appears to introduce new machine verifier failures when
building the llvm-test-suite with `-mllvm -verify-machineinstrs` enabled:

https://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O3/11061/

FAILED: MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o
/Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite-build/tools/timeit --summary MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.time /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin/clang -DNDEBUG  -B /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin    -Wno-unused-command-line-argument -mllvm -verify-machineinstrs -O3 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS13.5.sdk   -w -Werror=date-time -DTORONTO -MD -MT MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -MF MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.d -o MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o   -c /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c
*** Bad machine code: Illegal virtual register for instruction ***
- function:    alloc_tree
- basic block: %bb.1 if.else (0x7fc0db8f8bb0)
- instruction: %31:gpr64 = nsw MADDXrrr killed %39:gpr64sp, killed %25:gpr64, $xzr
- operand 1:   killed %39:gpr64sp
Expected a GPR64 register, but got a GPR64sp register
fatal error: error in backend: Found 1 machine code errors.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin/clang -DNDEBUG -B /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin -Wno-unused-command-line-argument -mllvm -verify-machineinstrs -O3 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS13.5.sdk -w -Werror=date-time -DTORONTO -MD -MT MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -MF MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.d -o MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -c /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '/Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c'.
4.	Running pass 'Verify generated machine code' on function '@alloc_tree'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  clang         0x000000011191896b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 43
1  clang         0x00000001119179b5 llvm::sys::RunSignalHandlers() + 85
2  clang         0x00000001119180e2 llvm::sys::CleanupOnSignal(unsigned long) + 210
3  clang         0x0000000111849f6a (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) + 106
4  clang         0x0000000111849ee8 llvm::CrashRecoveryContext::HandleExit(int) + 24
5  clang         0x0000000111914acc llvm::sys::Process::Exit(int, bool) + 44
6  clang         0x000000010f4e9be9 LLVMErrorHandler(void*, char const*, bool) + 89
7  clang         0x0000000114eba333 llvm::report_fatal_error(llvm::Twine const&, bool) + 323
8  clang         0x0000000110d8c620 (anonymous namespace)::MachineVerifier::BBInfo::~BBInfo() + 0
9  clang         0x0000000110cdddca llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 378
10 clang         0x00000001110b0154 llvm::FPPassManager::runOnFunction(llvm::Function&) + 1092
11 clang         0x00000001110b6268 llvm::FPPassManager::runOnModule(llvm::Module&) + 72
12 clang         0x00000001110b074a llvm::legacy::PassManagerImpl::run(llvm::Module&) + 986
13 clang         0x0000000111c20ad4 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) + 3764
14 clang         0x0000000111f6dd31 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) + 1905
15 clang         0x00000001131a28b3 clang::ParseAST(clang::Sema&, bool, bool) + 643
16 clang         0x00000001122b02a4 clang::FrontendAction::Execute() + 84
17 clang         0x000000011222d6a9 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 873
18 clang         0x000000011232faf5 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 661
19 clang         0x000000010f4e9860 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 2544
20 clang         0x000000010f4e7168 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) + 312
21 clang         0x00000001120ab187 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*, bool*) const::$_1>(long) + 23
22 clang         0x0000000111849eb4 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) + 228
23 clang         0x00000001120aac24 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*, bool*) const + 324
24 clang         0x000000011207b85d clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const + 221
25 clang         0x000000011207bdad clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*> >&) const + 125
26 clang         0x0000000112092f7c clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*> >&) + 204
27 clang         0x000000010f4e6977 main + 10375
28 libdyld.dylib 0x00007fff6be90cc9 start + 1
29 libdyld.dylib 0x0000000000000018 start + 18446603338705728336
clang-14: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 14.0.0 (https://github.com/llvm/llvm-project.git c90d136be4e055f1b409f38706d0fe3e2211af08)
Target: arm64-apple-darwin19.5.0
Thread model: posix
InstalledDir: /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin
clang-14: note: diagnostic msg:
********************
2022-01-18 13:17:02 +00:00
Micah Weston
e6698f0992 [AArch64] Revive optimize add/sub with immediate through MIPeepholeOpt
Fixes the build issue with D111034, whose goal was to optimize
add/sub with long immediates.

Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

The change which fixed the build issue in D111034 was the use of new virtual
registers so that SSA form is maintained until deleting MI.

Differential Revision: https://reviews.llvm.org/D117429
2022-01-17 17:17:15 +00:00
David Green
43e500d791 [AArch64] Minor AArch64MIPeepholeOpt cleanup. NFC
We should always be in SSA form when running the pass, so turn a check
into an assert.
2021-12-28 19:10:01 +00:00
Ben Shi
59c3b48d99 Revert "[AArch64] Optimize add/sub with immediate"
This reverts commit 3de3ca3137bec5115cd10c53f4059f9bf1054e96.
2021-11-03 14:15:21 +08:00
Ben Shi
3de3ca3137 [AArch64] Optimize add/sub with immediate
Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Reviewed By: jaykang10, dmgreen

Differential Revision: https://reviews.llvm.org/D111034
2021-11-03 03:06:43 +00:00
Jingu Kang
a502436259 [AArch64] Remove redundant ORRWrs which is generated by zero-extend
%3:gpr32 = ORRWrs $wzr, %2, 0
%4:gpr64 = SUBREG_TO_REG 0, %3, %subreg.sub_32

If AArch64's 32-bit form of instruction defines the source operand of ORRWrs,
we can remove the ORRWrs because the upper 32 bits of the source operand are
set to zero.

Differential Revision: https://reviews.llvm.org/D110841
2021-10-25 09:47:07 +01:00
Jingu Kang
3f0b178de2 [AArch64] Fixed a bug on AArch64MIPeepholeOpt
Create new virtual register for the definition of new AND instruction and
replace old register by the new one to keep SSA form.

Differential Revision: https://reviews.llvm.org/D109963
2021-10-18 08:55:42 +01:00
Ben Shi
d0dbc991c0 Revert "[AArch64] Optimize add/sub with immediate"
This reverts commit 9bf6bef9951a1c230796ccad2c5c0195ce4c4dff.
2021-10-16 22:17:18 +00:00
Ben Shi
9bf6bef995 [AArch64] Optimize add/sub with immediate
Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1. and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Reviewed By: jaykang10, dmgreen

Differential Revision: https://reviews.llvm.org/D111034
2021-10-16 08:50:39 +00:00
Jingu Kang
30caca39f4 Third Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"
This reverts the revert commit fc36fb4d23a5e419cf33002c87c0082f682cb77b with
bug fixes.

Differential Revision: https://reviews.llvm.org/D109963
2021-10-08 11:28:49 +01:00
David Spickett
fc36fb4d23 Revert "Second Recommit "[AArch64] Split bitmask immediate of bitwise AND operation""
This reverts commit 13f3c39f3658fa28cb008eb56a58d8e34697cd5d.

Due to test failures in stage 2 clang tests on AArch64 bots.
2021-10-06 08:39:48 +00:00
Jingu Kang
13f3c39f36 Second Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"
This reverts the revert commit c07f7099690e8607d119227db1f80ee21eff3a3b with
bug fixes.

Differential Revision: https://reviews.llvm.org/D109963
2021-09-30 09:27:08 +01:00
Sterling Augustine
c07f709969 Revert "Recommit "[AArch64] Split bitmask immediate of bitwise AND operation""
This reverts commit 73a196a11c0e6fe7bbf33055cc2c96ce3c61ff0d.

Causes crashes as reported in https://reviews.llvm.org/D109963
2021-09-28 18:02:06 -07:00
Jingu Kang
73a196a11c Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"
This reverts the revert commit f85d8a5bed95cc17a452b6b63b9866fbf181d94d
with bug fixes.

Original message:

    MOVi32imm + ANDWrr ==> ANDWri + ANDWri
    MOVi64imm + ANDXrr ==> ANDXri + ANDXri

    The mov pseudo instruction could be expanded to multiple mov instructions later.
    In this case, try to split the constant operand of mov instruction into two
    bitmask immediates. It makes only two AND instructions intead of multiple
    mov + and instructions.

    Added a peephole optimization pass on MIR level to implement it.

    Differential Revision: https://reviews.llvm.org/D109963
2021-09-28 15:26:29 +01:00
Jingu Kang
f85d8a5bed Revert "[AArch64] Split bitmask immediate of bitwise AND operation"
This reverts commit 864b206796ae8aa7f35f830655337751dbd9176c.

Reverting due to error on buildbots.
2021-09-28 13:28:09 +01:00
Jingu Kang
864b206796 [AArch64] Split bitmask immediate of bitwise AND operation
MOVi32imm + ANDWrr ==> ANDWri + ANDWri
MOVi64imm + ANDXrr ==> ANDXri + ANDXri

The mov pseudo instruction could be expanded to multiple mov instructions later.
In this case, try to split the constant operand of mov instruction into two
bitmask immediates. It makes only two AND instructions intead of multiple
mov + and instructions.

Added a peephole optimization pass on MIR level to implement it.

Differential Revision: https://reviews.llvm.org/D109963
2021-09-28 11:57:43 +01:00