llvm-project

Author	SHA1	Message	Date
Fangrui Song	4b1b9e22b3	Remove unused #include "llvm/ADT/Optional.h"	2022-12-05 04:21:08 +00:00
Fangrui Song	b0df70403d	[Target] llvm::Optional => std::optional The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 22:43:14 +00:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Kazu Hirata	20d764aff0	[llvm] Don't include SetVector.h (NFC) llvm/lib/ProfileData/RawMemProfReader.cpp uses SetVector without including SetVector.h, so this patch adds an appropriate #include there.	2022-09-17 12:36:43 -07:00
Fangrui Song	f9b5924975	[AArch64] Fix -Wunused-variable. NFC	2022-09-08 18:27:16 -07:00
zhongyunde	b6655333c2	[Peephole] rewrite INSERT_SUBREG to SUBREG_TO_REG if upper bits zero Restrict this to the 32-bit form of integer instructions, as too many test cases would be clobbered when the register numbers are updated. From %reg = INSERT_SUBREG %reg, %subreg, subidx To %reg:subidx = SUBREG_TO_REG 0, %subreg, subidx Try to prefix the redundant mov instruction at D132325 as the SUBREG_TO_REG should not generate code. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D132939	2022-09-09 09:00:54 +08:00
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
David Green	a1aef4f374	[AArch64] Remove ToBeRemoved from AArch64MIPeepholeOpt The ToBeRemoved is used to remove any MachineInstructions that are no longer needed, making sure we don't invalidate the iterator that is currently in use by erasing the instruction straight away. This causes issues for keeping the code in SSA form, though, where subsequent transforms that require SSA form may have been broken by previous peepholes. If, instead, we use make_early_inc_range, the iteration issue shouldn't be present, so long as we do not remove the subsequent instruction in the peephole optimizations. That way the code between transforms is kept in SSA form, meaning hopefully fewer things that can go wrong. Differential Revision: https://reviews.llvm.org/D127296	2022-06-08 17:26:07 +01:00
David Green	bccbf5276e	[AArch64] Remove isDef32 isDef32 would attempt to make a guess at which SelectionDag nodes were 32bit sources, and use the nature of 32bit AArch64 instructions implicitly zeroing the upper register half to not emit zext that were expected to already be zero. This was a bit fragile though, needing to guess at the correct opcodes that do not become 32bit defs later in ISel. This patch removed isDef32, relying on the AArch64MIPeephole optimizer to remove redundant SUBREG_TO_REG nodes. A part of SelectArithExtendedRegister was left with the same logic as a heuristic to prevent some regressions from it picking less optimal sequences. The AArch64MIPeepholeOpt pass also needs to be taught that a COPY from a FPR will become a FMOVSWr, which it lowers immediately to make sure that remains true through register allocation. Fixes #55833 Differential Revision: https://reviews.llvm.org/D127154	2022-06-07 18:57:59 +01:00
Micah Weston	c69af70f02	[AArch64] Adds SUBS and ADDS instructions to the MIPeepholeOpt. Implements ADDS/SUBS 24-bit immediate optimization using the MIPeepholeOpt pass. This follows the pattern: Optimize ([adds|subs] r, imm) -> ([ADDS|SUBS] ([ADD|SUB] r, #imm0, lsl #12), #imm1), if imm == (imm0<<12)+imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. Optimize ([adds|subs] r, imm) -> ([SUBS|ADDS] ([SUB|ADD] r, #imm0, lsl #12), #imm1), if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. The SplitAndOpcFunc type had to change the return type to an Opcode pair so that the first add/sub is the regular instruction and the second is the flag setting instruction. This required updating the code in the AND case. Testing: I ran a two stage bootstrap with this code. Using the second stage compiler, I verified that the negation of an ADDS to SUBS or vice versa is a valid optimization. Example V == -0x111111. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118663	2022-02-19 15:35:53 +00:00
Nathan Chancellor	22eb1dae3f	Revert "[AArch64] Adds SUBS and ADDS instructions to the MIPeepholeOpt." This reverts commit af45d0fd94b21620b61c8c4900b81486fd85aeb7. This causes assertion failures when compiling the Linux kernel. See https://reviews.llvm.org/D118663 for a reduced reproducer.	2022-02-13 10:40:23 -07:00
Micah Weston	af45d0fd94	[AArch64] Adds SUBS and ADDS instructions to the MIPeepholeOpt. Implements ADDS/SUBS 24-bit immediate optimization using the MIPeepholeOpt pass. This follows the pattern: Optimize ([adds|subs] r, imm) -> ([ADDS|SUBS] ([ADD|SUB] r, #imm0, lsl #12), #imm1), if imm == (imm0<<12)+imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. Optimize ([adds|subs] r, imm) -> ([SUBS|ADDS] ([SUB|ADD] r, #imm0, lsl #12), #imm1), if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. The SplitAndOpcFunc type had to change the return type to an Opcode pair so that the first add/sub is the regular instruction and the second is the flag setting instruction. This required updating the code in the AND case. Testing: I ran a two stage bootstrap with this code. Using the second stage compiler, I verified that the negation of an ADDS to SUBS or vice versa is a valid optimization. Example V == -0x111111. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118663	2022-02-12 03:13:14 +00:00
Micah Weston	f65651cc8a	[AArch64] Fixes ADD/SUB opt bug and abstracts shared behavior in MIPeepholeOpt for ADD, SUB, and AND. This fixes a bug where (SUBREG_TO_REG 0 (MOVi32imm <negative-number>) sub_32) would generate invalid code since the top 32 bits were not zeroed when inspecting the immediate value. A new test was added for this case. Change to abstract shared behavior in MIPeepholeOpt. Both visitAND and visitADDSUB attempt to split an RR instruction with an immediate operand into two RI instructions with the immediate split. The differing behavior lies in how the immediate is split into two pieces and how the new instructions are built. The rest of the behavior (adding new VRegs, checking for the MOVImm, constraining reg classes, removing old instructions) is shared between the operations. The new helper function splitTwoPartImm implements the shared behavior and delegates differing behavior to two function objects passed by the caller. One function object splits the immediate into two values and returns the opcode to use if it is a valid split. The other function object builds the new instructions. I felt this abstraction would help since I believe it will reduce the code repetition when adding new instructions of the pattern, such as SUBS for this conditional optimization. Tested it locally by running check all with compiler-rt, mlir, clang-tools-extra, flang, llvm, and clang enabled. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118000	2022-01-26 04:22:27 +00:00
Micah Weston	93deac2e2b	[AArch64] Optimize add/sub with immediate through MIPeepholeOpt Fixes the build issue with D111034, whose goal was to optimize add/sub with long immediates. Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1), if imm == (imm0<<12)+imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1), if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. The change which fixed the build issue in D111034 was the use of new virtual registers so that SSA form is maintained until deleting MI. Differential Revision: https://reviews.llvm.org/D117429	2022-01-22 12:39:22 +00:00
Florian Hahn	62476c7c14	Revert "[AArch64] Revive optimize add/sub with immediate through MIPeepholeOpt" This reverts commit e6698f09929a134bf0f46d9347142b86d8f636a2. This commit appears to introduce new machine verifier failures when building the llvm-test-suite with `-mllvm -verify-machineinstrs` enabled: https://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O3/11061/ FAILED: MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite-build/tools/timeit --summary MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.time /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin/clang -DNDEBUG -B /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin -Wno-unused-command-line-argument -mllvm -verify-machineinstrs -O3 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS13.5.sdk -w -Werror=date-time -DTORONTO -MD -MT MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -MF MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.d -o MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -c /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c * Bad machine code: Illegal virtual register for instruction * - function: alloc_tree - basic block: %bb.1 if.else (0x7fc0db8f8bb0) - instruction: %31:gpr64 = nsw MADDXrrr killed %39:gpr64sp, killed %25:gpr64, $xzr - operand 1: killed %39:gpr64sp Expected a GPR64 register, but got a GPR64sp register fatal error: error in backend: Found 1 machine code errors. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. 
Program arguments: /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin/clang -DNDEBUG -B /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin -Wno-unused-command-line-argument -mllvm -verify-machineinstrs -O3 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS13.5.sdk -w -Werror=date-time -DTORONTO -MD -MT MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -MF MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o.d -o MultiSource/Benchmarks/Olden/health/CMakeFiles/health.dir/health.c.o -c /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c 1. <eof> parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module '/Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/test-suite/MultiSource/Benchmarks/Olden/health/health.c'. 4. 
Running pass 'Verify generated machine code' on function '@alloc_tree' Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it): 0 clang 0x000000011191896b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 43 1 clang 0x00000001119179b5 llvm::sys::RunSignalHandlers() + 85 2 clang 0x00000001119180e2 llvm::sys::CleanupOnSignal(unsigned long) + 210 3 clang 0x0000000111849f6a (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) + 106 4 clang 0x0000000111849ee8 llvm::CrashRecoveryContext::HandleExit(int) + 24 5 clang 0x0000000111914acc llvm::sys::Process::Exit(int, bool) + 44 6 clang 0x000000010f4e9be9 LLVMErrorHandler(void, char const, bool) + 89 7 clang 0x0000000114eba333 llvm::report_fatal_error(llvm::Twine const&, bool) + 323 8 clang 0x0000000110d8c620 (anonymous namespace)::MachineVerifier::BBInfo::~BBInfo() + 0 9 clang 0x0000000110cdddca llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 378 10 clang 0x00000001110b0154 llvm::FPPassManager::runOnFunction(llvm::Function&) + 1092 11 clang 0x00000001110b6268 llvm::FPPassManager::runOnModule(llvm::Module&) + 72 12 clang 0x00000001110b074a llvm::legacy::PassManagerImpl::run(llvm::Module&) + 986 13 clang 0x0000000111c20ad4 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) + 3764 14 clang 0x0000000111f6dd31 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) + 1905 15 clang 0x00000001131a28b3 clang::ParseAST(clang::Sema&, bool, bool) + 643 16 clang 0x00000001122b02a4 clang::FrontendAction::Execute() + 84 17 clang 0x000000011222d6a9 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 873 18 clang 
0x000000011232faf5 clang::ExecuteCompilerInvocation(clang::CompilerInstance) + 661 19 clang 0x000000010f4e9860 cc1_main(llvm::ArrayRef<char const>, char const, void) + 2544 20 clang 0x000000010f4e7168 ExecuteCC1Tool(llvm::SmallVectorImpl<char const>&) + 312 21 clang 0x00000001120ab187 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool) const::$_1>(long) + 23 22 clang 0x0000000111849eb4 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) + 228 23 clang 0x00000001120aac24 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool) const + 324 24 clang 0x000000011207b85d clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const&) const + 221 25 clang 0x000000011207bdad clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const> >&) const + 125 26 clang 0x0000000112092f7c clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const> >&) + 204 27 clang 0x000000010f4e6977 main + 10375 28 libdyld.dylib 0x00007fff6be90cc9 start + 1 29 libdyld.dylib 0x0000000000000018 start + 18446603338705728336 clang-14: error: clang frontend command failed with exit code 70 (use -v to see invocation) clang version 14.0.0 (https://github.com/llvm/llvm-project.git c90d136be4e055f1b409f38706d0fe3e2211af08) Target: arm64-apple-darwin19.5.0 Thread model: posix InstalledDir: /Users/buildslave/jenkins/workspace/test-suite-verify-machineinstrs-aarch64-O3/compiler/bin clang-14: note: diagnostic msg: *******************	2022-01-18 13:17:02 +00:00
Micah Weston	e6698f0992	[AArch64] Revive optimize add/sub with immediate through MIPeepholeOpt Fixes the build issue with D111034, whose goal was to optimize add/sub with long immediates. Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1), if imm == (imm0<<12)+imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1), if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. The change which fixed the build issue in D111034 was the use of new virtual registers so that SSA form is maintained until deleting MI. Differential Revision: https://reviews.llvm.org/D117429	2022-01-17 17:17:15 +00:00
David Green	43e500d791	[AArch64] Minor AArch64MIPeepholeOpt cleanup. NFC We should always be in SSA form when running the pass, so turn a check into an assert.	2021-12-28 19:10:01 +00:00
Ben Shi	59c3b48d99	Revert "[AArch64] Optimize add/sub with immediate" This reverts commit 3de3ca3137bec5115cd10c53f4059f9bf1054e96.	2021-11-03 14:15:21 +08:00
Ben Shi	3de3ca3137	[AArch64] Optimize add/sub with immediate Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1), if imm == (imm0<<12)+imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1), if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. Reviewed By: jaykang10, dmgreen Differential Revision: https://reviews.llvm.org/D111034	2021-11-03 03:06:43 +00:00
Jingu Kang	a502436259	[AArch64] Remove redundant ORRWrs which is generated by zero-extend %3:gpr32 = ORRWrs $wzr, %2, 0 %4:gpr64 = SUBREG_TO_REG 0, %3, %subreg.sub_32 If AArch64's 32-bit form of instruction defines the source operand of ORRWrs, we can remove the ORRWrs because the upper 32 bits of the source operand are set to zero. Differential Revision: https://reviews.llvm.org/D110841	2021-10-25 09:47:07 +01:00
Jingu Kang	3f0b178de2	[AArch64] Fixed a bug on AArch64MIPeepholeOpt Create new virtual register for the definition of new AND instruction and replace old register by the new one to keep SSA form. Differential Revision: https://reviews.llvm.org/D109963	2021-10-18 08:55:42 +01:00
Ben Shi	d0dbc991c0	Revert "[AArch64] Optimize add/sub with immediate" This reverts commit 9bf6bef9951a1c230796ccad2c5c0195ce4c4dff.	2021-10-16 22:17:18 +00:00
Ben Shi	9bf6bef995	[AArch64] Optimize add/sub with immediate Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1), if imm == (imm0<<12)+imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1), if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned integers. Reviewed By: jaykang10, dmgreen Differential Revision: https://reviews.llvm.org/D111034	2021-10-16 08:50:39 +00:00
Jingu Kang	30caca39f4	Third Recommit "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts the revert commit fc36fb4d23a5e419cf33002c87c0082f682cb77b with bug fixes. Differential Revision: https://reviews.llvm.org/D109963	2021-10-08 11:28:49 +01:00
David Spickett	fc36fb4d23	Revert "Second Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"" This reverts commit 13f3c39f3658fa28cb008eb56a58d8e34697cd5d. Due to test failures in stage 2 clang tests on AArch64 bots.	2021-10-06 08:39:48 +00:00
Jingu Kang	13f3c39f36	Second Recommit "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts the revert commit c07f7099690e8607d119227db1f80ee21eff3a3b with bug fixes. Differential Revision: https://reviews.llvm.org/D109963	2021-09-30 09:27:08 +01:00
Sterling Augustine	c07f709969	Revert "Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"" This reverts commit 73a196a11c0e6fe7bbf33055cc2c96ce3c61ff0d. Causes crashes as reported in https://reviews.llvm.org/D109963	2021-09-28 18:02:06 -07:00
Jingu Kang	73a196a11c	Recommit "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts the revert commit f85d8a5bed95cc17a452b6b63b9866fbf181d94d with bug fixes. Original message: MOVi32imm + ANDWrr ==> ANDWri + ANDWri MOVi64imm + ANDXrr ==> ANDXri + ANDXri The mov pseudo instruction could be expanded to multiple mov instructions later. In this case, try to split the constant operand of mov instruction into two bitmask immediates. It makes only two AND instructions instead of multiple mov + and instructions. Added a peephole optimization pass on MIR level to implement it. Differential Revision: https://reviews.llvm.org/D109963	2021-09-28 15:26:29 +01:00
Jingu Kang	f85d8a5bed	Revert "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts commit 864b206796ae8aa7f35f830655337751dbd9176c. Reverting due to error on buildbots.	2021-09-28 13:28:09 +01:00
Jingu Kang	864b206796	[AArch64] Split bitmask immediate of bitwise AND operation MOVi32imm + ANDWrr ==> ANDWri + ANDWri MOVi64imm + ANDXrr ==> ANDXri + ANDXri The mov pseudo instruction could be expanded to multiple mov instructions later. In this case, try to split the constant operand of mov instruction into two bitmask immediates. It makes only two AND instructions instead of multiple mov + and instructions. Added a peephole optimization pass on MIR level to implement it. Differential Revision: https://reviews.llvm.org/D109963	2021-09-28 11:57:43 +01:00

30 Commits
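
Several of the add/sub commit messages in this history state the same split condition: imm == (imm0<<12)+imm1, or imm == -(imm0<<12)-imm1 for the negated form, with imm0 and imm1 both non-zero 12-bit unsigned integers. The following is a minimal standalone sketch of that arithmetic only. It is not the code from AArch64MIPeepholeOpt; the names SplitImm and splitAddSubImm are hypothetical and were chosen for illustration.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical result type (not an LLVM type): the two 12-bit pieces and
// whether the original immediate was negative, in which case the commit
// messages describe swapping ADD and SUB.
struct SplitImm {
  uint32_t Imm0;  // high piece, applied with "lsl #12"
  uint32_t Imm1;  // low piece, applied unshifted
  bool Negated;   // true when imm == -(Imm0 << 12) - Imm1
};

// Hypothetical helper: returns a split only when both pieces are non-zero
// 12-bit values, as the commit messages require.
std::optional<SplitImm> splitAddSubImm(int64_t Imm) {
  const bool Negated = Imm < 0;
  // Take the magnitude in unsigned arithmetic so INT64_MIN does not overflow.
  const uint64_t Mag = Negated ? 0 - static_cast<uint64_t>(Imm)
                               : static_cast<uint64_t>(Imm);
  // The magnitude must fit in 24 bits (two 12-bit pieces).
  if (Mag >> 24)
    return std::nullopt;
  const uint32_t Imm0 = static_cast<uint32_t>((Mag >> 12) & 0xFFF);
  const uint32_t Imm1 = static_cast<uint32_t>(Mag & 0xFFF);
  // If either piece is zero, a single ADD/SUB immediate already suffices,
  // so there is nothing to split.
  if (Imm0 == 0 || Imm1 == 0)
    return std::nullopt;
  return SplitImm{Imm0, Imm1, Negated};
}
```

With the example value from the ADDS/SUBS commit message, splitAddSubImm(-0x111111) yields Imm0 == 0x111, Imm1 == 0x111, and Negated == true, while values such as 0x1000 or 0xFFF yield no split because one piece would be zero.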