llvm-project

Author	SHA1	Message	Date
David Green	601e102bdb	[CodeGen] Use LocationSize for MMO getSize (#84751 ) This is part of #70452 that changes the type used for the external interface of MMO to LocationSize as opposed to uint64_t. This means the constructors take LocationSize, and convert ~UINT64_C(0) to LocationSize::beforeOrAfter(). The getSize methods return a LocationSize. This allows us to be more precise with unknown sizes, not accidentally treating them as unsigned values, and in the future should allow us to add proper scalable vector support but none of that is included in this patch. It should mostly be an NFC. Global ISel is still expected to use the underlying LLT as it needs, and are not expected to see unknown sizes for generic operations. Most of the changes are hopefully fairly mechanical, adding a lot of getValue() calls and protecting them with hasValue() where needed.	2024-03-17 18:15:56 +00:00
Yuta Mukai	ea23761429	[AArch64] Verify ldp/stp alignment stricter (#84124) When ldp-aligned-only/stp-aligned-only is specified, the ldp/stp transformation is now cancelled if the MachineMemOperand is not present or the access size is unknown. In the previous implementation, the test passed when there was no MachineMemOperand. Also, if the size was unknown, an incorrect value was used or an assertion failed. (In practice, an instruction with no MachineMemOperand is already excluded by isCandidateToMergeOrPair() before reaching this part.) A statistic NumFailedAlignmentCheck is added. NumPairCreated is modified so that it only counts pairs that are not cancelled.	2024-03-06 20:19:56 +09:00
Florian Mayer	6f11c95d06	Revert "[AArch64] Verify ldp/stp alignment stricter" (#84096 ) Reverts llvm/llvm-project#83948 This broke the ASan buildbot: https://lab.llvm.org/buildbot/#/builders/168/builds/19054/steps/10/logs/stdio	2024-03-05 15:52:09 -08:00
Yuta Mukai	6b5888c27f	[AArch64] Verify ldp/stp alignment stricter (#83948) When ldp-aligned-only/stp-aligned-only is specified, the ldp/stp transformation is now cancelled if the MachineMemOperand is not present or the access size is unknown. In the previous implementation, the test passed when there was no MachineMemOperand. Also, if the size was unknown, an incorrect value was used or an assertion failed. (In practice, an instruction with no MachineMemOperand is already excluded by isCandidateToMergeOrPair() before reaching this part.) A statistic NumFailedAlignmentCheck is added. NumPairCreated is modified so that it only counts pairs that are not cancelled.	2024-03-06 01:47:28 +09:00
David Green	915c3d9e5a	Revert "[AArch64] merge index address with large offset into base address" This reverts commit 32878c2065c8005b3ea30c79e16dfd7eed55d645 due to #79756 and #76202.	2024-01-28 17:01:21 +00:00
Sjoerd Meijer	e034f209f5	[AArch64LoadStoreOptimizer] Debug messages to track decision making. NFC (#77593) With these debug messages it's possible to see why some pairs get rejected for combining.	2024-01-11 09:26:48 +00:00
Vitaly Buka	0ccc1e7acd	Revert "[AArch64] Fold more load.x into load.i with large offset" Issue #76202 This reverts commit f5687636415969e6d945659a0b78734abdfb0f06.	2023-12-21 21:12:40 -08:00
zhongyunde 00443407	f568763641	[AArch64] Fold more load.x into load.i with large offset The list of load.x refers to canFoldIntoAddrMode in D152828. Also support LDRSroX, which was missed in canFoldIntoAddrMode.	2023-12-21 18:54:15 +08:00
zhongyunde 00443407	32878c2065	[AArch64] merge index address with large offset into base address A case for this transformation: https://gcc.godbolt.org/z/nhYcWq1WE Fold `mov w8, #56952; movk w8, #15, lsl #16; ldrb w0, [x0, x8]` into `add x0, x0, 1036288; ldrb w0, [x0, 3704]`. Only LDRBBroX is supported for the first time. Fix https://github.com/llvm/llvm-project/issues/71917	2023-12-21 18:54:14 +08:00
David Green	b6ee831b59	[AArch64] Load/store optimizer fixes and cleanup. This includes a couple of fixes after #71908 for bundles and some cleanup for the debug output. One was an iterator type that asserted on bundles, the second a rather subtle issue where forAllMIsUntilDef would hit the LdStLimit when renaming registers, meaning the last instruction was not updated leaving an invalid `ldp x6, x6` instruction.	2023-11-29 07:41:15 +00:00
Zhaoxuan Jiang	147c5d6686	[AArch64] Allow LDR merge with same destination register by renaming (#71908) The patch is based on a reverted patch: https://reviews.llvm.org/D103597. It was trying to rename registers before the alias check, which is not safe and causes miscompiles. This patch does 2 things: 1. Do the renaming only once the necessary checks, including the alias check, have passed. 2. Rename the register for the instructions between the pairs and combine the second load into the first. By doing so we can just check the renamability between the pairs and avoid scanning an unknown number of instructions before/after the pairs. Necessary refactoring has been made in order to reuse as much code as possible with STR renaming.	2023-11-23 08:21:27 +00:00
Kazu Hirata	8842d59c9f	[llvm] Stop including llvm/ADT/BitVector.h (NFC) Identified with clangd.	2023-11-11 13:24:01 -08:00
Zhaoxuan Jiang	1f54ef78d5	[AArch64] Only clear kill flags if necessary when merging str (#69680 ) Previously the kill flags of the source register were unconditionally cleared when a `str` pair was merged, which results in suboptimal register allocation and inhibits some renaming opportunities which may allow further merging `str`.	2023-11-02 17:03:21 -07:00
Cullen Rhodes	54732a3e0b	[AArch64] Use TargetRegisterClass::hasSubClassEq in tryToFindRegisterToRename When renaming store operands for pairing in the load/store optimizer it tries to find an available register from the minimal physical register class of the original register. For each register it compares the equality of minimal physical register class of all sub/super registers with the minimal physical register class of the original register. Simply checking for register class equality can break once additional register classes are added, as was the case when adding: def foo : RegisterClass<"AArch64", [i32], 32, (sequence "W%u", 12, 15)> which broke: CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir CodeGen/AArch64/stp-opt-with-renaming.mir Since the introduction of the register class above, the rename register in test1 of the reserved regs test changed from x12 to x18. The reason for this is the minimal physical register class of x12 (as well as x13-x15) and its sub/super registers no longer matches that of x9 (GPR64noip_and_tcGPR64). Rather than selecting a matching register based on a comparison of the minimal physical register classes of the original and rename registers, this patch selects based on `MachineInstr::getRegClassConstraint` for the original register. It's worth mentioning the parameter passing registers (r0-r7) could now be used as rename registers since the GPR32arg and GPR64arg register classes are subclasses of the minimal physical register class for x8 for example. I'm not entirely sure if we want to exclude those registers, if so maybe we could explicitly exclude those register classes. Reviewed By: efriedma, paulwalker-arm Differential Revision: https://reviews.llvm.org/D88663	2023-10-30 08:47:39 +00:00
Zhuojia Shen	bcc5b48b0f	Reapply "[AArch64] Merge LDRSWpre-LD[U]RSW pair into LDPSWpre" This reverts commit 0def4e6b0f638b97a73bd4674365961d8fabda28, applies a quick fix that disallows merging two pre-indexed loads, and adds MIR regression tests. Differential Revision: https://reviews.llvm.org/D152407	2023-09-22 21:08:07 -07:00
Manos Anagnostakis	008f26b12e	[AArch64] New subtarget features to control ldp and stp formation (#66098 ) On some AArch64 cores, including Ampere's ampere1 and ampere1a architectures, load and store pair instructions are faster compared to simple loads/stores only when the alignment of the pair is at least twice that of the individual element being loaded. Based on that, this patch introduces four new subtarget features, two for controlling ldp and two for controlling stp, to cover the ampere1 and ampere1a alignment needs and to enable optional fine-grained control over ldp and stp generation in general. The latter can be utilized by another cpu, if there are possible benefits with a different policy than the default provided by the compiler. More specifically, for each of the ldp and stp respectively we have: - disable-ldp/disable-stp: Do not emit ldp/stp. - ldp-aligned-only/stp-aligned-only: Emit ldp/stp only if the source pointer is aligned to at least double the alignment of the type. Therefore, for -mcpu=ampere1 and -mcpu=ampere1a ldp-aligned-only/stp-aligned-only become the defaults, because of the benefit from the alignment, whereas for the rest of the cpus the default behaviour of the compiler is maintained.	2023-09-14 16:58:39 +02:00
Alexander Kornienko	0def4e6b0f	Revert "[AArch64] Merge LDRSWpre-LD[U]RSW pair into LDPSWpre" This reverts commit b0093e13fcfdd4eea5bbd7ae57d3d1b82f4135c3 due to a miscompile under MSan. See https://reviews.llvm.org/D152407#4533478 for more details. Reviewed By: asmok-g Differential Revision: https://reviews.llvm.org/D156328	2023-07-26 16:22:24 +02:00
Zhuojia Shen	b0093e13fc	[AArch64] Merge LDRSWpre-LD[U]RSW pair into LDPSWpre This patch optimizes a pair of LDRSWpre and LDRSWui (or LDURSWi) instructions into a single LDPSWpre instruction. This is a missing case in D99272. MIR test cases in D152564 are updated to verify the optimization. Differential Revision: https://reviews.llvm.org/D152407	2023-07-18 09:46:47 -07:00
Zain Jaffal	0c93879d96	[AArch64] merge scaled and unscaled zero narrow stores. This patch fixes a crash when scaled and unscaled zero stores are merged. Differential Revision: https://reviews.llvm.org/D150963	2023-05-26 15:07:24 +01:00
Hsiangkai Wang	0847cc06a6	[NFC][AArch64] Use 'i' to encode the offset form of load/store. STG, STZG, ST2G, STZ2G are the exceptions to append 'Offset' to name the offset format of load/store instructions. All other load/store instructions use 'i' as the appendix. If there is no special reason to do so, we should make the naming consistent. Differential Revision: https://reviews.llvm.org/D141819	2023-03-06 12:34:19 +00:00
Kazu Hirata	c08fad8193	[llvm] Remove redundant initialization of std::optional (NFC)	2022-12-20 15:53:38 -08:00
Fangrui Song	b0df70403d	[Target] llvm::Optional => std::optional The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 22:43:14 +00:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Kazu Hirata	298cb551fb	[AArch64] Use std::optional in AArch64LoadStoreOptimizer.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-25 22:08:30 -08:00
chenglin.bi	ec4db1d0dc	[AAArch64][Windows] Fix the crash when running ninja check-asan The crash comes from a mismatch between the load count in the epilogue and the SEH instruction count, again caused by the AArch64LoadStoreOpt pass: it removes some loads in the epilogue but does not remove the corresponding SEH instructions. This patch fixes the issue by not optimizing loads in the epilogue. Fix: #58516 Reviewed By: mstorsjo Differential Revision: https://reviews.llvm.org/D136430	2022-10-21 22:11:54 +08:00
Eli Friedman	76ccd1db73	[AArch64] Don't form paired loads from epilogue operations on Windows AArch64LoadStoreOptimizer has a bunch of different guards to avoid corrupting Windows SEH prologues/epilogues, but apparently we missed the case of merging two instructions where the first instruction isn't part of the epilogue, but the second instruction is. Fixes issue discovered at https://reviews.llvm.org/D130049#3704064 Differential Revision: https://reviews.llvm.org/D134992	2022-10-04 11:41:59 -07:00
Kazu Hirata	258531b7ac	Remove redundant initialization of Optional (NFC)	2022-08-20 21:18:28 -07:00
zhongyunde	c42a225545	[MachineScheduler] Order more stores by ascending address According to D125377, we order STP Q's by ascending address. However, on some targets paired 128-bit loads and stores are slow, so the STP is split into STRQ and STUR; these stores should also be ordered. Also add the subtarget feature ascend-store-address to control the aggressive ordering. Reviewed By: dmgreen, fhahn Differential Revision: https://reviews.llvm.org/D126700	2022-06-13 17:33:50 +08:00
Zongwei Lan	ad73ce318e	[Target] use getSubtarget<> instead of static_cast<>(getSubtarget()) Differential Revision: https://reviews.llvm.org/D125391	2022-05-26 11:22:41 -07:00
Momchil Velikov	e0ff354b83	[AArch64] Async unwind - Adjust unwind info in AArch64LoadStoreOptimizer [Re-commit after fixing a dereference of "end" iterator] The AArch64LoadStoreOptimizer pass may merge a register increment/decrement with a following memory operation. In doing so, it may break CFI by moving a stack pointer adjustment past the CFI instruction that described that adjustment. This patch fixes this issue by moving said CFI instruction after the merged instruction, where the SP increment/decrement actually takes place. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D114547	2022-04-18 12:09:44 +01:00
Momchil Velikov	62d4686be3	Revert "[AArch64] Async unwind - Adjust unwind info in AArch64LoadStoreOptimizer" This reverts commit ecbf32dd88fc91b4fe709dc14bb3493dda6e8854. It's possible this patch is the reason for an assertion failure `!NodePtr->isKnownSentinel()` in `AArch64LoadStoreOpt::mergeUpdateInsn` (https://lab.llvm.org/buildbot/#/builders/185/builds/1555); reverting while I investigate.	2022-04-14 09:33:40 +01:00
Momchil Velikov	ecbf32dd88	[AArch64] Async unwind - Adjust unwind info in AArch64LoadStoreOptimizer The AArch64LoadStoreOptimizer pass may merge a register increment/decrement with a following memory operation. In doing so, it may break CFI by moving a stack pointer adjustment past the CFI instruction that described that adjustment. This patch fixes this issue by moving said CFI instruction after the merged instruction, where the SP increment/decrement actually takes place. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D114547	2022-04-13 17:04:53 +01:00
Florian Hahn	d2c8aa0bf4	[AArch64] Pass Reg instead of MI to tryToFindRenameRegister (NFC). FirstMI is only used to get the load/store operand and the machine function. Pass the MF and register explicitly, so the helper can be used to find rename registers for other instructions in the future.	2022-03-01 14:02:02 +00:00
Florian Hahn	45c969defa	[AArch64] Remove unused argument from tryToFindRegisterToRename (NFC). The MI argument is not used by the function. Remove it.	2022-03-01 12:47:37 +00:00
Huihui Zhang	1d74b53172	[AArch64][LoadStoreOptimizer] Ignore undef registers when checking rename register used between paired instructions. The contents of undef registers are not used in meaningful ways, so when checking whether a rename register is used between paired instructions we should ignore undef registers. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D119305	2022-02-10 10:21:37 -08:00
Jim Lin	d6b0734837	[NFC] Use Register instead of unsigned	2022-01-19 20:17:04 +08:00
Tim Northover	3d41ef68e7	AArch64: don't form indexed paired ops if base reg overlaps operands. The registers involved might not be identical, but can still overlap (e.g. "str w0, [x0, #4]!").	2021-08-20 11:39:38 +01:00
Martin Storsjö	1cb7849a55	Revert "[AArch64LoadStoreOptimizer] Recommit: Generate more STPs by renaming registers earlier" This reverts commit ea011ec5ed53599305de62ca5fcfd31f4b3448c3. This still causes some miscompiles, I'll follow up in the phabricator review with a sample of that issue (which is part of the sample of the previous issue).	2021-06-23 09:54:16 +03:00
Meera Nakrani	ea011ec5ed	[AArch64LoadStoreOptimizer] Recommit: Generate more STPs by renaming registers earlier This is a recommit that fixes unwanted STP generation by checking that the base register has not been modified or used elsewhere. Our initial motivating case was memcpy's with alignments > 16. The loads/stores, to which small memcpy's expand, are kept together in several places so that we get a sequence like this for a 64 bit copy: LD w0 LD w1 ST w0 ST w1 The load/store optimiser can generate a LDP/STP w0, w1 from this because the registers read/written are consecutive. In our case however, the sequence is optimised during ISel, resulting in: LD w0 ST w0 LD w0 ST w0 This instruction reordering allows reuse of registers. Since the registers are no longer consecutive (i.e. they are the same), it inhibits LDP/STP creation. The approach here is to perform renaming: LD w0 ST w0 LD w1 ST w1 to enable the folding of the stores into a STP. We do not yet generate the LDP due to a limitation in the renaming implementation, but plan to look at that in a follow-up so that we fully support this case. While this was initially motivated by certain memcpy's, this is a general approach and thus is beneficial for other cases too, as can be seen in some test changes. Differential Revision: https://reviews.llvm.org/D103597	2021-06-22 15:29:13 +00:00
Martin Storsjö	99653702fd	Revert "[AArch64LoadStoreOptimizer] Generate more STPs by renaming registers earlier" This reverts commit d96ea46629803641038ebe46d8cd512f8cf7e20f, as it caused various misoptimizations, see https://reviews.llvm.org/D103597 for discussion on the issues.	2021-06-10 10:30:13 +03:00
Meera Nakrani	d96ea46629	[AArch64LoadStoreOptimizer] Generate more STPs by renaming registers earlier Our initial motivating case was memcpy's with alignments > 16. The loads/stores, to which small memcpy's expand, are kept together in several places so that we get a sequence like this for a 64 bit copy: LD w0 LD w1 ST w0 ST w1 The load/store optimiser can generate a LDP/STP w0, w1 from this because the registers read/written are consecutive. In our case however, the sequence is optimised during ISel, resulting in: LD w0 ST w0 LD w0 ST w0 This instruction reordering allows reuse of registers. Since the registers are no longer consecutive (i.e. they are the same), it inhibits LDP/STP creation. The approach here is to perform renaming: LD w0 ST w0 LD w1 ST w1 to enable the folding of the stores into a STP. We do not yet generate the LDP due to a limitation in the renaming implementation, but plan to look at that in a follow-up so that we fully support this case. While this was initially motivated by certain memcpy's, this is a general approach and thus is beneficial for other cases too, as can be seen in some test changes. Differential Revision: https://reviews.llvm.org/D103597	2021-06-09 11:25:26 +00:00
Stelios Ioannou	3f4bad5ead	[AArch64] Fix for the pre-indexed paired load/store optimization. This patch fixes an issue where a pre-indexed store e.g., STR x1, [x0, #24]! with a store like STR x0, [x0, #8] are merged into a single store: STP x1, x0, [x0, #24]! . They shouldn’t be merged because the second store uses x0 as both the stored value and the address and so it needs to be using the updated x0. Therefore, it should not be folded into a STP <>pre. Additionally a new test case is added to verify this fix. Differential Revision: https://reviews.llvm.org/D101888 Change-Id: I26f1985ac84e970961e2cdca23c590fa6773851a	2021-05-05 15:15:07 +01:00
Stelios Ioannou	936c777e2b	[AArch64] Adds a pre-indexed paired Load/Store optimization for LDR-STR. This patch merges STR<S,D,Q,W,X>pre-STR<S,D,Q,W,X>ui and LDR<S,D,Q,W,X>pre-LDR<S,D,Q,W,X>ui instruction pairs into a single STP<S,D,Q,W,X>pre and LDP<S,D,Q,W,X>pre instruction, respectively. For each pair, there is a MIR test that verifies this optimization. Differential Revision: https://reviews.llvm.org/D99272 Change-Id: Ie97a20c8c716c08492fe229c22e14e3c98ef08b7	2021-04-30 17:29:58 +01:00
Amara Emerson	0146d20631	[AArch64] Do not fold SP adjustments into pre-increment addr modes if it overflows the redzone. Instead of outright disabling this completely with the noredzone attribute, we only avoid doing the optimization if there are memory operations between the adjustment and the load/store that the adjustment would be folded into. This avoids the case of something like a stack cookie being corrupted if an exception happens before the pre-increment to the SP occurs. This also prevents the folding happening if we have a redzone, but the offset being folded is above the redzone amount (128 bytes in this case). rdar://73269336 Differential Revision: https://reviews.llvm.org/D95179	2021-02-24 09:55:48 -08:00
Martin Storsjö	f4b9dfd9bc	[AArch64] Don't merge sp decrement into later stores when using WinCFI This matches the corresponding existing case in AArch64LoadStoreOpt::findMatchingUpdateInsnForward. Both cases could also be modified to check MBBI->getFlag(FrameSetup/FrameDestroy) instead of forbidding any optimization involving SP, but the effect is probably pretty much the same. Differential Revision: https://reviews.llvm.org/D88541	2020-10-01 19:03:27 +03:00
Congzhe Cao	8d8cb1ad80	[AArch64] Avoid pairing loads when the base reg is modified When pairing loads, we should check if in between the two loads the base register has been modified. If that is the case then avoid pairing them because the second load actually loads from a different address. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D86956	2020-09-30 13:06:51 -04:00
Andrew Wei	c2deacd929	[AArch64] Fix ldst optimization of non-immediate store offset When matching a store instruction for ldst opt, we should make sure the store is in 'reg+imm' form like the load; otherwise an assertion fires in isLdOffsetInRangeOfSt, since it calls getImm() directly. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87905	2020-09-23 23:00:13 +08:00
Congzhe Cao	4edb3d3646	[AArch64] Avoid pairing loads with same result reg When pairing ldr instructions to an ldp instruction, we cannot pair two ldr destination registers where one is a sub or super register of the other. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D86906	2020-09-22 16:25:08 -04:00
Florian Hahn	1975ff9a0a	[AArch64] Fix ldst-opt of multiple disjunct subregs. Currently aarch64-ldst-opt will incorrectly rename registers with multiple disjunct subregisters (e.g. result of LD3). This patch updates the canRenameUpToDef to bail out if it encounters such a register class that contains the register to rename. Fixes PR46105. Reviewers: efriedma, dmgreen, paquette, t.p.northover Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81108	2020-06-08 20:18:24 +01:00
Jean-Michel Gorius	505685a67a	[llvm][CodeGen] Check for memory instructions when querying for alias status Summary: Add a check to make sure that MachineInstr::mayAlias returns prematurely if at least one of its instruction parameters does not access memory. This prevents calls to TargetInstrInfo::areMemAccessesTriviallyDisjoint with incompatible instructions. A side effect of this change is to render the mayAlias helper in the AArch64 load/store optimizer obsolete. We can now directly call the MachineInstr::mayAlias member function. Reviewers: hfinkel, t.p.northover, mcrosier, eli.friedman, efriedma Reviewed By: efriedma Subscribers: efriedma, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78823	2020-04-24 22:54:46 +02:00
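
The arithmetic behind the large-offset fold described in commit 32878c2065 can be checked with a small sketch. This is only an illustration of the numbers quoted in that commit message, not the code from the patch: the helper names `movMovkValue`, `splitByteOffset`, and `fitsShiftedAddImm` are hypothetical, and the 4096-byte split assumes a byte-sized access (as with LDRBBroX) whose immediate form takes an unsigned 12-bit offset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this sketch; not an LLVM API.
struct SplitOffset {
  uint64_t High; // multiple of 4096, folded into the base with an add
  uint64_t Low;  // remainder below 4096, used as the load's immediate
};

// Value built by `mov wN, #Lo16` followed by `movk wN, #Hi16, lsl #16`.
inline uint64_t movMovkValue(uint64_t Lo16, uint64_t Hi16) {
  assert(Lo16 < (uint64_t{1} << 16) && Hi16 < (uint64_t{1} << 16));
  return (Hi16 << 16) | Lo16;
}

// Split a byte offset into a 4096-aligned part and a 12-bit remainder.
// Assumption: the access is one byte wide, so the immediate is unscaled.
inline SplitOffset splitByteOffset(uint64_t Offset) {
  constexpr uint64_t LowMask = (uint64_t{1} << 12) - 1;
  return {Offset & ~LowMask, Offset & LowMask};
}

// True if High can be encoded as a 12-bit add immediate shifted left by 12.
inline bool fitsShiftedAddImm(uint64_t High) {
  return (High & 0xFFF) == 0 && (High >> 12) < (uint64_t{1} << 12);
}
```

With the values from the commit message, `movMovkValue(56952, 15)` is 1039992, which `splitByteOffset` divides into 1036288 (the `add` immediate) and 3704 (the `ldrb` immediate).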


206 Commits