llvm-project

Author	SHA1	Message	Date
Craig Topper	23d45e55ed	[MCP] Remove dead copies from basic blocks with successors. (#86973 ) Previously we wouldn't remove dead copies from basic blocks with successors. The comment said we didn't want to trust the live-in lists. The comment is very old so I'm not sure if that's still a concern today. This patch checks the live-in lists and removes copies from MaybeDeadCopies if they are referenced by any live-ins in any successors. We only do this if the tracksLiveness property is set. If that property is not set, we retain the old behavior.	2024-03-28 14:43:49 -07:00
David Green	313bf28f98	[ARM][MVE] Remove kill flags when reusing VPR register. (#86300 ) The vpr register may no longer be killed where it was, so we should be removing the kill flags.	2024-03-27 16:04:48 +00:00
James Westwood	b2c16e7ff4	Revert "[ARM] R11 not pushed adjacent to link register with PAC-M and… (#84019 ) … AAPCS frame chain fix (#82801)" This reverts commit 00e4a4197137410129d4725ffb82bae9ce44bdde. This patch was found to cause miscompilations and compilation failures.	2024-03-05 14:34:43 +00:00
James Westwood	00e4a41971	[ARM] R11 not pushed adjacent to link register with PAC-M and AAPCS frame chain fix (#82801 ) When code for M class architecture was compiled with AAPCS and PAC enabled, the frame pointer, r11, was not pushed to the stack adjacent to the link register. Due to PAC being enabled, r12 was placed between r11 and lr. This patch fixes this by adding an extra case to the already existing code that splits the GPR push in two when R11 is the frame pointer and certain parameters are met. The differential revision for this previous change can be found here: https://reviews.llvm.org/D125649. This now ensures that r11 and lr are pushed in a separate push instruction to the other GPRs when PAC and AAPCS are enabled, meaning the frame pointer and link register are now pushed onto the stack adjacent to each other.	2024-03-04 12:00:36 +00:00
Jack Styles	28233408a2	[CodeGen] [ARM] Make RISC-V Init Undef Pass Target Independent and add support for the ARM Architecture. (#77770 ) When using Greedy Register Allocation, there are times where early-clobber values are ignored, and assigned the same register. This is illegal behaviour for these instructions. To get around this, using Pseudo instructions for early-clobber registers gives them a definition and allows Greedy to assign them to a different register. This then meets the ARM Architecture Reference Manual and matches the defined behaviour. This patch takes the existing RISC-V patch and makes it target independent, then adds support for the ARM Architecture. Doing this will ensure early-clobber constraints are followed when using the ARM Architecture. Making the pass target independent will also open up the possibility that support for other architectures can be added in the future.	2024-02-26 12:12:31 +00:00
Simon Pilgrim	b45de48be2	[MVE] Expand64BitShift - handle all constant shift amounts less than 32 (#81261 ) Expand64BitShift was always dropping to generic shift legalization if the shift amount type was larger than i64, even if the constant shift amount was actually very small. I've adjusted the constant bounds checks to work with APInt types so we can always perform the comparison. This results in the MVE long shift instructions being used more often, and it looks like this is preventing some additional combines from happening. This could be addressed in the future. This came about while I was trying to extend the DAGTypeLegalizer::ExpandShift* helpers and need to move to consistently using the legal shift amount types instead of reusing the shift amount type from the original wider shift.	2024-02-11 15:02:27 +00:00
Nikita Popov	b31fffbc7f	[ARM] Convert tests to opaque pointers (NFC)	2024-02-05 13:56:59 +01:00
Harald van Dijk	52864d9c7b	[ARM] Switch to soft promoting half types. (#80440 ) The traditional promotion is known to generate wrong code. Fixes #73805.	2024-02-02 21:40:40 +00:00
Quentin Dian	112fba974c	[MIRPrinter] Don't print line break when there is no instructions (NFC) (#80147 ) Per #80143, we can remove the extra line break when there is no instruction.	2024-02-01 22:10:52 +08:00
Yingwei Zheng	50e80e06d1	[ValueTracking] Merge `cannotBeOrderedLessThanZeroImpl` into `computeKnownFPClass` (#76360 ) This patch merges the logic of `cannotBeOrderedLessThanZeroImpl` into `computeKnownFPClass` to improve the signbit inference. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2024-01-31 18:26:50 +08:00
Oskar Wirga	ff4636a4ab	Refactor recomputeLiveIns to converge on added MachineBasicBlocks (#79940 ) This is a fix for the regression seen in https://github.com/llvm/llvm-project/pull/79498 > Currently, the way that recomputeLiveIns works is that it will recompute the livein registers for that MachineBasicBlock but it matters what order you call recomputeLiveIn which can result in incorrect register allocations down the line. Now we do not recompute the entire CFG but we do ensure that the newly added MBB do reach convergence.	2024-01-30 19:33:04 -08:00
Nikita Popov	07a1925b8b	Revert "Refactor recomputeLiveIns to operate on whole CFG (#79498 )" This reverts commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b. Introduces a major compile-time regression.	2024-01-26 22:33:17 +01:00
Oskar Wirga	59bf60519f	Refactor recomputeLiveIns to operate on whole CFG (#79498 ) Currently, the way that recomputeLiveIns works is that it will recompute the livein registers for that MachineBasicBlock but it matters what order you call recomputeLiveIn which can result in incorrect register allocations down the line. This PR fixes that by simply recomputing the liveins for the entire CFG until convergence is achieved. This makes it harder to introduce subtle bugs which alter liveness.	2024-01-26 11:25:36 -08:00
David Green	2c49586e1b	[ARM] Fix MVEFloatOps check on creating VCVTN (#79291 ) In the past PerformSplittingToNarrowingStores handled both int and float ops, but since the introduction of MVETRUNC now only operates on float operations, creating VCVTN nodes. It should be guarded by hasMVEFloatOps to prevent a failure to select.	2024-01-25 08:12:51 +00:00
David Green	1074b94f5d	[ARM] Fix phi operand order issue in MVEGatherScatterLowering (#78208 ) With commuted operands on the phi node, the two old incoming values could be removed in the wrong order, removing newly added operand instead of the old one.	2024-01-16 10:15:05 +00:00
David Green	6719a5a3f6	[ARM] Extra test for MVE gather optimization with commuted phi operands. NFC	2024-01-15 19:28:55 +00:00
Florian Hahn	b1a5ee1feb	[ARM] Check all terms in emitPopInst when clearing Restored for LR. (#75527 ) emitPopInst checks a single function exit MBB. If other paths also exit the function and any of their terminators uses LR implicitly, it is not safe to clear the Restored bit. Check all terminators for the function before clearing Restored. This fixes a mis-compile in outlined-fn-may-clobber-lr-in-caller.ll where the machine-outliner previously introduced BLs that clobbered LR which in turn is used by the tail call return. Alternative to #73553	2023-12-20 16:56:15 +01:00
Philip Reames	ffb2af3ed6	[SCEVExpander] Attempt to reinfer flags dropped due to CSE (#72431 ) LSR uses SCEVExpander to generate induction formulas. The expander internally tries to reuse existing IR expressions. To do that, it needs to strip any poison generating flags (nsw, nuw, exact, nneg, etc..) which may not be valid for the newly added users. This is conservatively correct, but has the effect that LSR will strip nneg flags on zext instructions involved in trip counts in loop preheaders. To avoid this, this patch adjusts the expander to reinfer the flags on the CSE candidate if legal for all possible users. This should fix the regression reported in https://github.com/llvm/llvm-project/issues/71200. This should arguably be done inside canReuseInstruction instead, but doing it outside is more conservative compile time wise. Both canReuseInstruction and isGuaranteedNotToBePoison walk operand lists, so right now we are performing work which is roughly O(N^2) in the size of the operand graph. We should fix that before making the per operand step more expensive. My tentative plan is to land this, and then rework the code to sink the logic into more core interfaces.	2023-12-07 13:20:36 -08:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
Florian Hahn	20f634f275	[Thumb] Add test case where the machine-outliner clobbers LR. Add a test case where `bl OUTLINED_FUNCTION_0` clobbers LR, which in turn is used by the later call to memcpy to return to the caller.	2023-11-24 20:27:43 +00:00
Simon Pilgrim	de41396895	[DAG] foldABSToABD - add support for abs(sub(sign_extend_inreg(),sign_extend_inreg())) patterns Partial fix for ABDS regressions on D152928	2023-11-15 15:49:30 +00:00
Matthias Braun	e3cf80c5c1	BlockFrequencyInfoImpl: Avoid big numbers, increase precision for small spreads BlockFrequencyInfo calculates block frequencies as Scaled64 numbers but as a last step converts them to unsigned 64bit integers (`BlockFrequency`). This improves the factors picked for this conversion so that: * Avoid big numbers close to UINT64_MAX to avoid users overflowing/saturating when adding multiple frequencies together or when multiplying with integers. This leaves the topmost 10 bits unused to allow for some room. * Spread the difference between hottest/coldest block as much as possible to increase precision. * If the hot/cold spread cannot be represented, lose precision at the lower end, but keep the frequencies at the upper end for hot blocks differentiable.	2023-10-24 20:27:39 -07:00
David Green	8a701024f3	[ARM] Lower i1 concat via MVETRUNC The MVETRUNC operation can perform the same truncate of two vectors, without requiring lane inserts/extracts from every vector lane. This moves the concat i1 lowering to use it for v8i1 and v16i1 result types, trading a bit of extra stack space for less instructions.	2023-10-18 19:40:11 +01:00
David Green	c060757bcc	[ARM] Correct v2i1 concat extract types. For two v2i1 concat into a v4i1, we cannot extract each i64 element as an i32. This casts to a v4i32 instead and extracts the correct vector lanes.	2023-10-18 13:40:38 +01:00
Jay Foad	7b3bbd83c0	Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038 )" This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c. Reverted due to various buildbot failures.	2023-10-09 12:31:32 +01:00
Jay Foad	2501ae58e3	[CodeGen] Really renumber slot indexes before register allocation (#67038 ) PR #66334 tried to renumber slot indexes before register allocation, but the numbering was still affected by list entries for instructions which had been erased. Fix this to make the register allocator's live range length heuristics even less dependent on the history of how instructions have been added to and removed from SlotIndexes's maps.	2023-10-09 11:44:41 +01:00
JP Lehr	e816c89c84	Revert "InlineSpiller: Consider if all subranges are the same when avoiding redundant spills" This reverts commit d8127b2ba8a87a610851b9a462f2fc2526c36e37.	2023-10-02 06:26:33 -05:00
Matt Arsenault	d8127b2ba8	InlineSpiller: Consider if all subranges are the same when avoiding redundant spills This avoids some redundant spills of subranges, and avoids a compile failure. This greatly reduces the numbers of spills in a loop. The main range is not informative when multiple instructions are needed to fully define a register. A common scenario is a lowered reg_sequence where every subregister is sequentially defined, but each def changes the main range's value number. If we look at specific lanes at the use index, we can see the value is actually the same. In this testcase, there are a large number of materialized 64-bit constant defs which are hoisted outside of the loop by MachineLICM. These are feeding REG_SEQUENCES, which is not considered rematerializable inside the loop. After coalescing, the split constant defs produce main ranges with an apparent phi def. There's no phi def if you look at each individual subrange, and only half of the register is really redefined to a constant. Fixes: SWDEV-380865 https://reviews.llvm.org/D147079	2023-10-01 11:37:53 +03:00
Matt Arsenault	7252787dd9	RegAllocGreedy: Fix detection of lanes read by a bundle SplitKit creates questionably formed bundles of copies when it needs to copy a subset of live lanes and can't do it with a single subregister index. These are merely marked as part of a bundle, and don't start with a BUNDLE instruction. Queries for the slot index would give the first copy in the bundle, and we need to inspect the operands of all the other bundled copies. Also fix and simplify detection of read lane subsets. This causes some RISCV test regressions, but these look like accidentally beneficial splits. I don't see a subrange based reason to perform these splits. Avoids some really ugly regressions in a future patch. https://reviews.llvm.org/D146859	2023-10-01 11:37:48 +03:00
Jingu Kang	ff68e43c81	[MachineLICM] Handle Subloops It is a re-commit from reverted commit 3454cf67bd0a650097dc6ca99874a34e1d59b500. Following discussion on https://reviews.llvm.org/D154205, make MachineLICM pass handle subloops with only visiting outermost loop's blocks once. Differential Revision: https://reviews.llvm.org/D154205	2023-09-26 14:25:11 +01:00
David Green	54e5de08d4	[ARM][LSR] Exclude uses outside the loop when favoring postinc. (#67090 ) Extra uses for variables outside the loop can mess with the generation of postinc variables. This patch alters the collection of loop invariant fixups in LSR when the target is optimizing for PostInc, to exclude the collection of these extra uses. It is expected that the variable can be rematerialized, which will lead to a more optimal sequence of instructions in the loop.	2023-09-25 10:09:36 +01:00
David Green	22f423aa46	[ARM] Add some extra testing for MVE postinc loops. NFC	2023-09-22 07:08:49 +01:00
Jay Foad	e0919b189b	[CodeGen] Renumber slot indexes before register allocation (#66334 ) RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps. This also means that enabling -early-live-intervals, which runs the SlotIndexes analysis earlier, will not cause large amounts of churn due to different register allocator decisions.	2023-09-19 11:18:12 +01:00
Jay Foad	d8d0588f66	[TwoAddressInstruction] Update LiveIntervals after INSERT_SUBREG with undef read (#66211 ) Update LiveIntervals after rewriting: %reg = INSERT_SUBREG undef %reg, %subreg, subidx to: undef %reg:subidx = COPY %subreg D113044 implemented this for the non-undef case.	2023-09-18 14:51:58 +01:00
Guozhi Wei	cbdccb30c2	[RA] Split a virtual register in cold blocks if it is not assigned preferred physical register If a virtual register is not assigned preferred physical register, it means some COPY instructions will be changed to real register move instructions. In this case we can try to split the virtual register in colder blocks, if success, the original COPY instructions can be deleted, and the new COPY instructions in colder blocks will be generated as register move instructions. It results in fewer dynamic register move instructions executed. The new test case split-reg-with-hint.ll gives an example, the hot path contains 24 instructions without this patch, now it is only 4 instructions with this patch. Differential Revision: https://reviews.llvm.org/D156491	2023-09-15 19:52:50 +00:00
Benjamin Kramer	3454cf67bd	Revert "[MachineLICM] Handle Subloops" This reverts commit 5ec9699c4d1f165364586d825baef434e2c110b4. It accesses MI after it has been hoisted.	2023-09-15 13:20:31 +02:00
Jingu Kang	5ec9699c4d	[MachineLICM] Handle Subloops Following discussion on https://reviews.llvm.org/D154205, make MachineLICM pass handle subloops with only visiting outermost loop's blocks once. Differential Revision: https://reviews.llvm.org/D154205	2023-09-14 18:07:31 +01:00
Simon Pilgrim	e6b85c3027	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case (REAPPLIED) Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Reapplied after reversion at e1e3c75c7dad72 with a tweak to the pseudo-probe-peep.ll test Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 12:33:39 +01:00
Simon Pilgrim	e1e3c75c7d	Revert rG6c56cf71ee82ec3a28e0dfc2b751bd10c16929da "[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case" Need to address a missed test change	2023-09-13 11:27:47 +01:00
Simon Pilgrim	6c56cf71ee	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 11:01:58 +01:00
John Brawn	fae3f9ec4f	[ARM] Fix prologue/epilogue for pacbti-m leaf functions R12 is callee-saved in functions with pacbti-m enabled, but this is done in assignCalleeSavedSpillSlots, meaning that in determineCalleeSaves we have to manually set CanEliminateFrame. This fixes a bug where in leaf functions with no other callee-saved registers the aut instruction wouldn't be emitted and stack offsets of arguments passed on the stack would be incorrect. Differential Revision: https://reviews.llvm.org/D157865	2023-09-04 13:46:01 +01:00
Serguei Katkov	a701b7e368	[CGP] Remove dead PHI nodes before elimination of mostly empty blocks Before elimination of mostly empty blocks it makes sense to remove dead PHI nodes. It opens more opportunities for elimination and also eliminates dead code itself. This change caused many unit tests to fail; I've updated some of them, and for another one I disabled this optimization. The pattern I observed in the tests is that there is an infinite loop without side effects. As a result, after elimination of the dead phi node all other related instructions are also removed and the tests stop checking what is expected. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D158503	2023-08-29 04:35:06 +00:00
Noah Goldstein	7c9fe735d4	[ValueTracking] Strengthen analysis in `computeKnownBits` of phi Use the comparison based analysis to strengthen the standard knownbits analysis rather than choosing either/or. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D157807	2023-08-22 10:59:03 -05:00
Nikita Popov	1c6e6432ca	[SCEVExpander] Fix incorrect reuse of more poisonous instructions (PR63763) SCEVExpander tries to reuse existing instruction with the same SCEV expression. However, doing this replacement blindly is not safe, because the instruction might be more poisonous. What we were already doing is to drop poison-generating flags on the reused instruction. But this is not the only way that more poison can be introduced. The poison-generating flag might not be directly on the reused instruction, or the poison contribution might come from something like 0 * %var, which folds to 0 but can still introduce poison. This patch fixes the issue in a principled way, by determining which values can contribute poison to the SCEV expression, and then checking whether any additional values can contribute poison to the instruction being reused. Poison-generating flags are dropped if doing that enables reuse. This is a pretty big hammer and does cause some regressions in tests, but less than I would have expected. I wasn't able to come up with a less intrusive fix that still satisfies the correctness requirements. Fixes https://github.com/llvm/llvm-project/issues/63763. Fixes https://github.com/llvm/llvm-project/issues/63926. Fixes https://github.com/llvm/llvm-project/issues/64333. Fixes https://github.com/llvm/llvm-project/issues/63727. Differential Revision: https://reviews.llvm.org/D158181	2023-08-22 09:27:07 +02:00
Jay Foad	56d92c1758	[MachineScheduler] Track physical register dependencies per-regunit Change the scheduler's physical register dependency tracking from registers-and-their-aliases to regunits. This has a couple of advantages when subregisters are used: - The dependency tracking is more accurate and creates fewer useless edges in the dependency graph. An AMDGPU example, edited for clarity: SU(0): $vgpr1 = V_MOV_B32 $sgpr0 SU(1): $vgpr1 = V_ADDC_U32 0, $vgpr1 SU(2): $vgpr0_vgpr1 = FLAT_LOAD_DWORDX2 $vgpr0_vgpr1, 0, 0 There is a data dependency on $vgpr1 from SU(0) to SU(1) and from SU(1) to SU(2). But the old dependency tracking code also added a useless edge from SU(0) to SU(2) because it thought that SU(0)'s def of $vgpr1 aliased with SU(2)'s use of $vgpr0_vgpr1. - On targets like AMDGPU that make heavy use of subregisters, each register can have a huge number of aliases - it can be quadratic in the size of the largest defined register tuple. There is a much lower bound on the number of regunits per register, so iterating over regunits is faster than iterating over aliases. The LLVM compile-time tracker shows a tiny overall improvement of 0.03% on X86. I expect a larger compile-time improvement on targets like AMDGPU. Recommit after fixing AggressiveAntiDepBreaker in D156880. Differential Revision: https://reviews.llvm.org/D156552	2023-08-07 15:41:40 +01:00
Simon Tatham	60b98363c7	Retain all jump table range checks when using BTI. This modifies the switch-statement generation in SelectionDAGBuilder, specifically the part that generates case clusters of type CC_JumpTable. A table-based branch of any kind is at risk of being a JOP gadget, if it doesn't range-check the offset into the table. For some types of table branch, such as Arm TBB/TBH, the impact of this is limited because the value loaded from the table is a relative offset of limited size; for others, such as a MOV PC,Rn computed branch into a table of further branch instructions, the gadget is fully general. When compiling for branch-target enforcement via Arm's BTI system, many of these table branch idioms use branch instructions of types that do not require a BTI instruction at the branch destination. This avoids the need to put a BTI at the start of each case handler, reducing the number of available gadgets //with// BTIs (i.e. ones which could be used by a JOP attack in spite of the BTI system). But without a range check, the use of a non-BTI-requiring branch also opens up a larger range of followup gadgets for an attacker's use. A defence against this is to avoid optimising away the range check on the table offset, even if the compiler believes that no out-of-range value should be able to reach the table branch. (Rationale: that may be true for values generated legitimately by the program, but not those generated maliciously by attackers who have already corrupted the control flow.) The effect of keeping the range check and branching to an unreachable block is that no actual code is generated at that block, so it will typically point at the end of the function. That may still cause some kind of unpredictable code execution (such as executing data as code, or falling through to the next function in the code section), but even if so, there will only be //one// possible invalid branch target, rather than giving an attacker the choice of many possibilities. 
This defence is enabled only when branch target enforcement is in use. Without branch target enforcement, the range check is easily bypassed anyway, by branching in to a location just after it. But with enforcement, the attacker will have to enter the jump table dispatcher at the initial BTI and then go through the range check. (Or, if they don't, it's because they //already// have a general BTI-bypassing gadget.) Reviewed By: MaskRay, chill Differential Revision: https://reviews.llvm.org/D155485	2023-07-31 10:39:50 +01:00
Jay Foad	e2e3f06813	Revert "[MachineScheduler] Track physical register dependencies per-regunit" This reverts commit 1a54671d5405a39de362e9692ce963c0638023bc. It was causing lit test failures in a LLVM_ENABLE_EXPENSIVE_CHECKS build.	2023-07-29 18:05:25 +01:00
Jay Foad	1a54671d54	[MachineScheduler] Track physical register dependencies per-regunit Change the scheduler's physical register dependency tracking from registers-and-their-aliases to regunits. This has a couple of advantages when subregisters are used: - The dependency tracking is more accurate and creates fewer useless edges in the dependency graph. An AMDGPU example, edited for clarity: SU(0): $vgpr1 = V_MOV_B32 $sgpr0 SU(1): $vgpr1 = V_ADDC_U32 0, $vgpr1 SU(2): $vgpr0_vgpr1 = FLAT_LOAD_DWORDX2 $vgpr0_vgpr1, 0, 0 There is a data dependency on $vgpr1 from SU(0) to SU(1) and from SU(1) to SU(2). But the old dependency tracking code also added a useless edge from SU(0) to SU(2) because it thought that SU(0)'s def of $vgpr1 aliased with SU(2)'s use of $vgpr0_vgpr1. - On targets like AMDGPU that make heavy use of subregisters, each register can have a huge number of aliases - it can be quadratic in the size of the largest defined register tuple. There is a much lower bound on the number of regunits per register, so iterating over regunits is faster than iterating over aliases. The LLVM compile-time tracker shows a tiny overall improvement of 0.03% on X86. I expect a larger compile-time improvement on targets like AMDGPU. Differential Revision: https://reviews.llvm.org/D156552	2023-07-29 15:34:53 +01:00
Jingu Kang	351b4c17dd	Revert "[MachineLICM] Handle Subloops" This reverts commit 50dd383d08670960540fecb4b48c0f0429fbfba3.	2023-07-20 17:12:25 +01:00
Jingu Kang	50dd383d08	[MachineLICM] Handle Subloops Following discussion on https://reviews.llvm.org/D154205, make MachineLICM pass handle subloops with only visiting outermost loop's blocks once. Differential Revision: https://reviews.llvm.org/D154205	2023-07-20 16:39:13 +01:00


1767 Commits