llvm-project

Author	SHA1	Message	Date
Sergei Barannikov	2ae9a74bf1	[CodeGen] Use `TRI::regunits()` (NFC) (#137356 )	2025-04-26 08:49:17 +03:00
Ulrich Weigand	be7ef6c52b	[MachineLICM] Recognize registers clobbered at EH landing pad entry (#122446 ) EH landing pad entry implicitly clobbers target-specific exception pointer and exception selector registers. The post-RA MachineLICM pass needs to take these into account when deciding whether to hoist an instruction out of the loop that initializes one of these registers. Fixes: https://github.com/llvm/llvm-project/issues/122315	2025-04-25 22:27:27 +02:00
Sergei Barannikov	c3959f22ab	[MachineLICM] Remove CurPreheader parameter that is always nullptr (#135554 ) Also, rename `getCurPreheader` -> `getOrCreatePreheader` to make it clear that this method may alter CFG. Update `Changed` if the method modified a loop by splitting a critical edge (this change is not strictly NFC). The removed parameter was probably intended to save compile time by not trying to split a critical edge after the first attempt has failed, but it is only tried once per loop. PR: https://github.com/llvm/llvm-project/pull/135554	2025-04-15 05:03:33 +03:00
Kazu Hirata	dc5178cc41	[CodeGen] Use llvm::append_range (NFC) (#135567 )	2025-04-13 16:36:03 -07:00
Craig Topper	aaaaa4d256	[MachineLICM] Use Register. NFC	2025-03-02 22:33:26 -08:00
Daniel Paoliello	19032bfe87	[aarch64][win] Update Called Globals info when updating Call Site info (#122762 ) Fixes the "use after poison" issue introduced by #121516 (see <https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>). The root cause of this issue is that #121516 introduced "Called Global" information for call instructions modeling how "Call Site" info is stored in the machine function, HOWEVER it didn't copy the copy/move/erase operations for call site information. The fix is to rename and update the existing copy/move/erase functions so they also take care of Called Global info.	2025-01-13 14:00:31 -08:00
Nikita Popov	eeac0ffaf4	Revert "[MachineLICM] Use `RegisterClassInfo::getRegPressureSetLimit` (#119826 )" This reverts commit b4e17d4a314ed87ff6b40b4b05397d4b25b6636a. This causes a large compile-time regression.	2025-01-10 09:05:06 +01:00
Pengcheng Wang	b4e17d4a31	[MachineLICM] Use `RegisterClassInfo::getRegPressureSetLimit` (#119826 ) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logic to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just as the comment "This limit must be adjusted dynamically for reserved registers" says. Separate from https://github.com/llvm/llvm-project/pull/118787	2025-01-09 21:05:52 +08:00
paperchalice	1562b70eaf	Reapply "[DomTreeUpdater] Move critical edge splitting code to updater" (#119547 ) This relands commit #115111. Use traditional way to update post dominator tree, i.e. break critical edge splitting into insert, insert, delete sequence. When splitting critical edges, the post dominator tree may change its root node, and `setNewRoot` only works in normal dominator tree... See `6c7e5827ed/llvm/include/llvm/Support/GenericDomTree.h (L684-L687)`	2024-12-13 11:43:09 +08:00
paperchalice	553058f825	Revert "[DomTreeUpdater] Move critical edge splitting code to updater" (#119512 ) Reverts llvm/llvm-project#115111 Causes #119511	2024-12-11 14:25:17 +08:00
paperchalice	79047fac65	[DomTreeUpdater] Move critical edge splitting code to updater (#115111 ) Support critical edge splitting in dominator tree updater. Continue the work in #100856. Compile time check: https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&stat=instructions%3Au	2024-12-11 11:31:42 +08:00
Florian Hahn	ef102b4a63	[MachineLICM] Don't allow hoisting invariant loads across mem barrier. (#116987 ) The improvements in 63917e1 / #70796 do not check for memory barriers/unmodelled side effects, which means we may incorrectly hoist loads across memory barriers. Fix this by checking whether any machine instruction in the loop is a load-fold barrier. PR: https://github.com/llvm/llvm-project/pull/116987	2024-11-21 10:25:04 +00:00
abhishek-kaushik22	46f43b6d92	[DebugInfo][InstrRef][MIR][GlobalIsel][MachineLICM] NFC Use std::move to avoid copying (#116935 )	2024-11-21 13:37:56 +05:30
Kazu Hirata	735ab61ac8	[CodeGen] Remove unused includes (NFC) (#115996 ) Identified with misc-include-cleaner.	2024-11-12 23:15:06 -08:00
paperchalice	fe63669282	[Instrumentation] Support `MachineFunction` in `OptNoneInstrumentation` (#115471 ) Support `MachineFunction` in `OptNoneInstrumentation`, also add `isRequired` to all necessary passes.	2024-11-09 16:50:11 +08:00
abhishek-kaushik22	6b64f36536	[NFC] Use `std::move` to avoid copy (#113080 )	2024-11-05 14:42:53 +00:00
Gaëtan Bossu	a0c318938a	[CodeGen][NFC] Properly split MachineLICM and EarlyMachineLICM (#113573 ) Both are based on MachineLICMBase, and the functionality there is "switched" based on a PreRegAlloc flag. This commit is simply about trusting the original value of that flag, defined by the `MachineLICM` and `EarlyMachineLICM` classes. The `PreRegAlloc` flag used to be overwritten based on MRI.isSSA(), which is unreliable due to how it is inferred by the MIRParser. I see that we can now define isSSA in MIR (thanks @gargaroff ), meaning the fix isn't really needed anymore, but redefining that flag still feels wrong. Note that I'm looking into upstreaming more changes to MachineLICM, see [the discourse thread](https://discourse.llvm.org/t/extending-post-regalloc-machinelicm/82725).	2024-10-25 11:19:22 -07:00
Kazu Hirata	db9e1fb3bc	[MachineLICM] Avoid repeated hash lookups (NFC) (#110452 )	2024-09-30 06:49:04 -07:00
Jeremy Morse	056a3f4673	[NFC] Reapply 3f37c517f, SmallDenseMap speedups This time with 100% more building unit tests. Original commit message follows. [NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417) If we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic. Discovered by instrumenting DenseMap with some accounting code, then selecting sites where we'll get the most bang for our buck.	2024-09-26 10:49:29 +01:00
Jeremy Morse	817e742ba5	Revert "[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417 )" This reverts commit 3f37c517fbc40531571f8b9f951a8610b4789cd6. Lo and behold, I missed a unit test	2024-09-25 14:31:30 +01:00
Jeremy Morse	3f37c517fb	[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417 ) If we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic. Discovered by instrumenting DenseMap with some accounting code, then selecting sites where we'll get the most bang for our buck.	2024-09-25 14:22:23 +01:00
Akshat Oke	d2d78e584b	[NewPM][CodeGen] Port MachineLICM to NPM (#107376 )	2024-09-20 11:34:18 +05:30
Pengcheng Wang	ed4e75d5e5	[CodeGen] Remove AA parameter of isSafeToMove (#100691 ) This `AA` parameter is not used and for most uses they just pass a nullptr. The use of `AA` was removed since 8d0383e.	2024-07-26 15:47:47 +08:00
Jon Roelofs	b1f263e4c2	[llvm][MachineLICM] Fix a comment typo. NFC	2024-07-24 13:03:46 -07:00
paperchalice	099899961c	[CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317 ) - Add `MachineBlockFrequencyAnalysis`. - Add `MachineBlockFrequencyPrinterPass`. - Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager. - `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new pass manager migration.	2024-07-12 15:45:01 +08:00
Nikita Popov	6a907699d8	Revert "[CodeGen] Remove `applySplitCriticalEdges` in `MachineDominatorTree` (#97055 )" This reverts commit c5e5088033fed170068d818c54af6862e449b545. Causes large compile-time regressions.	2024-07-11 09:13:37 +02:00
paperchalice	c5e5088033	[CodeGen] Remove `applySplitCriticalEdges` in `MachineDominatorTree` (#97055 ) Summary: - Remove wrappers in `MachineDominatorTree`. - Remove `MachineDominatorTree` update code in `MachineBasicBlock::SplitCriticalEdge`. - Use `MachineDomTreeUpdater` in passes which call `MachineBasicBlock::SplitCriticalEdge` and preserve `MachineDominatorTreeWrapperPass` or CFG analyses. Commit abea99f65a97248974c02a5544eaf25fc4240056 introduced related methods in 2014. Now we have SemiNCA based dominator tree in 2017 and dominator tree updater, the solution adopted here seems a bit outdated.	2024-07-11 11:08:05 +08:00
paperchalice	79d0de2ac3	[CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793 ) - Add `MachineLoopAnalysis`. - Add `MachineLoopPrinterPass`. - Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.	2024-07-09 09:11:18 +08:00
Pierre van Houtryve	f0897ea4bb	[MachineLICM] Work-around Incomplete RegUnits (#95926 ) Reverts the behavior introduced by 770393b while keeping the refactored code. Fixes a miscompile on AArch64, at the cost of a small regression on AMDGPU. #96146 opened to investigate the issue	2024-06-20 10:59:00 +02:00
Jay Foad	457e895479	[CodeGen] Do not include $noreg in any regmask operands. NFCI. (#95775 ) Saying that a call preserves $noreg seems weird and required a workaround in MachineLICM.	2024-06-17 13:42:03 +01:00
Pierre van Houtryve	770393bb99	[MachineLICM] Correctly Apply Register Masks (#95746 ) Fix regression introduced in d4b8b72	2024-06-17 13:42:00 +02:00
Pierre van Houtryve	864981d72b	[NFC][MachineLICM] Use SmallDenseSet instead of SmallSet (#95201 ) All values are small so no reason to ever use SmallSet really. In large programs we'll end up using std::set which is extremely slow compared to DenseSet. This brings a decent speedup to the pass in large programs.	2024-06-12 11:34:54 +02:00
paperchalice	837dc542b1	[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571 ) Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.	2024-06-11 21:27:14 +08:00
Pierre van Houtryve	d4b8b7217f	[CodeGen][MachineLICM] Use RegUnits in HoistRegionPostRA (#94608 ) Those BitVectors get expensive on targets like AMDGPU with thousands of registers, and RegAliasIterator is also expensive. We can move all liveness calculations to use RegUnits instead to speed it up for targets where RegAliasIterator is expensive, like AMDGPU. On targets where RegAliasIterator is cheap, this alternative can be a little more expensive, but I believe the tradeoff is worth it.	2024-06-11 14:27:35 +02:00
Pengcheng Wang	4e0bd3fab4	[MachineLICM] Hoist copies of constant physical register (#93285 ) Previously, we just check if the source is a virtual register and this prevents some potential hoists. We can see some improvements in AArch64/RISCV tests.	2024-05-29 14:10:01 +08:00
Matt Arsenault	39e24bdd8e	MachineLICM: Allow hoisting REG_SEQUENCE (#90638 )	2024-05-01 16:52:04 +02:00
Matt Arsenault	114a59d4d3	MachineLICM: Remove unnecessary isReg checks COPY operands are always registers.	2024-04-30 17:44:45 +02:00
michaelselehov	56ad6d1939	[MachineLICM] Hoist COPY instruction only when user can be hoisted (#81735 ) befa925acac8fd6a9266e introduced preliminary hoisting of COPY instructions when the user of the COPY is inside the same loop. That optimization appeared to be too aggressive and hoisted too many COPY's greatly increasing register pressure causing performance regressions for AMDGPU target. This is intended to fix the regression by hoisting COPY instruction only if either: - User of COPY can be hoisted (other args are invariant) or - Hoisting COPY doesn't bring high register pressure	2024-02-27 12:31:29 +00:00
Igor Kirillov	839abdb0d2	[MachineLICM] Fix incorrect CSE on hoisted const load (#73007 ) When hoisting an invariant load, we should not combine it with an existing load through common subexpression elimination (CSE). This is because there might be memory-changing instructions between the existing load and the end of the block entering the loop. Fixes https://github.com/llvm/llvm-project/issues/72855	2023-11-27 14:37:18 +00:00
Rin	befa925aca	[MachineLICM][AArch64] Hoist COPY instructions with other uses in the loop (#71403 ) When there is a COPY instruction in the loop with other uses, we want to hoist the COPY, which in turn leads to the users being hoisted as well. Co-authored-by David Green : David.Green@arm.com	2023-11-20 10:01:04 +00:00
Igor Kirillov	63917e1975	[MachineLICM] Allow hoisting loads from invariant address (#70796 ) Sometimes, loads can appear in a loop after the LICM pass is executed the final time. For example, ExpandMemCmp pass creates loads in a loop, and one of the operands may be an invariant address. This patch extends the pre-regalloc stage MachineLICM by allowing to hoist invariant loads from loops that don't have any stores or calls and allows load reorderings.	2023-11-16 11:12:10 +00:00
Hendrik Greving	2600aaab21	Revert "[MachineLICM] Relax overlay conservative PHI check (#67186 )" (#68580 ) This reverts commit 71a8d2e3064fcb3ff76565e6e8529613f90aa51b.	2023-10-09 05:26:58 -07:00
Hendrik Greving	71a8d2e306	[MachineLICM] Relax overlay conservative PHI check (#67186 ) Skip LICM if PHI belongs to the current loop, e.g. is in the loop's header. This prevents LICM from bailing for CFGs like L1: R = LoopInvariant // can be LICM'd BR L1 L2: PHI(R, ..) BR L2	2023-10-09 04:49:11 -07:00
Karl-Johan Karlsson	fa3a685926	[MachineLICM] Clear subregister kill flags (#67240 ) When hoisting a loop invariant instruction the resulting register must be live in all the basic blocks of the loop body and the killed flags of the register must be cleared. Before this patch, killed flags of subregisters of a hoisted superregister were not cleared in the loop body. This was found in an out of tree target, but the testcase mlicm-stack-write-check.mir was modified to trigger the case.	2023-09-28 07:26:39 +02:00
Jingu Kang	ff68e43c81	[MachineLICM] Handle Subloops It is a re-commit from reverted commit 3454cf67bd0a650097dc6ca99874a34e1d59b500. Following discussion on https://reviews.llvm.org/D154205, make MachineLICM pass handle subloops with only visiting outermost loop's blocks once. Differential Revision: https://reviews.llvm.org/D154205	2023-09-26 14:25:11 +01:00
Benjamin Kramer	3454cf67bd	Revert "[MachineLICM] Handle Subloops" This reverts commit 5ec9699c4d1f165364586d825baef434e2c110b4. It accesses MI after it has been hoisted.	2023-09-15 13:20:31 +02:00
Jingu Kang	5ec9699c4d	[MachineLICM] Handle Subloops Following discussion on https://reviews.llvm.org/D154205, make MachineLICM pass handle subloops with only visiting outermost loop's blocks once. Differential Revision: https://reviews.llvm.org/D154205	2023-09-14 18:07:31 +01:00
Karl-Johan Johnsson	917574d5d8	[MachineLICM][WinEH] Don't hoist register reloads out of funclets This fixes https://github.com/llvm/llvm-project/issues/60766 With MSVC style exception-handling (funclets), no registers are alive when entering the funclet so they must be reloaded from the stack. MachineLICM can sometimes hoist such reloads out of the funclet, which is not correct: the register will have been clobbered when entering the funclet. This can happen in any loop that contains a try-catch. This has been tested on x86_64-pc-windows-msvc. I'm not sure if funclets work the same on the other windows archs. Reviewed By: rnk, arsenm Differential Revision: https://reviews.llvm.org/D153337	2023-08-13 23:58:16 +03:00
Jay Foad	11fbdd27fd	[CodeGen] Make use of isSubRegisterEq and isSuperRegisterEq. NFC.	2023-08-01 14:46:26 +01:00
Jingu Kang	351b4c17dd	Revert "[MachineLICM] Handle Subloops" This reverts commit 50dd383d08670960540fecb4b48c0f0429fbfba3.	2023-07-20 17:12:25 +01:00

323 Commits