llvm-project

Author	SHA1	Message	Date
Robert Imschweiler	a8ea7f4580	Reapply: [AMDGPU][UnifyDivergentExitNodes][StructurizeCFG] Add support for callbr instruction with inline-asm (#152161 ) (#166195 ) Reapply #152161 with fixed 'changed' flags.	2025-11-03 20:59:48 +01:00
Robert Imschweiler	af68efc9c4	Revert "[AMDGPU][UnifyDivergentExitNodes][StructurizeCFG] Add support for callbr instruction with inline-asm" (#166186 ) Reverts llvm/llvm-project#152161 Need to revert to fix changed logic for the expensive checks.	2025-11-03 16:33:20 +00:00
Robert Imschweiler	332f9b5eee	[AMDGPU][UnifyDivergentExitNodes][StructurizeCFG] Add support for callbr instruction with inline-asm (#152161 ) Finishes adding inline-asm callbr support for AMDGPU, started by https://github.com/llvm/llvm-project/pull/149308.	2025-11-03 16:09:12 +01:00
Kazu Hirata	707bab651f	[llvm] Remove redundant typename (NFC) (#166087 ) Identified with readability-redundant-typename.	2025-11-02 13:15:16 -08:00
Valery Pykhtin	fc2afbda36	[AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (#150937 ) SSAUpdaterBulk replaces legacy SSAUpdater.	2025-10-14 20:07:10 +02:00
Vigneshwar Jayakumar	d9fa0de5c8	[StructurizeCFG] bug fix in zero cost hoist (#157969 ) This fixes a bug where zero cost instruction was hoisted to nearest common dominator but the hoisted instruction's operands didn't dominate the common dominator causing poison values.	2025-09-15 14:29:32 -05:00
Vigneshwar Jayakumar	df96e09c1e	[StructurizeCFG] nested-if zerocost hoist bugfix (#155408 ) When zero cost instructions are hoisted, the simplifyHoistedPhi function was setting incoming phi values which were not dominating the use causing runtime failure. This was set to poison by rebuildSSA function. This commit fixes the issue.	2025-08-28 09:04:14 -05:00
Kazu Hirata	11b4f110e0	[llvm] Remove unused includes of SmallSet.h (NFC) (#154893) We just replaced SmallSet<T *, N> with SmallPtrSet<T *, N>, bypassing the redirection found in SmallSet.h. With that, we no longer need to include SmallSet.h in many files.	2025-08-22 10:33:46 -07:00
Kazu Hirata	07eb7b7692	[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068) This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.	2025-08-18 07:01:29 -07:00
Vigneshwar Jayakumar	56ae79a6ab	reland "[StructurizeCFG] Hoist and simplify zero-cost incoming else phi values (#139605)" (#149744) This relands commit b11523b494b with the fix for llvm-buildbot failures "clang-hip-vega20" and "openmp-offload-amdgpu-runtime-2". The reland prevents hoisting the phi node which fixes the issue. Original PR description: The order of if and else blocks can introduce unnecessary VGPR copies. Consider the case of an if-else block where the incoming phi from the 'Else block' only contains zero-cost instructions, and the 'Then' block modifies some value. There would be no interference when coalescing because only one value is live at any point before structurization. However, in the structurized CFG, the Then value is live at 'Else' block due to the path if→flow→else, leading to additional VGPR copies. This patch addresses the issue by: - Identifying PHI nodes with zero-cost incoming values from the Else block and hoisting those values to the nearest common dominator of the Then and Else blocks. - Updating Flow PHI nodes by replacing poison entries (on the if→flow edge) with the correct hoisted values.	2025-07-25 15:23:45 -05:00
Vigneshwar Jayakumar	25c3f64105	Revert "[StructurizeCFG] Hoist and simplify zero-cost incoming else phi values" (#148016 ) reverting to fix Buildbot failures.	2025-07-10 13:06:38 -05:00
Vigneshwar Jayakumar	8d3f497eb8	[StructurizeCFG] Hoist and simplify zero-cost incoming else phi values (#139605 ) The order of if and else blocks can introduce unnecessary VGPR copies. Consider the case of an if-else block where the incoming phi from the 'Else block' only contains zero-cost instructions, and the 'Then' block modifies some value. There would be no interference when coalescing because only one value is live at any point before structurization. However, in the structurized CFG, the Then value is live at 'Else' block due to the path if→flow→else, leading to additional VGPR copies. This patch addresses the issue by: - Identifying PHI nodes with zero-cost incoming values from the Else block and hoisting those values to the nearest common dominator of the Then and Else blocks. - Updating Flow PHI nodes by replacing poison entries (on the if→flow edge) with the correct hoisted values.	2025-07-10 12:03:04 -05:00
Emma Pilkington	7babf22461	[StructurizeCFG] Stop setting DebugLocs in flow blocks (#139088 ) Flow blocks are generated code that don't really correspond to any location in the source, so principally they should have empty DebugLocs. Practically, setting these debug locs leads to redundant is_stmts being generated after #108251, causing stepping test failures in the ROCm GDB test suite. Fixes SWDEV-502134	2025-05-09 14:22:14 -04:00
Shilei Tian	fc55ad4ceb	Revert "[StructurizeCFG] Refactor insertConditions. NFC. (#115476 )" (#136370 )	2025-04-19 08:04:48 -04:00
Kazu Hirata	d8b078d550	[Transforms] Use llvm::append_range (NFC) (#133607 )	2025-03-29 18:57:50 -07:00
Kazu Hirata	673f4705a8	[llvm] Use Set::insert_range (NFC) (#133353) We can use Set::insert_range to collapse: for (auto Elem : Range) Set.insert(Elem.first); down to: Set.insert_range(llvm::make_first_range(Range)); In some cases, we can further fold that into the set declaration.	2025-03-27 20:44:20 -07:00
Valery Pykhtin	bb2e85f12f	[AMDGPU] Improve StructurizeCFG pass performance: avoid redundant DebugLoc map initialization. NFC. (#130568 ) Previously, the TermDL (BB terminator → DebugLoc) map was initialized at the start of processing each function's region, creating entries for the entire function. This could be inefficient for large functions. This patch improves performance by creating map entries only when needed—when a terminator is being killed or when a flow block is created. Additionally, entries are removed immediately after use, preventing unnecessary map growth and ensuring DebugLocs are not "retracked." A mapless variant was also explored, but due to limited familiarity with the structurizer, it was not pursued further. In my cases, this change improves performance by 2-3×.	2025-03-11 07:19:53 +01:00
Matt Arsenault	2bada417c1	StructurizeCFG: Use poison instead of undef (#130459 ) There are a surprising number of codegen changes from this.	2025-03-10 22:29:15 +07:00
Kazu Hirata	7c7cebf49d	[Scalar] Avoid repeated hash lookups (NFC) (#130463 )	2025-03-09 00:48:17 -08:00
Pedro Lobo	9865296343	[StructurizeCFG] Use `poison` instead of `undef` as placeholder [NFC] (#119137 )	2024-12-10 15:03:44 +00:00
Jay Foad	231e63d816	[StructurizeCFG] Refactor insertConditions. NFC. (#115476 ) This just makes it more obvious that having Parent as the single predecessor is a special case, instead of checking for it in the middle of a loop that finds the nearest common dominator of multiple predecessors.	2024-11-26 09:40:33 +00:00
Jay Foad	b535e4ecac	[StructurizeCFG] Remove one SSAUpdater::AddAvailableValue. NFCI. (#115472 )	2024-11-08 17:20:29 +00:00
Jay Foad	107af4a62e	[StructurizeCFG] Introduce struct PredInfo. NFC. (#115457 ) This just provides a neater encapsulation of the info about the predicate for an edge, rather than ValueWeightPair aka std::pair.	2024-11-08 14:26:29 +00:00
Kazu Hirata	94f9cbbe49	[Scalar] Remove unused includes (NFC) (#114645 ) Identified with misc-include-cleaner.	2024-11-02 08:32:26 -07:00
Ruiling, Song	54d31bde32	Reapply "StructurizeCFG: Optimize phi insertion during ssa reconstruction (#101301 )" (#114347 ) This reverts commit be40c723ce2b7bf2690d22039d74d21b2bd5b7cf.	2024-11-01 08:29:59 +08:00
Juan Manuel Martinez Caamaño	b40ff5ac2d	[AMDGPU][StructurizeCFG] Maintain branch MD_prof metadata (#109813) Currently `StructurizeCFG` drops branch_weight metadata. This metadata can be generated from user annotations in the source code like:
```cpp
if (...) [[likely]] {
}
```	2024-09-25 13:15:23 +02:00
Kazu Hirata	a2f659c134	[StructurizeCFG] Avoid repeated hash lookups (NFC) (#107797 )	2024-09-09 07:15:12 -07:00
Matt Arsenault	f86da4cb7d	StructurizeCFG: Add SkipUniformRegions pass parameter to new PM version (#102812 ) Keep respecting the old cl::opt for now.	2024-08-12 15:13:15 +04:00
Yaxun (Sam) Liu	be40c723ce	Revert "StructurizeCFG: Optimize phi insertion during ssa reconstruction (#101301 )" This reverts commit c62e2a2a4ed69d53a3c6ca5c24ee8d2504d6ba2b. Since it caused regression in HIP buildbot: https://lab.llvm.org/buildbot/#/builders/123/builds/3282	2024-08-08 11:59:39 -04:00
Ruiling, Song	c62e2a2a4e	StructurizeCFG: Optimize phi insertion during ssa reconstruction (#101301) After investigating more while-break cases, I think we should try to optimize the way we reconstruct phi nodes. Previously, we reconstructed each phi node separately, but this is not optimal. For example:
```
header:
  %v.1 = phi float [ %v, %entry ], [ %v.2, %latch ]
  br i1 %cc, label %if, label %latch
if:
  %v.if = fadd float %v.1, 1.0
  br i1 %cc2, label %latch, label %exit
latch:
  %v.2 = phi float [ %v.if, %if ], [ %v.1, %header ]
  br i1 %cc3, label %exit, label %header
exit:
  %v.3 = phi float [ %v.2, %latch ], [ %v.if, %if ]
```
For this case, we have different copies of value `v`, but there is at most one copy of value `v` alive at any program point shown above. The existing ssa reconstruction will use the incoming values from the old deleted phi. Below is a possible output after ssa reconstruction.
```
header:
  %v.1 = phi float [ %v, %entry ], [ %v.loop, %Flow1 ]
  br i1 %cc, label %if, label %flow
if:
  %v.if = fadd float %v.1, 1.0
  br label %flow
flow:
  %v.exit.if = phi float [ %v.if, %if ], [ undef, %header ]
  %v.latch = phi float [ %v.if, %if ], [ %v.1, %header ]
latch:
  br label %flow1
flow1:
  %v.loop = phi float [ %v.latch, %latch ], [ undef, %Flow ]
  %v.exit = phi float [ %v.latch, %latch ], [ %v.exit.if, %Flow ]
exit:
  %v.3 = phi float [ %v.exit, %flow1 ]
```
If we look closely, in order to reconstruct `v.1`, `v.2` and `v.3`, we have two simultaneous copies of `v` alive at `flow` and `flow1`. We depend heavily on the register coalescer to coalesce them together. But the register coalescer may not always be able to coalesce them because of the complexity of the phi chain. On the other hand, given that only one copy of `v` is alive at any program point before the transform, why not simplify the phi network as much as we can?
Look at the incoming values of these PHIs:
```
       header  if     latch
v.1:   --      --     v.2
v.2:   v.1     v.if   --
v.3:   --      v.if   v.2
```
If we let them share the same incoming values for these three different incoming blocks, then we would have only one copy of `v` alive at any program point after ssa reconstruction. Something like:
```
header:
  %v.1 = phi float [ %v, %entry ], [ %v.2, %Flow1 ]
  br i1 %cc, label %if, label %flow
if:
  %v.if = fadd float %v.1, 1.0
  br label %flow
flow:
  %v.2 = phi float [ %v.if, %if ], [ %v.1, %header ]
latch:
  br label %flow1
flow1:
  ...
exit:
  %v.3 = phi float [ %v.2, %flow1 ]
```	2024-08-08 14:47:49 +08:00
Nikita Popov	9df71d7673	[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919 ) Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.	2024-06-28 08:36:49 +02:00
Ruiling, Song	ac24238002	[LowerSwitch] Don't let pass manager handle the dependency (#68662) Some passes have the limitation that they only support simple terminators: branch/unreachable/return. Right now, they ask the pass manager to add the LowerSwitch pass to eliminate `switch`. Let's manage this kind of pass dependency ourselves. Also add the assertion in the related passes.	2023-10-25 09:24:36 +08:00
Nuno Lopes	29d0b60430	[StructurizeCFG] Use poison instead of undef as placeholder [NFC] These are used to create branch instructions. The condition is patched later	2023-07-22 13:23:39 +01:00
pvanhout	e4ea2d5919	[StructurizeCFG] Correctly depend on UniformityAnalysis Small oversight in https://reviews.llvm.org/D145688 - the pass' dependency was not updated to reflect the change to UA. Also, change DivergenceAnalysis to UniformityAnalysis in a comment. That way, StructurizeCFG only refers to UA and not DA anymore.	2023-03-14 11:25:22 +01:00
pvanhout	240e2cba67	[StructurizeCFG] Use UniformityAnalysis instead of DivergenceAnalysis Depends on D145572 Reviewed By: foad, sameerds Differential Revision: https://reviews.llvm.org/D145688	2023-03-13 08:31:20 +01:00
Juan Manuel MARTINEZ CAAMAÑO	96ad51e3eb	[StructurizeCFG][DebugInfo] Avoid use-after-free Reviewed By: dstuttard Differential Revision: https://reviews.llvm.org/D137408	2022-11-04 13:39:49 +00:00
Juan Manuel MARTINEZ CAAMAÑO	256f8b06c6	[StructurizeCFG][DebugInfo] Maintain DILocations in the branches created by StructurizeCFG Make StructurizeCFG preserve the debug locations of the branch instructions it introduces. Differential Revision: https://reviews.llvm.org/D135967	2022-10-28 02:51:02 -05:00
Juan Manuel MARTINEZ CAAMAÑO	e9716c64ec	[StructurizeCFG] Remove impossible case and replace by assert In addition, replace outdated XFAIL test by a new one. Differential Revision: https://reviews.llvm.org/D134439	2022-09-29 08:27:49 +00:00
Ruiling Song	a5676a3a7e	StructurizeCFG: Set Undef for non-predecessors in setPhiValues() During the structurization process, we may place non-predecessor blocks between the predecessors of a block in the structurized CFG. Take the typical while-break case as an example:
```
 /---A(v=...)
 |  / \
 ^ B   C
 |  \ /|
 \---L |
     \ /
      E (r = phi (v:C)...)
```
After structurization, the CFG would look like:
```
 /---A
 |   |\
 |   | C
 |   |/
 |   F1
 ^   |\
 |   | B
 |   |/
 |   F2
 |   |\
 |   | L
 \   |/
  \--F3
     |
     E
```
We can see that block B is placed between the predecessors (C/L) of E. During phi reconstruction, to achieve the same semantics as before, we are reconstructing the PHIs as:
F1: v1 = phi (v:C), (undef:A)
F3: r = phi (v1:F2), ...
But this is also saying that `v1` would be live through B, which is not really necessary. The idea in the change is to say the incoming value from B is Undef for the PHI in E. With this change, the reconstructed PHI would be:
F1: v1 = phi (v:C), (undef:A)
F2: v2 = phi (v1:F1), (undef:B)
F3: r = phi (v2:F2), ...
Reviewed by: sameerds Differential Revision: https://reviews.llvm.org/D132450	2022-09-26 09:54:47 +08:00
Ruiling Song	40e9284f3c	StructurizeCFG: prefer reduced number of live values The instruction simplification will try to simplify the affected phis. In some cases, this might extend the liveness of values. For example:
BB0:
 | \
 | BB1
 | /
BB2:phi (BB0, v), (BB1, undef)
The phi in BB2 will be simplified to v as v dominates BB2, but this increases the number of active values in BB1. By setting CanUseUndef to false, we will not simplify the phi in this way, which helps register pressure. This is mandatory for the later change to help reduce VGPR pressure for AMDGPU. Reviewed by: foad, sameerds Differential Revision: https://reviews.llvm.org/D132449	2022-09-26 09:54:47 +08:00
Kazu Hirata	6b1bc80188	[Scalar] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-08-20 21:18:25 -07:00
Kazu Hirata	e20d210eef	[llvm] Qualify auto (NFC) Identified with readability-qualified-auto.	2022-08-07 23:55:27 -07:00
Brendon Cahoon	c945d88d2b	Revert "[StructurizeCFG] Improve basic block ordering" This reverts commit f1b05a0a2bbbea160002be709f8a1c59de366761. Need to revert due to issues identified with testing. The transformation is incorrect for blocks that contain convergent instructions.	2022-07-14 09:40:51 -05:00
Brendon Cahoon	f1b05a0a2b	[StructurizeCFG] Improve basic block ordering StructurizeCFG linearizes the successors of branching basic blocks by adding Flow blocks to record the true/false path for branches and back edges. This patch reduces the number of Phi values needed to capture the control flow path by improving the basic block ordering. Previously, StructurizeCFG added loop exit blocks outside of the loop. StructurizeCFG sets a boolean value to indicate the path taken, and all exit block live values extend to after the loop. For loops with a large number of exit blocks, this creates a huge number of values that are maintained, which increases compilation time and register pressure. This is a problem especially with ASAN, which adds early exits to blocks with unreachable instructions for each instrumented check in the loop. In specific cases, this patch reduces the number of values needed after the loop by moving the exit block into the loop. This is done for blocks that have a single predecessor and single successor by moving the block to appear just after the predecessor. Differential Revision: https://reviews.llvm.org/D123231	2022-06-22 16:10:41 -05:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
serge-sans-paille	59630917d6	Cleanup includes: Transform/Scalar Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817	2022-03-03 07:56:34 +01:00
Jay Foad	0e74d75a29	[StructurizeCFG] Fix boolean not bug D118623 added code to fold not-of-compare into a compare with the inverted predicate, if the compare had no other uses. This relies on accurate use lists in the IR but it was run before setPhiValues, when some phi inputs are still stored in a data structure on the side, instead of being real uses in the IR. The effect was that a phi that should be using the original compare result would now get an inverted result instead. Fix this by moving simplifyConditions after setPhiValues. Differential Revision: https://reviews.llvm.org/D120312	2022-02-22 17:36:20 +00:00
Jay Foad	d2e5d3512b	[StructurizeCFG] Clean up some boolean not instructions In some cases StructurizeCFG inserts i1 xor instructions to invert predicates. Add a quick loop to clean these up afterwards if we can get away with modifying an existing compare instruction instead. (StructurizeCFG is generally run late in the pipeline so instcombine does not clean them up for us.) Differential Revision: https://reviews.llvm.org/D118623	2022-02-01 09:35:37 +00:00
Kazu Hirata	5fc9e30985	[Scalar] Use range-based for loops (NFC)	2021-02-25 19:54:38 -08:00
Kazu Hirata	fb74e1e78a	[Transforms/Scalar] Use range-based for loops (NFC)	2021-02-04 21:18:05 -08:00


136 Commits