Updates SimplifyCFG to avoid jump threading through loop headers if
-keep-loops is requested. Canonical loop form requires a loop header
that dominates all blocks in the loop. If we thread through a header, we
risk breaking its domination of the loop. This change avoids this issue
by conservatively avoiding threading through headers entirely.
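For illustration, here is a minimal hypothetical loop (not reduced from the issue) where the header's branch could be threaded:
```llvm
define void @sketch(i1 %c) {
entry:
  br label %header

header:                                       ; loop header
  %p = phi i1 [ %c, %entry ], [ true, %body ]
  br i1 %p, label %body, label %exit

body:
  ; %p is known true on this edge, so jump threading could send the
  ; back edge straight to %body, leaving %header no longer dominating
  ; the loop; with -keep-loops we now skip threading through %header
  br label %header

exit:
  ret void
}
```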
Fixes: https://github.com/llvm/llvm-project/issues/151144
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
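A sketch of the resulting IR shape (illustrative, not taken from the patch):
```llvm
define void @sketch() {
  %buf = alloca [16 x i8]
  ; previously the first operand restated the alloca's size:
  ;   call void @llvm.lifetime.start.p0(i64 16, ptr %buf)
  ; now the size is implied by the alloca operand itself:
  call void @llvm.lifetime.start.p0(ptr %buf)
  call void @llvm.lifetime.end.p0(ptr %buf)
  ret void
}

declare void @llvm.lifetime.start.p0(ptr)
declare void @llvm.lifetime.end.p0(ptr)
```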
Extend jump-threading to allow local defs that are live outside of the
threaded block. Allow threading to destinations where the local defs are
not live.
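A hypothetical reduction of the case this enables (all names invented for illustration):
```llvm
define i32 @sketch(i1 %c, i32 %x) {
entry:
  br i1 %c, label %bb, label %other

other:
  br label %bb

bb:                        ; block being threaded
  %p = phi i1 [ true, %entry ], [ false, %other ]
  %v = add i32 %x, 1       ; local def, live only in %use
  br i1 %p, label %use, label %nouse

use:
  ret i32 %v

nouse:
  ret i32 0
}
```
From %other the phi is known false, and %v is not live in %nouse, so that edge can now be threaded even though %v is live outside %bb.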
---------
Signed-off-by: John Lu <John.Lu@amd.com>
Currently we always produce a cost of 1 for all CostKinds that are not
RecipThroughput, which can underestimate the cost if the type has a
higher legalization cost (like larger vectors). This relaxes it to cover
all cost kinds.
Fixes #141753.
This patch introduces a new check, that tries to decide if the
conjunction of all the values uniquely identify the accepted values by
the switch.
We should be able to allow the `simplifySwitchOfPowersOfTwo` transform
to take place, as on recent X86 targets the weighted latency-size
appears to be 2. This favours computing trailing zeros and indexing
into a smaller value table over generating a jump table with an
indirect branch, which overall should be more efficient.
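Conceptually, the transform turns a switch over power-of-two case values into a switch over their exponents (a schematic sketch; the real transform also guards that the operand is a power of two):
```llvm
define void @before(i32 %x) {
entry:
  switch i32 %x, label %exit [
    i32 1, label %a
    i32 8, label %b
  ]
a:
  br label %exit
b:
  br label %exit
exit:
  ret void
}

define void @after(i32 %x) {
entry:
  %tz = call i32 @llvm.cttz.i32(i32 %x, i1 true)
  switch i32 %tz, label %exit [
    i32 0, label %a
    i32 3, label %b
  ]
a:
  br label %exit
b:
  br label %exit
exit:
  ret void
}

declare i32 @llvm.cttz.i32(i32, i1)
```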
Just for consistency, to avoid confusing conditions.
`reverse` helps to avoid test updates, as nothing changes for a
successor count <= 2.
For #142461
An existing transformation replaces invoke instructions with a call to
the invoked function and a branch to the destination; when this happens,
we propagate the invoke's source location to the call but not to the
branch. This patch updates this behaviour to propagate to the branch as
well.
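Sketched in IR (!10 stands in for some source location; schematic fragments, not a full module):
```llvm
; before: only the invoke carries the location
invoke void @g() to label %cont unwind label %lpad, !dbg !10

; after this patch, both replacement instructions inherit it
call void @g(), !dbg !10
br label %cont, !dbg !10
```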
Found using https://github.com/llvm/llvm-project/pull/107279.
Preserve branch weight metadata when merging instructions if one of the
instructions is missing metadata. This is similar in behaviour to what
we do today for other types of metadata such as mmra, memprof and
callsite metadata.
Also add a legality check when merging prof metadata based on
instruction type. Without this check, GVN PRE optimizations result in
prof metadata on phi nodes, which breaks the module verifier.
Build failure caught by
https://lab.llvm.org/buildbot/#/builders/113/builds/6621
```
!9185 = !{!"branch_weights", i32 3912, i32 802}
Wrong number of operands
!9185 = !{!"branch_weights", i32 3912, i32 802}
fatal error: error in backend: Broken module found, compilation aborted!
```
Reverts #134200 with additional changes.
Preserve branch weight metadata when merging instructions if one of the
instructions is missing metadata. This is similar in behaviour to what
we do today for other types of metadata such as mmra, memprof and
callsite metadata.
This is to fix a bug when a target only supports conditional faulting
loads; see the test case hoist_store_without_cstore.
Split `-simplifycfg-hoist-loads-stores-with-cond-faulting` into
`-simplifycfg-hoist-loads-with-cond-faulting` and
`-simplifycfg-hoist-stores-with-cond-faulting` to control conditional
faulting load and store respectively.
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.
Nowadays, the non-intrinsic format is the default and has been on for
more than a year, so there's no need for this flag to exist.
(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
Closes #115683.
An overflow arithmetic instruction plus an extractvalue are usually
generated when a division is being replaced, but the zero check may
still be there. In that case, hoist these two instructions out of the
basic block and let later optimizations take care of the unnecessary
zero check.
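A sketch of the shape this targets (hypothetical IR): hoisting %mul and %ov into %entry leaves %bb empty, after which the zero check can be folded.
```llvm
define i1 @sketch(i32 %x, i32 %y) {
entry:
  %is0 = icmp eq i32 %y, 0
  br i1 %is0, label %end, label %bb

bb:
  ; these two instructions are hoisted into %entry
  %mul = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %x, i32 %y)
  %ov = extractvalue { i32, i1 } %mul, 1
  br label %end

end:
  %r = phi i1 [ false, %entry ], [ %ov, %bb ]
  ret i1 %r
}

declare { i32, i1 } @llvm.umul.with.overflow.i32(i32, i32)
```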
The llvm.fake.use intrinsic is used to prevent certain values from being
optimized out for the benefit of debug info; it is not, however, a debug
or pseudo instruction itself and must not be treated as one, since its
purpose is to act like a normal instruction. In the original commit that
added it, however, the IR intrinsic was treated as one in
`getPrevNonDebugInstruction` (but _not_ in `getNextNonDebugInstruction`,
or in the MIR equivalents). This patch correctly treats it as a
non-debug instruction.
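For reference, the intrinsic looks like an ordinary call (sketch):
```llvm
define void @sketch(i32 %a, i32 %b) {
  %v = add i32 %a, %b
  ; keeps %v live for the debugger's benefit, but is a normal
  ; instruction that getPrevNonDebugInstruction must not skip over
  call void (...) @llvm.fake.use(i32 %v)
  ret void
}

declare void @llvm.fake.use(...)
```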
Update LangRef and code using `Dereferenceable` in assume bundles to
only use the information if it is safe at the point of use.
`Dereferenceable` in an assume bundle is only guaranteed at the point of
the assumption, but may not be guaranteed at later points, because the
pointer may have been freed.
Update code using `Dereferenceable` to only use it if the pointer cannot
be freed. This can further be refined to check if the pointer could be
freed between assume and use.
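For example (a sketch; @may_free is a hypothetical function that might free its argument):
```llvm
declare void @llvm.assume(i1)
declare void @may_free(ptr)

define void @sketch(ptr %p) {
  ; dereferenceability of %p is guaranteed only at this point...
  call void @llvm.assume(i1 true) [ "dereferenceable"(ptr %p, i64 16) ]
  call void @may_free(ptr %p)
  ; ...so a later load of %p must not be speculated based on the bundle
  ret void
}
```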
This follows up on https://github.com/llvm/llvm-project/pull/123196.
With that change, it should be safe to expose dereferenceable
assumptions more widely as in
https://github.com/llvm/llvm-project/pull/121789
PR: https://github.com/llvm/llvm-project/pull/126117
This patch fixes a couple of places where memprof-related metadata
(!memprof and !callsite) were being dropped, and one place where PGO
metadata (!prof) was being dropped.
All were due to instances of combineMetadata() being invoked. That
function drops all metadata not in the list provided by the client, and
also drops any not in its switch statement.
Memprof metadata needed a case in the combineMetadata switch statement.
For now we simply keep the metadata of the instruction being kept, which
doesn't retain all the profile information when two calls with
memprof metadata are being combined, but at least retains some.
For the memprof metadata being dropped during call CSE, add memprof and
callsite metadata to the list of known ids in combineMetadataForCSE.
Neither memprof nor regular prof metadata were in the list of known ids
for the callsite in MemCpyOptimizer, which was added to combine AA
metadata after optimization of byval arguments fed by memcpy
instructions, and similar types of optimizations of memcpy uses.
There is one other callsite of combineMetadata, but it is only invoked
on load instructions, which do not carry these types of metadata.
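Schematically (metadata node bodies elided; @alloc is a stand-in allocation function):
```llvm
; two equivalent calls that GVN may CSE
%p1 = call ptr @alloc(i64 8), !memprof !0, !callsite !1
; ...
%p2 = call ptr @alloc(i64 8), !memprof !2, !callsite !3
; when %p2 is replaced by %p1, the kept call's !memprof/!callsite are
; now preserved instead of being dropped by combineMetadataForCSE
```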
As discussed in #94468, this causes unreachable switch lookup table
entries to be poison instead of filling them with a value from one of
the reachable cases.
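For example, a table whose third slot corresponds to an unreachable case now conceptually looks like this (the `@switch.table` name and values are illustrative):
```llvm
@switch.table = private unnamed_addr constant [4 x i32] [i32 10, i32 20, i32 poison, i32 40]
```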
---------
Co-authored-by: DianQK <dianqk@dianqk.net>
This PR is motivated by a mismatch we discovered between compilation
results with vs. without `-g3`. We noticed this when compiling SPEC2017
testcases. The specific instance we saw is fixed in this PR by modifying
a guard (see below), but it is likely similar instances exist elsewhere
in the codebase.
The specific case fixed in this PR manifests itself in the `SimplifyCFG`
pass doing different things depending on whether DebugInfo is generated
or not. At the end of this comment, there is reduced example code that
shows the behavior in question.
The differing behavior has two root causes:
1. Commit https://github.com/llvm/llvm-project/commit/c07e19b adds loop
metadata including debug locations to loops that otherwise would not
have loop metadata
2. Commit https://github.com/llvm/llvm-project/commit/ac28efa6c100 adds
a guard to a simplification action in `SimplifyCFG` that prevents it
from simplifying away loop metadata
So, the change in 2. does not consider that, when compiling with debug
symbols, loops that otherwise would not have metadata needing
preservation now carry debug locations in their loop metadata. Thus,
with `-g3`, `SimplifyCFG` behaves differently than without it.
The larger issue is that while debug info is not supposed to influence
the final compilation result, commits like 1. blur the line between what
is and is not debug info, and not all optimization passes account for
this.
This PR does not address that and rather just modifies this particular
guard in order to restore equivalent behavior between debug and
non-debug builds in this one instance.
---
Here is a reduced version of a file from `526.blender_r` that showcases
the behavior in question:
```C
struct LinkNode;
typedef struct LinkNode {
  struct LinkNode *next;
  void *link;
} LinkNode;

void do_projectpaint_thread_ph_v_state() {
  int *ps = do_projectpaint_thread_ph_v_state;
  LinkNode *node;
  while (do_projectpaint_thread_ph_v_state)
    for (node = ps; node; node = node->next)
      ;
}
```
Compiling this with and without DebugInfo, and then disassembling the
results, leads to different outcomes (tested on SystemZ and X86). The
reason for this is that the `SimplifyCFG` pass does different things in
either case.
Allow a duplicate basic block with multiple predecessors in the
jump table to be simplified, by considering that the same basic
block may appear in multiple switch cases.
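For example (sketch):
```llvm
define void @sketch(i32 %x) {
entry:
  switch i32 %x, label %default [
    i32 0, label %shared   ; %shared appears under two case values
    i32 1, label %shared   ; and may have other predecessors as well
    i32 2, label %other
  ]
shared:
  br label %default
other:
  br label %default
default:
  ret void
}
```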
This PR removes tests with `br i1 undef` under
`llvm/test/Transforms/ObjCARC, Reassociate, SCCP, SLPVectorizer...`.
After this PR, I'll continue to fix tests under `llvm/test/CodeGen`,
which has more UB tests than `llvm/test/Transforms`.
This should act like range.
Previously ConstantRangeList assumed a 64-bit range. Now query from the
actual entries. This also means that the empty range has no bitwidth, so
move asserts to avoid checking the bitwidth of empty ranges.
This is a follow-up to #96878 to support hoisting loads/stores from BBs
that have the same predecessor, if the loads/stores are the only
instructions and the branch is unpredictable, e.g.:
```
void test(int a, int *c, int *d) {
  if (a)
    *c = a;
  else
    *d = a;
}
```
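Conceptually, the hoisted result uses single-element masked (conditionally faulting) accesses; a rough sketch of the store case for the C example above, not the exact output:
```llvm
%cond = icmp ne i32 %a, 0
%inv = xor i1 %cond, true
%mask = bitcast i1 %cond to <1 x i1>
%mask.inv = bitcast i1 %inv to <1 x i1>
%va = bitcast i32 %a to <1 x i32>
call void @llvm.masked.store.v1i32.p0(<1 x i32> %va, ptr %c, i32 4, <1 x i1> %mask)
call void @llvm.masked.store.v1i32.p0(<1 x i32> %va, ptr %d, i32 4, <1 x i1> %mask.inv)
```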
Currently when we merge invokes as part of SimplifyCFG we apply a merge
of the invoke DILocations to the merged invoke. We also insert an
unconditional branch to the merged invoke at the positions previously
occupied by the original invokes; as this branch is part of the
substitution for the invoke it has replaced, we should propagate the
original invoke DebugLoc to it.
I noticed that the following two C functions emit different IR:
```
int switch_duplicate_arms(int switch_val, int v, int w) {
  switch (switch_val) {
  default:
    break;
  case 0:
    w = v;
    break;
  case 1:
    w = v;
    break;
  }
  return w;
}

int if_duplicate_arms(int switch_val, int v, int w) {
  if (switch_val == 0)
    w = v;
  else if (switch_val == 1)
    w = v;
  return w;
}
```
We generate IR that looks like this:
```
define i32 @switch_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) {
  switch i32 %1, label %7 [
    i32 0, label %5
    i32 1, label %6
  ]

5:
  br label %7

6:
  br label %7

7:
  %8 = phi i32 [ %3, %4 ], [ %2, %6 ], [ %2, %5 ]
  ret i32 %8
}

define i32 @if_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) {
  %5 = icmp ult i32 %1, 2
  %6 = select i1 %5, i32 %2, i32 %3
  ret i32 %6
}
```
For `switch_duplicate_arms`, taking case 0 or case 1 is the same, since
%5 and %6 branch to the same location and the incoming values for %8
from those blocks are identical. We can remove one of the duplicate
switch targets and update the switch to use the single remaining
target.
On RISC-V, prior to this patch, we generate the following code:
```
switch_duplicate_arms:
        li      a4, 1
        beq     a1, a4, .LBB0_2
        mv      a0, a3
        bnez    a1, .LBB0_3
.LBB0_2:
        mv      a0, a2
.LBB0_3:
        ret

if_duplicate_arms:
        li      a4, 2
        mv      a0, a2
        bltu    a1, a4, .LBB1_2
        mv      a0, a3
.LBB1_2:
        ret
```
After this patch, the O3 code is optimized to the icmp + select pair,
which gives us the same code gen as `if_duplicate_arms`, as desired.
This results in one less branch instruction in the final assembly.
This may help with both code size and further switch simplification. I
found that this patch causes no significant impact to spec2006/int/ref
and spec2017/intrate/ref.
---------
Co-authored-by: Min Hsu <min@myhsu.dev>