llvm-project

Author	SHA1	Message	Date
Florian Hahn	c4a78b6fe3	[SimplifyCFG] Always allow hoisting if all instructions match. (#97158 ) Generalize hoistCommonCodeFromSuccessors's `EqTermsOnly` to `AllInstsEqOnly` and always allow hoisting if all instructions match. In that case, all instructions can be hoisted and the original branch will be replaced and selects for PHIs are added. This allows preserving metadata in more cases, using the existing hoisting logic, whereas previously FoldTwoEntryPHINode would drop the metadata. https://llvm-compile-time-tracker.com/compare.php?from=716360367fbdabac2c374c19b8746f4de49a5599&to=986b2c47df516b31d998c055400e4f62aa76edc6&stat=instructions:u PR: https://github.com/llvm/llvm-project/pull/97158	2024-12-13 21:26:27 +00:00
Antonio Frighetto	d26df32255	[SimplifyCFG] Consider preds to switch in `simplifyDuplicateSwitchArms` Allow a duplicate basic block with multiple predecessors to the jump table to be simplified, by considering that the same basic block may appear in more switch cases.	2024-12-13 09:07:24 +01:00
Antonio Frighetto	e32c428bec	[SimplifyCFG] Precommit tests for PR118955 (NFC)	2024-12-13 09:07:24 +01:00
Nikita Popov	462cb3cd6c	[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144 ) If the gep is nusw (usually via inbounds) and the offset is non-negative, we can infer nuw. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-05 14:36:40 +01:00
Lee Wei	9bf6365237	[llvm] Remove `br i1 undef` from some regression tests [NFC] (#118419 ) This PR removes tests with `br i1 undef` under `llvm/tests/Transforms/ObjCARC, Reassociate, SCCP, SLPVectorizer...`. After this PR, I'll continue to fix tests under `llvm/tests/CodeGen`, which has more UB tests than `llvm/tests/Transforms`.	2024-12-03 20:54:36 +00:00
AdityaK	39601a6e54	Bail out jump threading on indirect branches only (#117778 ) Remove check for PHI in pred as pointed out in #103688 Reduced the testcase to remove redundant phi in pred Fixes: #102351	2024-11-26 14:57:28 -08:00
Matt Arsenault	4028bb10c3	Local: Handle noalias_addrspace in combineMetadata (#103938 ) This should act like range. Previously ConstantRangeList assumed a 64-bit range. Now query from the actual entries. This also means that the empty range has no bitwidth, so move asserts to avoid checking the bitwidth of empty ranges.	2024-11-26 09:13:34 -05:00
Phoebe Wang	2568e52a73	[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part II) (#108812 ) This is a follow-up to #96878 to support hoisting load/store from BBs that have the same predecessor, if load/store are the only instructions and the branch is unpredictable, e.g.: ``` void test (int a, int c, int d) { if (a) c = a; else d = a; } ```	2024-11-25 15:19:28 +08:00
Stephen Tozer	2188a56a75	[DebugInfo][SimplifyCFG] Fully propagate merged invoke DILocations (#114235 ) Currently when we merge invokes as part of SimplifyCFG we apply a merge of the invoke DILocations to the merged invoke. We also insert an unconditional branch to the merged invoke at the positions previously occupied by the original invokes; as this branch is part of the substitution for the invoke it has replaced, we should propagate the original invoke DebugLoc to it.	2024-11-15 17:20:55 +00:00
Michael Maitland	6b9952759f	[SimplifyCFG] Simplify switch instruction that has duplicate arms (#114262 ) I noticed that the two C functions emitted different IR: ``` int switch_duplicate_arms(int switch_val, int v, int w) { switch (switch_val) { default: break; case 0: w = v; break; case 1: w = v; break; } return w; } int if_duplicate_arms(int switch_val, int v, int w) { if (switch_val == 0) w = v; else if (switch_val == 1) w = v; return w; } ``` We generate IR that looks like this: ``` define i32 @switch_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) { switch i32 %1, label %7 [ i32 0, label %5 i32 1, label %6 ] 5: br label %7 6: br label %7 7: %8 = phi i32 [ %3, %4 ], [ %2, %6 ], [ %2, %5 ] ret i32 %8 } define i32 @if_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) { %5 = icmp ult i32 %1, 2 %6 = select i1 %5, i32 %2, i32 %3 ret i32 %6 } ``` For `switch_duplicate_arms`, taking case 0 and case 1 is equivalent, since %5 and %6 branch to the same location and the incoming values for %8 are the same from those blocks. We could remove one of the duplicate switch targets and update the switch with the single target. On RISC-V, prior to this patch, we generate the following code: ``` switch_duplicate_arms: li a4, 1 beq a1, a4, .LBB0_2 mv a0, a3 bnez a1, .LBB0_3 .LBB0_2: mv a0, a2 .LBB0_3: ret if_duplicate_arms: li a4, 2 mv a0, a2 bltu a1, a4, .LBB1_2 mv a0, a3 .LBB1_2: ret ``` After this patch, the O3 code is optimized to the icmp + select pair, which gives us the same code gen as `if_duplicate_arms`, as desired. This results in one fewer branch instruction in the final assembly. This may help with both code size and further switch simplification. I found that this patch causes no significant impact to spec2006/int/ref and spec2017/intrate/ref. --------- Co-authored-by: Min Hsu <min@myhsu.dev>	2024-11-15 15:38:34 +01:00
Florian Hahn	40c75426a9	[SimplifyCFG] Add test for updating llvm.access.group when hoisting. Add extra test coverage for preserving llvm.access.group metadata when hoisting.	2024-11-12 13:14:30 +00:00
Paul Walker	38fffa630e	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548 )	2024-11-06 11:53:33 +00:00
elhewaty	9efb07f261	[IR] Add `samesign` flag to icmp instruction (#111419 ) Inspired by https://discourse.llvm.org/t/rfc-signedness-independent-icmps/81423	2024-10-15 17:11:25 +08:00
Noah Goldstein	82ac399733	[SimplifyCFG] Allow merging invoke's with different attrs Same logic as other callsites, if the attributes are intersectable, we merge. Closes #111713	2024-10-10 01:07:59 -05:00
Noah Goldstein	cd04a9d401	[SimplifyCFG] Add/update tests for merging invokes with different attrs; NFC	2024-10-10 01:07:59 -05:00
Matt Arsenault	a8e1311a1c	[RFC] IR: Define noalias.addrspace metadata (#102461 ) This is intended to solve a problem with lowering atomics in OpenMP and C++ common to AMDGPU and NVPTX. In OpenCL and CUDA, it is undefined behavior for an atomic instruction to modify an object in thread private memory. In OpenMP, it is defined. Correspondingly, the hardware does not handle this correctly. For AMDGPU, 32-bit atomics work and 64-bit atomics are silently dropped. We therefore need to codegen this by inserting a runtime address space check, performing the private case without atomics, and fallback to issuing the real atomic otherwise. This metadata allows us to avoid this extra check and branch. Handle this by introducing metadata intended to be applied to atomicrmw, indicating they cannot access the forbidden address space.	2024-10-07 23:21:42 +04:00
Noah Goldstein	e343af777e	[SimplifyCFG][Attributes] Enabling sinking calls with differing number of attrsets Prior impl would fail if the number of attribute sets on the two calls wasn't the same, which is unnecessary as long as we aren't throwing away any must-preserve attrs. Closes #110896	2024-10-02 15:15:07 -05:00
Noah Goldstein	baf008ac29	[SimplifyCFG] Add tests for sinking calls with differing number of attrs; NFC	2024-10-02 15:15:07 -05:00
Noah Goldstein	4d4beeb43c	[SimplifyCFG] Supporting hoisting/sinking callbases with differing attrs Some (many) attributes can safely be dropped to enable sinking. For example removing `nonnull` on a return/param can't affect correctness. Closes #109472	2024-10-01 18:27:08 -05:00
Noah Goldstein	c42659417f	[SimplifyCFG] Add tests for hoisting/sinking callbases with differing attrs; NFC	2024-10-01 18:27:08 -05:00
Nikita Popov	f445e39ab2	[SimplifyCFG] Use isWritableObject() API (#110127 ) SimplifyCFG store speculation currently has some homegrown code to check for a writable object, handling the alloca special case only. Switch it to use the generic isWritableObject() API, which means that we also support byval arguments, allocator return values, and writable arguments. I've adjusted isWritableObject() to also check for the noalias attribute when handling writable. Otherwise, I don't think that we can generalize from at-entry writability. This was not relevant for previous uses of the function, because they'd already require noalias for other reasons anyway.	2024-09-30 10:03:46 +02:00
Nikita Popov	8f21459777	[SimplifyCFG] Add additional store speculation tests (NFC)	2024-09-26 16:20:06 +02:00
Chengjun	e4688b98cd	[SimplifyCFG] Avoid increasing too many phi entries when removing empty blocks (#104887 ) Now in the simplifycfg and jumpthreading passes, we will remove the empty blocks (blocks only have phis and an unconditional branch). However, in some cases, this will increase size of the IR and slow down the compile of other passes dramatically. For example, we have the following CFG: 1. BB1 has 100 predecessors, and unconditionally branches to BB2 (does not have any other instructions). 2. BB2 has 100 phis. Then in this case, if we remove BB1, for every phi in BB2, we need to increase 99 entries (replace the incoming edge from BB1 with 100 edges from its predecessors). Then in total, we will increase 9900 phi entries, which can slow down the compile time for many other passes. Therefore, in this change, we add a check to see whether removing the empty blocks will increase lots of phi entries. Now, the threshold is 1000 (can be controlled by the command line option `max-phi-entries-increase-after-removing-empty-block`), which means that we will not remove an empty block if it will increase the total number of phi entries by 1000. This threshold is conservative and for most of the cases, we will not have such a large phi. So, this will only be triggered in some unusual IRs.	2024-09-25 12:41:13 +02:00
Nikita Popov	6f194a6dea	[SimplifyCFG] Avoid truncation in linear map overflow check This is supposed to test multiplication of the linear multiplier with the largest value it can be multiplied with. However, if we truncate TableSize-1 here, it might not actually be the largest value. I think in practice this still works out, because in cases where we'd truncate the value here we'd also fail the NonMonotonic check. But to match the intent of the code, we should treat the truncating case as overflowing.	2024-09-23 15:13:32 +02:00
Phoebe Wang	7773dcd163	[X86][NFC] Change test name and add a new test (#109638 ) Address post commit comments in #108754.	2024-09-23 20:21:11 +08:00
Nikita Popov	8a6248b739	[SimplifyCFG] Don't separate a load/store from its gep during sinking (#102318 ) If we can sink a load/store, but not the gep producing its pointer operand, don't sink the load/store either. This may prevent the gep from being folded into an addressing mode, and may also negatively affect further analysis. Fixes https://github.com/llvm/llvm-project/issues/96838.	2024-09-23 09:32:24 +02:00
Nikita Popov	5a4c6f9799	[Loads] Check context instruction for context-sensitive derefability (#109277 ) If a dereferenceability fact is provided through `!dereferenceable` (or similar), it may only hold on the given control flow path. When we use `isSafeToSpeculativelyExecute()` to check multiple instructions, we might make use of `!dereferenceable` information that does not hold at the speculation target. This doesn't happen when speculating instructions one by one, because `!dereferenceable` will be dropped while speculating. Fix this by checking whether the instruction with `!dereferenceable` dominates the context instruction. If this is not the case, it means we are speculating, and cannot guarantee that it holds at the speculation target. Fixes https://github.com/llvm/llvm-project/issues/108854.	2024-09-23 09:13:09 +02:00
Nikita Popov	30cdf1e959	[SimplifyCFG] Pass context instruction to isSafeToSpeculativelyExecute() (#109132 ) Pass speculation target and assumption cache to isSafeToSpeculativelyExecute() calls. This allows speculating based on dereferenceable/align assumptions, but the primary motivation here is to avoid regressions from planned changes to fix https://github.com/llvm/llvm-project/issues/108854.	2024-09-19 10:19:15 +02:00
Noah Goldstein	37932643ab	[SimplifyCFG] Deduce paths unreachable if they cause div/rem UB In the same way that we mark a path unreachable if it may cause a nullptr dereference, div/rem by zero or signed div/rem of INT_MIN by -1 causes immediate UB. Closes #109008	2024-09-18 12:59:52 -05:00
Noah Goldstein	f5d62d7647	[SimplifyCFG] Add tests for deducing paths unreachable if they cause div/rem UB; NFC	2024-09-18 12:59:52 -05:00
Nikita Popov	13b4d1bfea	[SimplifyCFG][LICM] Add additional speculation tests These are related to https://github.com/llvm/llvm-project/issues/108854.	2024-09-18 14:48:58 +02:00
Noah Goldstein	419c53477e	[SimplifyCFG] Mark div/rem as not-cheap to sink if we are replacing const denominator Close #109007	2024-09-17 12:04:34 -05:00
Noah Goldstein	ae8d0200b0	[SimplifyCFG] Add test for sinking div/rem with const remainder; NFC	2024-09-17 12:04:34 -05:00
Andreas Jonson	a0d00c94c2	[SimplifyCFG] Swap range metadata to attribute for calls. (#108984 ) Among the last usages of range metadata for call before being able to deprecate and only have the range attribute for calls.	2024-09-17 18:25:53 +02:00
Csanád Hajdú	bc8a5d104c	[Patchpoint] Add immarg attributes to patchpoint arguments (#97276 )	2024-09-17 14:00:24 +04:00
Phoebe Wang	af5a45b34b	[X86,SimplifyCFG] Use passthru to reduce select (#108754 )	2024-09-16 20:20:36 +08:00
AdityaK	3c9022c965	Bail out jump threading on indirect branches (#103688 ) The bug was introduced by https://github.com/llvm/llvm-project/pull/68473 Fixes: #102351	2024-09-10 22:39:02 -07:00
Shengchen Kan	87c86aa6b9	[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part I) (#96878 ) This is simplifycfg part of https://github.com/llvm/llvm-project/pull/95515 In this PR, we support hoisting load/store with conditional faulting in `SimplifyCFGOpt::speculativelyExecuteBB` to eliminate conditional branches. This is for cases like ``` void test (int a, int b) { if (a) b = a; } ``` In the following patches, we will support the hoist in `SimplifyCFGOpt::hoistCommonCodeFromSuccessors`. That is for cases like ``` void test (int a, int c, int d) { if (a) c = a; else d = a; } ```	2024-08-29 10:42:44 +08:00
Nikita Popov	84497c6f4f	[SimplifyCFG] Remove limitation on sinking of load/store of alloca (#104788 ) This is a followup to https://github.com/llvm/llvm-project/pull/104579 to remove the limitation on sinking loads/stores of allocas entirely, even if this would introduce a phi node. Nowadays, SROA supports speculating load/store over select/phi. Additionally, SimplifyCFG with sinking only runs at the end of the function simplification pipeline, after SROA. I checked that the two tests modified here still successfully SROA after the SimplifyCFG transform. We should, however, keep the limitation on lifetime intrinsics. SROA does not have speculation support for these, and I've also found that the way these are handled in the backend is very problematic (https://github.com/llvm/llvm-project/issues/104776), so I think we should leave them alone.	2024-08-26 10:14:43 +02:00
Nikita Popov	4d85285ff6	[SimplifyCFG] Fold switch over ucmp/scmp to icmp and br (#105636 ) If we switch over ucmp/scmp and have two switch cases going to the same destination, we can convert into icmp+br. Fixes https://github.com/llvm/llvm-project/issues/105632.	2024-08-22 16:57:09 +02:00
Nikita Popov	716f7e2d18	[SimplifyCFG] Add tests for switch over cmp intrinsic (NFC)	2024-08-22 11:52:02 +02:00
Nikita Popov	b3fa45b642	[SimplifyCFG] Add support for hoisting commutative instructions (#104805 ) This extends SimplifyCFG hoisting to also hoist instructions with commuted operands, for example a+b on one side and b+a on the other side. This should address the issue mentioned in: https://github.com/llvm/llvm-project/pull/91185#issuecomment-2097447927	2024-08-20 12:48:06 +02:00
Nikita Popov	b64e7e07e5	[SimplifyCFG] Add tests for hoisting of commutative instructions (NFC)	2024-08-19 17:13:21 +02:00
Nikita Popov	83879f4f53	[SimplifyCFG] Don't block sinking for allocas if no phi created (#104579 ) SimplifyCFG sinking currently does not sink loads/stores of allocas, because historically SROA was unable to handle the resulting IR. Since then, SROA both learned to speculate loads/stores over selects and phis, and SimplifyCFG sinking has been deferred to the end of the function simplification pipeline, which means that SROA happens before it. As such, I believe that this workaround should no longer be necessary. Given how sensitive SimplifyCFG sinking seems to be, this patch takes a very conservative step towards removing this, by allowing sinking if we don't actually need to form a phi over the pointer argument. This fixes https://github.com/llvm/llvm-project/issues/104567, where sinking a store to an escaped alloca allows converting a switch into arithmetic.	2024-08-19 09:55:30 +02:00
Nikita Popov	65390f9d6f	[SimplifyCFG] Add test for #104567 (NFC)	2024-08-16 12:37:18 +02:00
Nikita Popov	1139dee910	[SimplifyCFG] Add more sinking tests (NFC)	2024-08-08 15:13:59 +02:00
Nikita Popov	999bab711e	[SimplifyCFG] Add tests for sinking of load/store + gep (NFC)	2024-08-07 16:51:18 +02:00
Jan Patrick Lehr	a347bdb2b8	Revert "[SimplifyCFG] Skip threading if the target may have divergent branches" (#100994 ) Reverts llvm/llvm-project#100185 See comments on PR (PR not accepted, outstanding review comments, breaks HIP-clang buildbot)	2024-07-29 11:34:26 +02:00
darkbuck	ba45453c0a	[SimplifyCFG] Skip threading if the target may have divergent branches - This patch skips the threading on known values if the target has divergent branches. - So far, threading on known values is skipped when the basic block has convergent calls. However, even without convergent calls, if that condition is divergent, threading duplicates the execution of the threaded block and hence results in lower performance. E.g., ``` BB1: if (cond) BB3, BB2 BB2: // work2 br BB3 BB3: // work3 if (cond) BB5, BB4 BB4: // work4 br BB5 BB5: ``` after threading, ``` BB1: if (cond) BB3', BB2' BB2': // work3 br BB5 BB3': // work2 // work3 // work4 br BB5 BB5: ``` After threading, work3 is executed twice if 'cond' is a divergent one. Reviewers: yxsamliu, nikic Pull Request: https://github.com/llvm/llvm-project/pull/100185	2024-07-26 12:15:49 -04:00
Tianqing Wang	03e92bf483	[SimplifyCFG] Fix LIT failure introduced in 3d494bfc7. (#100049 )	2024-07-23 09:50:04 +08:00


1219 Commits