llvm-project

Author	SHA1	Message	Date
Guillaume Chatelet	12ccdd67aa	[NFC] Use proper getSliceAlign type in SROA	2022-06-10 12:37:41 +00:00
Sanjay Patel	6fedc6a2b4	Revert "[InstCombine] add narrowing transform for low-masked binop with zext operand" This reverts commit afa192cfb6049a15c5542d132d500b910b802c74. This can cause an infinite loop as shown with an example in the post-commit thread.	2022-06-10 08:25:10 -04:00
David Sherwood	8daaea206b	[InstCombine] Use +0.0 instead of -0.0 as the FP identity for some folds In foldSelectIntoOp we sometimes transform a select of a fadd into a fadd of a select, where we select between data and an identity value. For both fadd and fsub the identity is always -0.0, but if the nsz flag is set on the select instruction we can use +0.0 instead. Doing so then triggers other optimisations, such as when folding the select of masked load into a new masked load. Differential Revision: https://reviews.llvm.org/D126774	2022-06-10 12:42:34 +01:00
Bin Cheng	8b360c69e9	[FuncSpec]Fix assertion failure when value is not added to solver This patch improves the fix in D110529 to prevent crashing on a value with the byval attribute that is not added to the SCCP solver. Authored-by: sinan.lin@linux.alibaba.com Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D126355	2022-06-10 18:45:53 +08:00
Nikita Popov	d77f944832	[LoopInfo] Add getOutermostLoop() (NFC) This is a recurring pattern, add an API function for it.	2022-06-10 11:48:21 +02:00
David Green	4a5cb957a1	[AggressiveInstcombine] Conditionally fold saturated fptosi to llvm.fptosi.sat This adds a fold for aggressive instcombine that converts smin(smax(fptosi(x))) into a llvm.fptosi.sat, provided that the saturation constants are correct and the cost of the llvm.fptosi.sat is lower. Unfortunately, a llvm.fptosi.sat cannot always be converted back to a smin/smax/fptosi. The llvm.fptosi.sat intrinsic is more defined than the original, which produces poison if the original fptosi was out of range. The llvm.fptosi.sat will saturate any value, so needs to be expanded to a fptosi(fpmin(fpmax(x))), which can be worse for code generation depending on the target. So this change is conditional on the backend reporting that the llvm.fptosi.sat is cheaper than the original smin+smax+fptosi. This is a change to the way that AggressiveInstCombine has worked in the past. Instead of just being a canonicalization pass, that canonicalization can be dependent on the target in certain specific cases. Differential Revision: https://reviews.llvm.org/D125755	2022-06-10 09:36:09 +01:00
chenglin.bi	de7a6ae1ff	[InstCombine] Optimize shl+lshr+and conversion pattern if `C1` and `C3` are pow2 and `Log2(C3)+C2 < BitWidth`: ((C1 << X) >> C2) & C3 -> X == (Log2(C3)+C2-Log2(C1)) ? C3 : 0; https://alive2.llvm.org/ce/z/Pus5bd Fix issue https://github.com/llvm/llvm-project/issues/55739 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D126617	2022-06-10 09:36:58 +08:00
Philip Reames	206f10d3f6	Plumb InstructionCost through unroll costing Teach the unroller(s) how to handle an invalid cost. This avoids crashes when the backend can't provide a cost due to either a fundamental limitation or an unimplemented cost model case. Differential Revision: https://reviews.llvm.org/D127305	2022-06-09 15:42:53 -07:00
Philip Reames	f85c5079b8	Pipe potentially invalid InstructionCost through CodeMetrics Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred. On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost. I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change. Differential Revision: https://reviews.llvm.org/D127131	2022-06-09 15:17:24 -07:00
Sanjay Patel	afa192cfb6	[InstCombine] add narrowing transform for low-masked binop with zext operand https://alive2.llvm.org/ce/z/hRy3rE As shown in D123408, we can produce this pattern when moving cast around, and we already have a related fold for a binop with a constant operand.	2022-06-09 16:59:26 -04:00
Johannes Doerfert	6555558a80	Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit da50dab1ae111e9e6cb0248a47a038b17f798705. Patch broke AMD GPU OpenMP offload buildbots. https://lab.llvm.org/buildbot/#/builders/193/builds/13246	2022-06-09 17:04:01 +02:00
Johannes Doerfert	da50dab1ae	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good. Fixes: https://github.com/llvm/llvm-project/issues/54981	2022-06-09 16:48:53 +02:00
Johannes Doerfert	94841c713f	[Attributor] Try to delete stores and simplify stored values By default we should try to eliminate unused stores and simplify values stored while we are at it.	2022-06-09 16:48:53 +02:00
Johannes Doerfert	a3273c0c06	[Attributor] Ensure to use the proper liveness AA When determining liveness via Attributor::isAssumedDead(...) we might end up without a liveness AA or with one pointing into another function. Neither is helpful and we will avoid both from now on.	2022-06-09 16:48:53 +02:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Johannes Doerfert	ae10b8a582	[Attributor][FIX] Give registered simplification callbacks precedence We accidentally checked for constants before we looked for registered simplification callbacks. The latter needs to take precedence though.	2022-06-09 15:31:53 +02:00
Johannes Doerfert	982053e85e	[Attributor][NFC] Improve debug code and comments	2022-06-09 13:41:23 +02:00
Johannes Doerfert	0ece283f03	[Attributor] Add checks needed as we strengthen value simplify	2022-06-09 13:41:23 +02:00
Johannes Doerfert	393be12b74	[Attributor] Look at base values for align, nonnull, and deref Stripping bitcasts and 0-geps helps normalization and minimizes the impact of a follow up change.	2022-06-09 13:41:23 +02:00
Johannes Doerfert	cb8adf76f7	[Attributor] Simplify loads from constant globals If a global is constant and the initializer is known we can simplify loads from it as the value has to be the initializer.	2022-06-09 13:41:23 +02:00
Florian Hahn	85983ca42e	[VPlan] Replace remaining use of needsScalarIV. All information is already available in VPlan. Note that there are some test changes, because we now can correctly look through instructions like truncates to analyze the actual users. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D123541	2022-06-09 12:05:37 +01:00
Johannes Doerfert	14899bc43d	[Attributor] Generalize interface from ConstantInt to Constant We can use constant to allow undef and there is no need to force integers in the API anyway. The user can decide if a non integer constant is fine or not.	2022-06-09 12:00:26 +02:00
Johannes Doerfert	7a07b88f37	[Attributor][FIX] Replace call site argument uses, not values We need to be careful replacing values as call site arguments (IRPosition::IRP_CALL_SITE_ARGUMENT) is representing a use and not a value. This patch replaces the interface to take a IR position instead making it harder to misuse accidentally. It does not change our tests right now but a follow up exposed the potential footgun.	2022-06-09 12:00:26 +02:00
Johannes Doerfert	1df6e171c3	[Attributor] Simplify (integer range) state handling We used to be very conservative when integer states were merged. Instead of adding the known range (which is large due to uncertainty) into the assumed range (which is hopefully small), we can also only allow to merge in both at the same time into their respective counterpart. This will ensure we keep the invariant that assumed is part of known.	2022-06-09 12:00:26 +02:00
Johannes Doerfert	481b8f31df	[Attributor][NFC] Introduce helper struct We often use a context associated with a value. For now only one use case has been changed.	2022-06-09 12:00:26 +02:00
Johannes Doerfert	4277c1be88	[Attributor][FIX] Avoid metadata and duplicate replication assertion When we recreate instructions as part of simplification we need to take care of debug metadata and replacing the value multiple times. For now, we handle both conservatively.	2022-06-09 12:00:26 +02:00
Biplob Mishra	d87bfa9ad0	[InstCombine] Combine instructions of type or/and where AND masks can be combined. The patch simplifies some of the patterns as below (A | (B & C0)) | (B & C1) -> A | (B & C0|C1) ((B & C0) | A) | (B & C1) -> (B & C0|C1) | A In some scenarios like byte reverse on half word, we can see this pattern multiple times and this conversion can optimize these patterns. Additionally this commit fixes the issue reported with the test case. int f(int a, int b) { int c = ((unsigned char)(a >> 23) & 925); if (a) c = (a >> 23 & b) | ((unsigned char)(a >> 23) & 925) | (b >> 23 & 157); return c; } The previous revision/commit did not check one-use of an intermediate value that this transform re-uses. When that value has another use, an existing transform will try to invert the transform here. By adding one-use checks, we avoid the infinite loops seen with the earlier commit. Differential Revision: https://reviews.llvm.org/D124119	2022-06-09 10:58:30 +01:00
Chenbing Zheng	38992d2c5e	[InstCombine] improve fold for icmp-ugt-ashr The existing condition for the fold icmp ugt (ashr X, ShAmtC), C --> icmp ugt X, ((C + 1) << ShAmtC) - 1 missed some boundary cases. As a result, the fold did not fire in some cases because of signed overflow. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D127188	2022-06-09 16:22:12 +08:00
Nikita Popov	56c9976d46	[IndVarSimplify] Don't assert that terminator is not SCEVable (PR55925) The IV widening code currently asserts that terminators aren't SCEVable -- however, this is not the case for invokes with a returned attribute. As far as I can tell, this assertion is not necessary -- even if we have a critical edge (the second test case), the trunc gets inserted in a legal position. Fixes https://github.com/llvm/llvm-project/issues/55925. Differential Revision: https://reviews.llvm.org/D127288	2022-06-09 10:12:13 +02:00
Fangrui Song	11136a6032	[DeadArgElim] Remove dead code after r128810	2022-06-08 21:11:54 -07:00
Florian Hahn	cedfd7a2e5	Recommit "[VPlan] Remove uneeded needsVectorIV check." This reverts commit 266ea446ab747671eb6c736569c3c9c5f3c53d11. The reasons for the revert have been addressed by cleaning up condition handling in VPlan and properly marking VPBranchOnMaskRecipe as using scalars. The test case for the revert from D123720 has been added in 3d663308a5d.	2022-06-08 14:06:45 +01:00
Chuanqi Xu	0e10f12844	[NFC] Remove commented cerr debugging loggings There are some unused, commented-out cerr debug logging statements in the code. It is odd to keep such commented-out debug helpers in the product.	2022-06-08 15:58:06 +08:00
Chuanqi Xu	733d7cf964	[Debug] [Coroutines] Add deref operator for non complex expression Background: When we construct the coroutine frame, we insert a dbg.declare intrinsic for it: ``` %hdl = call void @llvm.coro.begin() ; would return coroutine handle call void @llvm.dbg.declare(metadata ptr %hdl, metadata ![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression()) ``` And in the split coroutine, it looks like: ``` define void @coro_func.resume(ptr *hdl) { entry.resume: call void @llvm.dbg.declare(metadata ptr %hdl, metadata ![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression()) } ``` And we salvage the debug info by inserting a new alloca here: ``` define void @coro_func.resume(ptr %hdl) { entry.resume: %frame.debug = alloca ptr call void @llvm.dbg.declare(metadata ptr %frame.debug, metadata ![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression()) store ptr %hdl, %frame.debug } ``` The problem is that the `dbg.declare` now refers to the address of that alloca instead of the actual coroutine handle. There is existing code that solves this problem, but it applies only to complex expressions. It seems OK to relax the condition so that it also works for `__coro_frame`. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D126277	2022-06-08 10:53:51 +08:00
Wael Yehia	0952cf5bbb	[InstCombine] decomposeSimpleLinearExpr should bail out on negative operands. InstCombine tries to rewrite %prod = mul nsw i64 %X, Scale %acc = add nsw i64 %prod, Offset %0 = alloca i8, i64 %acc, align 4 %1 = bitcast i8* %0 to i32* Use ( %1 ) into %prod = mul nsw i64 %X, Scale/4 %acc = add nsw i64 %prod, Offset/4 %0 = alloca i32, i64 %acc, align 4 Use (%0) But it assumes Scale is unsigned, and performs an unsigned division. So we should bail out if Scale cannot be interpreted as an unsigned safely. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126546	2022-06-08 00:57:25 +00:00
Sanjay Patel	cae993d4c8	[InstCombine] reduce left-shift-of-right-shifted constant via demanded bits If we don't demand low bits and it is valid to pre-shift a constant: (C2 >> X) << C1 --> (C2 << C1) >> X https://alive2.llvm.org/ce/z/_UzTMP This is the reverse-order shift sibling to 82040d414b3c ( D127122 ). It seems likely that we would want to add this to the SDAG version of the code too to keep it on par with IR.	2022-06-07 18:43:27 -04:00
Sanjay Patel	a4d2c5ecaa	[InstCombine] reduce code duplication for accessing type; NFC	2022-06-07 18:43:27 -04:00
Philip Reames	89c4b29e8d	[GuardWidening] Fix a nasty cast bug in c2eccc6 c2eccc6 introduced a call to setHasNoUnsignedWrap which implicitly assumes that Inst is an OverflowingBinaryOperator. This is frequently untrue, but was not caught because cast<Ty>(X) has been broken, see https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033 for context. I considered reverting this, but since doing so re-introduces a nasty miscompile of its own, I decided to fix forward instead. I'll note that this is a particularly nasty form of the cast<Ty>(X) issue. Because the cast was succeeding unexpectedly, we were writing data to instructions which weren't OBOs. This could result in near arbitrary data or memory corruption. I'm a bit shocked that the sanitizers didn't find this TBH.	2022-06-07 13:27:13 -07:00
Florian Hahn	b0c9a71be0	[VPlan] Handle VPInst without underlying instr in VPInterleavedAccess. This violation is hidden while `cast` is missing an isa assertion after D123901.	2022-06-07 21:00:49 +01:00
Martin Sebor	dd2a6d78ee	[InstCombine] Fold memchr of sequences of same characters Enhance memchr libcall folder to handle constant arrays consisting of one or two sequences of consecutive equal characters. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126515	2022-06-07 13:45:10 -06:00
Martin Sebor	fb6627fa0c	[InstCombine] Add substr helper function (NFC). Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126515	2022-06-07 13:27:36 -06:00
Sanjay Patel	82040d414b	[InstCombine] reduce right-shift-of-left-shifted constant via demanded bits If we don't demand high bits (zeros) and it is valid to pre-shift a constant: (C2 << X) >> C1 --> (C2 >> C1) << X https://alive2.llvm.org/ce/z/P3dWDW There are a variety of related patterns, but I haven't found a single solution that gets all of the motivating examples - so pulling this piece out of D126617 along with more tests. We should also handle the case where we shift-right followed by shift-left, but I'll make that a follow-on patch assuming this one is ok. It seems likely that we would want to add this to the SDAG version of the code too to keep it on par with IR. Differential Revision: https://reviews.llvm.org/D127122	2022-06-07 13:28:18 -04:00
Craig Topper	d73684e223	[LoopFlatten] Fix crash if the inner loop trip count comes from a sext instruction. If we look through a truncate in matchLinearIVUser, it's possible we find a sext/zext instruction that didn't come from widening. This will fail the MatchedItCount->getType() == InnerInductionPHI->getType() assertion. Fix this by checking that we did not look through a truncate already. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D127149	2022-06-07 08:21:21 -07:00
Craig Topper	fdd5843572	[LoopFlatten] Replace unchecked dyn_cast with cast. Spotted while reading through the code. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D127146	2022-06-07 08:21:00 -07:00
David Sherwood	997ecb0036	[LoopVectorize] Add FastMathFlags to the select used for reductions with tail-folding Based on reviewer comments on https://reviews.llvm.org/D126692 I've added FastMathFlags to the select instruction used when tail-folding with reductions. These flags can then be used by InstCombine to decide upon the most optimal floating point identity value for fadd/fsub. Doing so unlocks further optimisations, such as folding selects into masked loads. Differential Revision: https://reviews.llvm.org/D126778	2022-06-07 10:21:31 +01:00
Nikita Popov	7fa97b473c	[SCCP] Don't mark ranges from branch conditions as potentially undef Now that transforms introducing branch on poison have been removed, we can stop marking ranges that have been derived from branch conditions as containing undef. The existing comment explains why this is legal. I've checked that alive2 is happy with SCCP tests after this change. Differential Revision: https://reviews.llvm.org/D126647	2022-06-07 10:20:24 +02:00
Enna1	e52a38c8f1	[ASan] Skip any instruction inserted by another instrumentation. Currently, we only check !nosanitize metadata for instructions passed to `getInterestingMemoryOperands()` or for call instructions that cannot return. This patch adds this check to every instruction. E.g. ASan shouldn't instrument the instruction inserted by UBSan/pointer-overflow. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D126269	2022-06-07 11:17:07 +08:00
Kevin P. Neal	a1f1bd547b	[IPSCCP] Switch away from Instruction::isSafeToRemove() In D115737 I found that I needed to teach Instruction::isSafeToRemove() about strictfp/constrained intrinsics. It was pointed out that this is probably the wrong function and that isInstructionTriviallyDead() should be used instead. It doesn't make sense to have a "second, worse implementation". I also believe that the Instruction class is the wrong place for this functionality. The information about whether or not an instruction can be removed is in the transform passes and should stay there. Differential Revision: https://reviews.llvm.org/D118387	2022-06-06 09:24:11 -04:00
Florian Hahn	eaf48dd9b0	[VPlan] Replace BranchOnCount with BranchOnCond if TC <= UF * VF. Try to simplify BranchOnCount to `BranchOnCond true` if TC <= UF * VF. This is an alternative to D121899 which simplifies the VPlan directly instead of doing so late in code-gen. The potential benefit of doing this in VPlan is that this may help cost-modeling in the future. The reason this is done in prepareToExecute at the moment is that a single plan may be used for multiple VFs/UFs. There are further simplifications that can be applied as follow ups: 1. Replace inductions with constants 2. Replace vector region with regular block. Fixes #55354. Depends on D126679. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D126680	2022-06-06 09:38:53 +01:00
Kazu Hirata	8daf23d364	[Scalar] Use llvm::make_early_inc_range (NFC)	2022-06-05 23:53:18 -07:00
Sanjay Patel	3f33d67d8a	[InstCombine] fold mul with masked low bit operand to trunc+select https://alive2.llvm.org/ce/z/o7rQ5q This shows an extra instruction in some cases, but that is caused by an existing canonicalization of trunc -> and+icmp. Codegen should be better for any target where a multiply is more costly than the most simple ALU op. This ends up producing the requested x86 asm from issue #55618, but it's not the same IR. We are missing a canonicalization from the negate+mask pattern to the trunc+select created here.	2022-06-05 20:07:18 -04:00
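
Two of the InstCombine folds in this log (de7a6ae1ff and d87bfa9ad0) state bit-level identities that can be checked exhaustively at a small bit width. The following is a minimal sketch of such a check, not LLVM's implementation: the function names and the 8-bit and 4-bit widths are assumptions chosen for illustration, and shift amounts are restricted to values below the bit width because larger shifts produce poison in LLVM IR.

```python
from itertools import product

BITS = 8                 # assumed width for the shift fold (illustrative)
MASK = (1 << BITS) - 1   # truncates results to BITS bits, like an i8


def or_and_fold_holds(bits: int = 4) -> bool:
    """Check (A | (B & C0)) | (B & C1) == A | (B & (C0 | C1)) for all inputs."""
    values = range(1 << bits)
    return all(
        ((a | (b & c0)) | (b & c1)) == (a | (b & (c0 | c1)))
        for a, b, c0, c1 in product(values, repeat=4)
    )


def shl_lshr_and_fold_holds() -> bool:
    """Check ((C1 << X) >> C2) & C3 == (X == Log2(C3)+C2-Log2(C1) ? C3 : 0)."""
    for log_c1, log_c3, c2 in product(range(BITS), repeat=3):
        if log_c3 + c2 >= BITS:
            continue  # precondition from the commit message
        c1, c3 = 1 << log_c1, 1 << log_c3
        for x in range(BITS):  # shift amounts >= BITS are excluded (poison)
            before = (((c1 << x) & MASK) >> c2) & c3
            after = c3 if x == log_c3 + c2 - log_c1 else 0
            if before != after:
                return False
    return True


if __name__ == "__main__":
    print(or_and_fold_holds(), shl_lshr_and_fold_holds())
```

The commits themselves link to alive2 proofs, which cover all bit widths and poison semantics; a brute-force check like this one only gives a quick sanity test at one width.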

30700 Commits