llvm-project

Author	SHA1	Message	Date
Yingwei Zheng	a77dedcacb	[InstSimplify][InstCombine][ConstantFold] Move vector div/rem by zero fold to InstCombine (#114280 ) Previously we fold `div/rem X, C` into `poison` if any element of the constant divisor `C` is zero or undef. However, it is incorrect when threading udiv over a vector select: https://alive2.llvm.org/ce/z/3Ninx5 ``` define <2 x i32> @vec_select_udiv_poison(<2 x i1> %x) { %sel = select <2 x i1> %x, <2 x i32> <i32 -1, i32 -1>, <2 x i32> <i32 0, i32 1> %div = udiv <2 x i32> <i32 42, i32 -7>, %sel ret <2 x i32> %div } ``` In this case, `threadBinOpOverSelect` folds `udiv <i32 42, i32 -7>, <i32 -1, i32 -1>` and `udiv <i32 42, i32 -7>, <i32 0, i32 1>` into `zeroinitializer` and `poison`, respectively. One solution is to introduce a new flag indicating that we are threading over a vector select. But it requires modifying both `InstSimplify` and `ConstantFold`. However, this optimization doesn't provide benefits to real-world programs: https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/IR/ConstantFold.cpp.html#L908 https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/Analysis/InstructionSimplify.cpp.html#L1107 This patch moves the fold into InstCombine to avoid breaking numerous existing tests. Fixes #114191 and #113866 (only poison-safety issue).	2024-11-01 22:56:22 +08:00
Yingwei Zheng	e577f14b67	[InstCombine] Use `m_NotForbidPoison` when folding `(X u< Y) ? -1 : (~X + Y) --> uadd.sat(~X, Y)` (#114345 ) Alive2: https://alive2.llvm.org/ce/z/mTGCo- We cannot reuse `~X` if `m_AllOnes` matches a vector constant with some poison elts. An alternative solution is to create a new not instead of reusing `~X`. But it isn't worth the effort because we need to add a one-use check. Fixes https://github.com/llvm/llvm-project/issues/113869.	2024-11-01 22:18:44 +08:00
Yingwei Zheng	96b14f2ccb	[Reland][InstCombine] Fix FMF propagation in `foldSelectIntoOp` (#114499 ) Relands #114356. Compared to the last version, this patch only merges poison-generating/nsz flags from the select to fix LV regression in `llvm/test/Transforms/PhaseOrdering/AArch64/predicated-reduction.ll`.	2024-11-01 12:22:57 +08:00
c8ef	cf0b6cc711	Revert "[ConstantFold] Fold `tgamma` and `tgammaf` when the input parameter is a constant value." (#114496 ) Reverts llvm/llvm-project#114065	2024-11-01 09:26:11 +08:00
c8ef	1f07f995cc	[ConstantFold] Fold `tgamma` and `tgammaf` when the input parameter is a constant value. (#114065 ) This patch adds support for constant folding for the `tgamma` and `tgammaf` libc functions.	2024-11-01 09:07:55 +08:00
gulfemsavrun	d183dc7c24	Revert "[InstCombine] Fix FMF propagation in `foldSelectIntoOp`" (#114458 ) Reverts llvm/llvm-project#114356 because it caused test failures. https://lab.llvm.org/buildbot/#/builders/190/builds/8601 https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-base-linux-x64/b8732549597609293617/overview	2024-10-31 13:21:52 -07:00
Artem Belevich	8129b6b53b	[NVPTX, InstCombine] instcombine known pointer AS checks. (#114325 ) The change improves the code in general and, as a side effect, avoids crashing on impossible address space casts guarded by `__isGlobal/__isShared`, which partially fixes https://github.com/llvm/llvm-project/issues/112760 It's still possible to trigger the issue by using explicit AS casts w/o AS checks, but LLVM should no longer crash on valid code. This is #112964 + a small fix for the crash on unintended argument access which was the root cause for reverting the earlier version of the patch.	2024-10-31 09:24:51 -07:00
Yingwei Zheng	cf1963afad	[InstCombine] Fix FMF propagation in `foldSelectIntoOp` (#114356 ) Closes https://github.com/llvm/llvm-project/issues/113423.	2024-10-31 23:26:45 +08:00
Artem Belevich	04e876e6c6	Revert "[NVPTX] instcombine known pointer AS checks." (#114319 ) Reverts llvm/llvm-project#112964 Crashes MLIR: https://lab.llvm.org/buildbot/#/builders/138/builds/5665	2024-10-30 15:34:08 -07:00
Artem Belevich	1cecc58c3f	[NVPTX] instcombine known pointer AS checks. (#112964 ) The change improves the code in general and, as a side effect, avoids crashing on impossible address space casts guarded by `__isGlobal/__isShared`, which partially fixes https://github.com/llvm/llvm-project/issues/112760 It's still possible to trigger the issue by using explicit AS casts w/o AS checks, but LLVM should no longer crash on valid code.	2024-10-30 15:13:06 -07:00
Yingwei Zheng	18311093ab	[InstCombine] Do not fold `shufflevector(select)` if the select condition is a vector (#113993 ) Since `shufflevector` is not element-wise, we cannot fold it into a select when the select condition is a vector. For shufflevector that doesn't change the length, it doesn't crash, but it is still a miscompilation: https://alive2.llvm.org/ce/z/s8saCx Fixes https://github.com/llvm/llvm-project/issues/113986.	2024-10-29 10:39:07 +08:00
David Majnemer	902acde341	[InstCombine] Optimize away certain additions using modular arithmetic We can turn: ``` %add = add i8 %arg, C1 %and = and i8 %add, C2 %cmp = icmp eq i8 %and, C3 ``` into: ``` %and = and i8 %arg, C2 %cmp = icmp eq i8 %and, (C3 - C1) & C2 ``` This is only worth doing if the sequence is the sole user of the addition operation.	2024-10-28 22:51:35 +00:00
Matthias Braun	5903c6af44	InstCombine: Fold shufflevector(select) and shufflevector(phi) (#113746 ) - Transform `shufflevector(select(c, x, y), C)` to `select(c, shufflevector(x, C), shufflevector(y, C))` by re-using the `FoldOpIntoSelect` helper. - Transform `shufflevector(phi(x, y), C)` to `phi(shufflevector(x, C), shufflevector(y, C))` by re-using the `foldOpIntoPhi` helper.	2024-10-28 15:35:17 -07:00
Yingwei Zheng	f78610af3f	[InstCombine] Add function attribute `instcombine-no-verify-fixpoint` (#113822 ) This patch introduces a function attribute `instcombine-no-verify-fixpoint` to avoid disabling fix-point verification for unrelated tests in the same file. Address comment https://github.com/llvm/llvm-project/pull/112642#discussion_r1804714387.	2024-10-28 17:45:08 +08:00
Yingwei Zheng	5155c38cee	[InstCombine] Don't check uses of constant exprs (#113684 ) This patch skips constant expressions to avoid iterating over uses on other functions. Fix crash reported in https://github.com/llvm/llvm-project/pull/105510#issuecomment-2437521147.	2024-10-28 15:09:20 +08:00
David Majnemer	5d4a0d54b5	[InstCombine] Teach takeLog2 about right shifts, truncation and bitwise-and We left some easy opportunities for further simplifications. log2(trunc(x)) is simply trunc(log2(x)). This is safe if we know that trunc is NUW because it means that the truncation didn't drop any bits. It is also safe if the caller is OK with zero as a possible answer. log2(x >>u y) is simply `log2(x) - y`. log2(x & y) is a funny one. It comes up when doing something like: ``` unsigned int f(unsigned int x, unsigned int y) { unsigned char a = 1u << x; return y / a; } ``` LLVM would canonicalize this to: ``` %shl = shl nuw i32 1, %x %conv1 = and i32 %shl, 255 %div = udiv i32 %y, %conv1 ``` In cases like these, we can ignore the mask entirely. This is equivalent to `y >> x`.	2024-10-28 05:13:04 +00:00
ssijaric-nv	14db069468	[InstCombine] Fix a cycle when folding fneg(select) with scalable vector types (#112465 ) The two folding operations are causing a cycle for the following case with scalable vector types: define <vscale x 2 x double> @test_fneg_select_abs(<vscale x 2 x i1> %cond, <vscale x 2 x double> %b) { %1 = select <vscale x 2 x i1> %cond, <vscale x 2 x double> zeroinitializer, <vscale x 2 x double> %b %2 = fneg fast <vscale x 2 x double> %1 ret <vscale x 2 x double> %2 } 1) fold fneg: -(Cond ? C : Y) -> Cond ? -C : -Y 2) fold select: (Cond ? -X : -Y) -> -(Cond ? X : Y) 1) results in the following since '<vscale x 2 x double> zeroinitializer' passes the check for the immediate constant: %.neg = fneg fast <vscale x 2 x double> zeroinitializer %b.neg = fneg fast <vscale x 2 x double> %b %1 = select fast <vscale x 2 x i1> %cond, <vscale x 2 x double> %.neg, <vscale x 2 x double> %b.neg and so we end up going back and forth between 1) and 2). Attempt to fold scalable vector constants, so that we end up with a splat instead: define <vscale x 2 x double> @test_fneg_select_abs(<vscale x 2 x i1> %cond, <vscale x 2 x double> %b) { %b.neg = fneg fast <vscale x 2 x double> %b %1 = select fast <vscale x 2 x i1> %cond, <vscale x 2 x double> shufflevector (<vscale x 2 x double> insertelement (<vscale x 2 x double> poison, double -0.000000e+00, i64 0), <vscale x 2 x double> poison, <vscale x 2 x i32> zeroinitializer), <vscale x 2 x double> %b.neg ret <vscale x 2 x double> %1 }	2024-10-25 10:47:39 -07:00
Noah Goldstein	294726d738	Reapply "[InstCombine] Folding `(icmp eq/ne (and X, -P2), INT_MIN)`" (#111236 ) The underlying issue with msan was fixed by #113200	2024-10-23 09:12:08 -05:00
Alex MacLean	4c1b1f6d21	[NVPTX] Add support for clamped funnel shift intrinsics (#113228 ) Add support for ``llvm.nvvm.fshl.clamp`` and ``llvm.nvvm.fshr.clamp`` intrinsics. These intrinsics are similar to the generic llvm funnel shift, except that the shift value is clamped to the integer width. Currently only ``i32`` is supported and is implemented with the `shf.[rl].clamp.b32` PTX instruction.	2024-10-22 16:39:44 -07:00
Paul Walker	5bb34803a4	[NFC] Migrate tests to use autoupdate for CHECK lines.	2024-10-22 12:55:15 +00:00
c8ef	b90ea5caad	[ConstantFold] Fold `erf` and `erff` when the input parameter is a constant value. (#113079 ) This patch adds support for constant folding for the `erf` and `erff` libc functions.	2024-10-22 12:58:11 +08:00
Jake Egan	900b6369e2	[AIX][test] XFAIL constant folding log1p test Test added by commit 47a6da2d4dc7d996eb2678243ac566822d59e483 fails on the AIX bot. So XFAIL for now to investigate further.	2024-10-21 11:27:15 -04:00
XChy	a2ba438f3e	[InstCombine] Preserve the flag from RHS only if the `and` is bitwise (#113164 ) Fixes #113123 Alive proof: https://alive2.llvm.org/ce/z/hnqeLC	2024-10-21 22:30:31 +08:00
c8ef	1336e3d0b9	[ConstantFold] Fold `ilogb` and `ilogbf` when the input parameter is a constant value. (#113014 ) This patch adds support for constant folding for the `ilogb` and `ilogbf` libc functions.	2024-10-20 10:46:35 +08:00
Ramkumar Ramachandra	7b65971e1f	InstCombine: sink loads with invariant.load metadata (#112692 )	2024-10-18 10:35:56 +01:00
Danila Malyutin	1a609052b6	[AArch64][InstCombine] Eliminate redundant barrier intrinsics (#112023 ) If there are no memory ops on the path from one dmb to another then one barrier can be eliminated.	2024-10-17 21:04:04 +04:00
goldsteinn	c85611e858	[SimplifyLibCall][Attribute] Fix bug where we may keep `range` attr with incompatible type (#112649 ) In a variety of places we change the bitwidth of a parameter but don't update the attributes. The issue in this case is from the `range` attribute when inlining `__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an `i8`, and if the `i32` had a `range` attr associated it will cause an error. Fixes #112633	2024-10-17 10:32:55 -05:00
Nikita Popov	d9cd607200	[InstCombine] Add tests for #110919 (NFC)	2024-10-17 14:57:38 +02:00
Yingwei Zheng	095d49da76	[InstCombine] Set `samesign` when converting signed predicates into unsigned (#112642 ) Alive2: https://alive2.llvm.org/ce/z/6cqdt-	2024-10-17 20:43:48 +08:00
Yingwei Zheng	aad3a1630e	[ValueTracking] Respect `samesign` flag in `isKnownInversion` (#112390 ) In https://github.com/llvm/llvm-project/pull/93591 we introduced `isKnownInversion` and assumed `X` is poison implies `Y` is poison because they share common operands. But after introducing `samesign` this assumption no longer holds if `X` is an icmp with the `samesign` flag. Alive2 link: https://alive2.llvm.org/ce/z/rj3EwQ (Please run it locally with this patch and https://github.com/AliveToolkit/alive2/pull/1098). This approach is the most conservative way in my mind to address this problem. If `X` has `samesign` flag, it will check if `Y` also has this flag and make sure constant RHS operands have the same sign. Fixes https://github.com/llvm/llvm-project/issues/112350.	2024-10-17 00:27:21 +08:00
Ramkumar Ramachandra	682fa797b7	InstCombine/Select: remove redundant code (NFC) (#112388 ) InstCombinerImpl::foldSelectInstWithICmp has some inlined code for select-icmp-xor simplification, but this simplification is already done by other code, via another path: (X & Y) == 0 ? X : X ^ Y -> ((X & Y) == 0 ? 0 : Y) ^ X -> (X & Y) ^ X -> X & ~Y Cover the cases that it claims to simplify, and demonstrate that stripping it doesn't cause test changes.	2024-10-16 12:44:09 +01:00
Yingwei Zheng	0936195311	[InstCombine] Drop `samesign` in InstCombine (#112480 ) Closes https://github.com/llvm/llvm-project/issues/112476.	2024-10-16 19:13:52 +08:00
Yingwei Zheng	3bf2295ee0	[InstCombine] Drop `samesign` flag in `foldAndOrOfICmpsWithConstEq` (#112489 ) In `5dbfca30c1` we assume that RHS is poison implies LHS is also poison. It doesn't hold after introducing samesign flag. This patch drops the `samesign` flag on RHS if the original expression is a logical and/or. Closes #112467.	2024-10-16 16:24:44 +08:00
Alexey Bader	583fa4f5b7	[InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088 ) Today, InstCombine can fold fcmp+select patterns to minnum/maxnum intrinsics when the nnan and nsz flags are set. The ordering of the operands in both the fcmp and select instructions is important for the folding to occur. maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult} The second pattern is supposed to make the order of the operands in the select instruction irrelevant. However, the pattern matching code uses the CmpInst::getInversePredicate method to invert the comparison predicate. This method doesn't take into account the fast-math flags, which can lead to missing the folding opportunity. The patch extends the pattern matching code to handle unordered fcmp instructions. This allows the folding to occur even when the select instruction has the operands in the inverse order. New maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt} The same changes are applied to the minnum intrinsic.	2024-10-15 22:05:16 +04:00
c8ef	47a6da2d4d	[ConstantFold] Fold `log1p` and `log1pf` when the input parameter is a constant value. (#112113 ) This patch adds support for constant folding for the `log1p` and `log1pf` libc functions.	2024-10-16 00:19:26 +08:00
Yingwei Zheng	9b7491e866	[IR] Add support for `samesign` in `Operator::hasPoisonGeneratingFlags` (#112358 ) Fix https://github.com/llvm/llvm-project/issues/112356.	2024-10-15 23:07:16 +08:00
Ramkumar Ramachandra	1c6c850937	InstCombine: extend select-equiv to support vectors (#111966 ) foldSelectEquivalence currently doesn't support GVN-like replacements on vector types. Put in the checks for potentially lane-crossing operations, and lift the limitation.	2024-10-15 11:10:45 +01:00
Ramkumar Ramachandra	fe526ae99b	InstCombine/test: cover foldSelectValueEquivalence (#111694 ) Write dedicated tests for foldSelectValueEquivalence, demonstrating that it does not perform many GVN-like replacements when: - the comparison is a vector-type - the comparison is a floating-point type as a prelude to fixing these deficiencies.	2024-10-15 10:33:03 +01:00
Yingwei Zheng	8d8bb4032b	[Verifier] Verify attribute `denormal-fp-math[-f32]` (#112310 ) Some typos are also fixed. Address https://github.com/llvm/llvm-project/pull/112067#pullrequestreview-2363722447.	2024-10-15 17:32:16 +08:00
elhewaty	9efb07f261	[IR] Add `samesign` flag to icmp instruction (#111419 ) Inspired by https://discourse.llvm.org/t/rfc-signedness-independent-icmps/81423	2024-10-15 17:11:25 +08:00
Ramkumar Ramachandra	bdf241cab3	ValueTracking: handle more ops in isNotCrossLaneOperation (#112183 ) Reuse llvm::isTriviallyVectorizable in llvm::isNotCrossLaneOperation, in order to get it to handle more intrinsics. Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/XSV_GT	2024-10-14 14:08:12 +01:00
Yingwei Zheng	9edc454ee6	[InstCombine] Drop range attributes in `foldIsPowerOf2OrZero` (#112178 ) Closes https://github.com/llvm/llvm-project/issues/112078.	2024-10-14 20:52:55 +08:00
Ramkumar Ramachandra	c5f82f7893	ValueTracking: introduce llvm::isNotCrossLaneOperation (#112011 ) Factor out and unify common code from InstSimplify and InstCombine that partially guard against cross-lane vector operations into llvm::isNotCrossLaneOperation in ValueTracking. Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka	2024-10-14 11:37:30 +01:00
Yingwei Zheng	966bee739c	[InstCombine][NFC] Fix typo in is_fpclass.ll (#112067 ) This typo causes alive2 to crash.	2024-10-12 11:06:25 +08:00
Yingwei Zheng	6a65e98fa7	[InstCombine] Drop range attributes in `foldIsPowerOf2` (#111946 ) Fixes https://github.com/llvm/llvm-project/issues/111934.	2024-10-11 18:19:21 +08:00
braw-lee	3645c64d87	[SimplifyLibCalls] fdim constant fold (#109235 ) 2nd PR to fix #108695 based on #108702 --------- Signed-off-by: Kushal Pal <kushalpal109@gmail.com>	2024-10-10 14:44:39 +04:00
David Green	5184d763c7	[InstCombine] Convert @log to @llvm.log if the input is known positive. (#111428 ) Similar to 112aac4e8961b9626bb84f36deeaa5a674f03f5a, this converts log libcalls to llvm.log.f64 intrinsics if we know they do not set errno, as the input is not zero and not negative. As log will produce errno if the input is 0 (returning -inf) or if the input is negative (returning nan), we also perform the conversion when we have noinf and nonan.	2024-10-10 09:54:25 +01:00
c8ef	923566a67d	[ConstantFold] Fold `logb` and `logbf` when the input parameter is a constant value. (#111232 ) This patch adds support for constant folding for the `logb` and `logbf` libc functions.	2024-10-10 07:56:16 +08:00
David Green	587f31fb28	[InstCombine] Add a test for converting log to an intrinsic. NFC	2024-10-09 09:25:13 +01:00
Matt Arsenault	a8e1311a1c	[RFC] IR: Define noalias.addrspace metadata (#102461 ) This is intended to solve a problem with lowering atomics in OpenMP and C++ common to AMDGPU and NVPTX. In OpenCL and CUDA, it is undefined behavior for an atomic instruction to modify an object in thread private memory. In OpenMP, it is defined. Correspondingly, the hardware does not handle this correctly. For AMDGPU, 32-bit atomics work and 64-bit atomics are silently dropped. We therefore need to codegen this by inserting a runtime address space check, performing the private case without atomics, and fallback to issuing the real atomic otherwise. This metadata allows us to avoid this extra check and branch. Handle this by introducing metadata intended to be applied to atomicrmw, indicating they cannot access the forbidden address space.	2024-10-07 23:21:42 +04:00
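
The add/and/icmp rewrite described in commit 902acde341 can be checked exhaustively over all 256 `i8` values. The sketch below is a hypothetical Python helper, not code from the patch: `fold_is_valid` and its parameter names are assumptions, and the commit message does not spell out which preconditions the real transform checks. The helper only reports whether the rewritten comparison agrees with the original one for a given constant triple.

```python
def fold_is_valid(c1: int, c2: int, c3: int, bits: int = 8) -> bool:
    """Return True if ((x + c1) & c2) == c3 and (x & c2) == ((c3 - c1) & c2)
    agree for every x of the given bit width (hypothetical helper)."""
    mask = (1 << bits) - 1
    # Constant on the right-hand side of the rewritten compare, wrapped to `bits`.
    rhs_const = ((c3 - c1) & mask) & c2
    for x in range(1 << bits):
        original = (((x + c1) & mask) & c2) == c3
        rewritten = (x & c2) == rhs_const
        if original != rewritten:
            return False
    return True
```

With these definitions, `fold_is_valid(0x10, 0xF0, 0x20)` holds, while `fold_is_valid(1, 2, 2)` does not, because the addition can carry into the masked bit. The rewrite therefore only applies to certain constant combinations.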


8995 Commits
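
The `log2(x & y)` case from commit 5d4a0d54b5 (`y / (unsigned char)(1u << x)` behaving like `y >> x`) can be sanity-checked numerically. This is a minimal sketch in Python rather than the C of the commit message. `div_by_masked_shift` and `matches_shift` are hypothetical names, and the check is limited to shift amounts below 8, where the truncated divisor `(1 << x) & 255` is nonzero.

```python
def div_by_masked_shift(y: int, x: int) -> int:
    """Model `y / (unsigned char)(1u << x)` for 32-bit unsigned y and x < 8."""
    divisor = (1 << x) & 0xFF  # truncation to unsigned char
    return (y & 0xFFFFFFFF) // divisor


def matches_shift(samples, max_shift: int = 8) -> bool:
    """Check that the division agrees with a plain logical right shift."""
    return all(
        div_by_masked_shift(y, x) == (y & 0xFFFFFFFF) >> x
        for y in samples
        for x in range(max_shift)
    )
```

For shift amounts of 8 or more the truncated divisor is zero, so the original division is undefined and the sketch deliberately does not model it.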