For value-accumulating recurrences of the kind:
```
%umax.acc = phi i8 [ %umax, %backedge ], [ %a, %entry ]
%umax = call i8 @llvm.umax.i8(i8 %umax.acc, i8 %b)
```
The binary intrinsic may be simplified into a single intrinsic call taking
the init value and the other operand, provided the latter is loop-invariant:
```
%umax = call i8 @llvm.umax.i8(i8 %a, i8 %b)
```
Proofs: https://alive2.llvm.org/ce/z/ea2cVC.
Fixes: https://github.com/llvm/llvm-project/issues/145875.
With the advent of intrinsic-less debug-info, we no longer need to
scatter calls to getPrevNonDebugInstruction around the codebase. Remove
most of them -- there are one or two that have the "SkipPseudoOp" flag
turned on, but they don't seem to be in positions where skipping
anything would be reasonable.
Try to optimize a call to the result of a ptrauth intrinsic, potentially
into the ptrauth call bundle:
call(ptrauth.resign(p)), ["ptrauth"()] -> call p, ["ptrauth"()]
call(ptrauth.sign(p)), ["ptrauth"()] -> call p
as long as the key/discriminator are the same in sign and auth-bundle,
and we don't change the key in the bundle (to a potentially-invalid
key).
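A minimal IR sketch of the resign case (names, keys, and discriminators
are illustrative; `%signed0` is assumed to be signed with key 0 and
discriminator `%d0`):
```
declare i64 @llvm.ptrauth.resign(i64, i32, i64, i32, i64)

define void @caller(i64 %signed0, i64 %d0, i64 %d1) {
  ; resign from (key 0, disc %d0) to (key 0, disc %d1), then call
  ; through a "ptrauth" bundle matching the new schema
  %signed1 = call i64 @llvm.ptrauth.resign(i64 %signed0, i32 0, i64 %d0, i32 0, i64 %d1)
  %f = inttoptr i64 %signed1 to ptr
  call void %f() [ "ptrauth"(i32 0, i64 %d1) ]
  ; -> the resign folds into the bundle, which now authenticates the
  ;    original value with the old (key 0, disc %d0) schema:
  ;      %g = inttoptr i64 %signed0 to ptr
  ;      call void %g() [ "ptrauth"(i32 0, i64 %d0) ]
  ret void
}
```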
Generating a plain call to a raw unauthenticated pointer is generally
undesirable, but if we ended up seeing a naked ptrauth.sign in the first
place, we already have suspicious code. Unauthenticated calls are also
easier to spot than naked signs, so let the indirect call shine.
Note that there is an arguably unsafe extension to this, where we don't
bother checking that the key in bundle and intrinsic are the same (and
also allow folding away an auth into a bundle).
This can end up generating calls with a bundle that has an invalid key
(which an informed frontend wouldn't have otherwise done), which can be
problematic. The C that generates that is straightforward but arguably
unreasonable. That wouldn't be an issue if we were to bite the bullet
and make these fully AArch64-specific, allowing key knowledge to be
embedded here.
Try to optimize a call to a ptrauth constant into its ptrauth bundle:
call(ptrauth(f)), ["ptrauth"()] -> call f
as long as the key/discriminator are the same in constant and bundle.
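A minimal sketch, with a hypothetical callee `@f` and made-up
key/discriminator values:
```
declare void @f()

define void @caller() {
  ; the ptrauth constant and the bundle agree on (key 0, disc 42)
  call void ptrauth (ptr @f, i32 0, i64 42)() [ "ptrauth"(i32 0, i64 42) ]
  ; -> call void @f()
  ret void
}
```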
This is the intrinsic version of #146349, and handles fabs as well as
other intrinsics.
It's largely a copy of InstCombinerImpl::foldShuffledIntrinsicOperands
but a bit simpler since we don't need to find a common mask.
Creating a separate function seems to be cleaner than trying to shoehorn
it into the existing one.
This follows on from
https://github.com/llvm/llvm-project/pull/144933#issuecomment-2992372627,
and allows us to remove the reverse (fneg (reverse x)) combine.
A separate patch will handle the case for fabs. I haven't checked if we
perform this canonicalization for either unops or binops for vp.reverse.
This canonicalizes fneg/fabs (shuffle X, poison, mask) -> shuffle
(fneg/fabs X), poison, mask.
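A minimal before/after sketch (the reverse mask is chosen purely for
illustration):
```
define <4 x float> @src(<4 x float> %x) {
  %s = shufflevector <4 x float> %x, <4 x float> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  %r = fneg <4 x float> %s
  ret <4 x float> %r
}

; is canonicalized to:
define <4 x float> @tgt(<4 x float> %x) {
  %n = fneg <4 x float> %x
  %r = shufflevector <4 x float> %n, <4 x float> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  ret <4 x float> %r
}
```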
This undoes part of b331a7ebc1e02f9939d1a4a1509e7eb6cdda3d38 and
a8f13dbdeb31be37ee15b5febb7cc2137bbece67, but keeps the binary shuffle
case i.e. shuffle fneg, fneg, mask.
By pulling out the shuffle we bring this in line with the same
canonicalisation we perform on binary ops and intrinsics, even though
the original commit acknowledged it was going in the opposite direction.
However nowadays VectorCombine is more powerful and can do more
optimisations when the shuffle is pulled out, so I think we should
revisit this. In particular we get more shuffles folded and can perform
scalarization.
This simply copies the structure of the vector.reverse patterns from
just above, and reimplements them for the vp.reverse intrinsics when the
mask is all ones and the EVLs exactly match.
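For illustration, a sketch of one plausible instance -- eliminating a
vp.reverse pair when both calls use an all-ones mask and the same EVL:
```
declare <vscale x 4 x i32> @llvm.experimental.vp.reverse.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i1>, i32)

define <vscale x 4 x i32> @rev_rev(<vscale x 4 x i32> %x, i32 %evl) {
  %a = call <vscale x 4 x i32> @llvm.experimental.vp.reverse.nxv4i32(<vscale x 4 x i32> %x, <vscale x 4 x i1> splat (i1 true), i32 %evl)
  %b = call <vscale x 4 x i32> @llvm.experimental.vp.reverse.nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i1> splat (i1 true), i32 %evl)
  ; -> %b simplifies back to %x
  ret <vscale x 4 x i32> %b
}
```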
It's unfortunate that we have three different ways to represent a reverse
(shuffle, vector.reverse, and vp.reverse), but I don't see an obvious way
to remove any of them because the semantics are slightly different.
This significantly improves vectorization in TSVC_2's s112 and s1112
loops when using EVL tail folding.
We canonicalize reverse to after a binop in foldVectorBinop, and
simplify reverse pairs in InstSimplify, so these elimination transforms
are redundant.
This addresses a TODO in foldShuffledIntrinsicOperands to use
isTriviallyVectorizable instead of a hardcoded list of intrinsics, which
in turn allows more intrinsics to be scalarized by VectorCombine.
From what I can tell every intrinsic here should be speculatable, so an
assertion was added.
Because this enables intrinsics like abs which have a scalar operand, we
need to also check isVectorIntrinsicWithScalarOpAtArg.
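For illustration, a sketch with `llvm.abs`, whose second operand is a
scalar flag and must therefore be left out of the unshuffling:
```
declare <4 x i32> @llvm.abs.v4i32(<4 x i32>, i1 immarg)

define <4 x i32> @src(<4 x i32> %x) {
  %s = shufflevector <4 x i32> %x, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  %a = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %s, i1 false)
  ret <4 x i32> %a
}

; -> the shuffle is pulled out past the intrinsic; the scalar i1
;    operand is carried over unchanged:
define <4 x i32> @tgt(<4 x i32> %x) {
  %a = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %x, i1 false)
  %s = shufflevector <4 x i32> %a, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  ret <4 x i32> %s
}
```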
We currently combine (AES (EOR (A, B)), 0) into (AES A, B) for Neon
intrinsics when the zero operand appears in the RHS of the AES
instruction.
This patch extends the combine to support the SVE AES intrinsics and
the case where the zero operand appears in the LHS of the AES
instruction.
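A sketch of the Neon case (the SVE intrinsics fold analogously), with the
zero operand on the RHS; the mirrored LHS form is now handled too:
```
declare <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8>, <16 x i8>)

define <16 x i8> @src(<16 x i8> %a, <16 x i8> %b) {
  %x = xor <16 x i8> %a, %b
  %r = call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> %x, <16 x i8> zeroinitializer)
  ret <16 x i8> %r
}

; -> since AESE xors its two inputs before the round operations:
;    %r = call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> %a, <16 x i8> %b)
```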
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique or writing an easily-invalidable KnownBits
analysis, make the Depth argument in the ValueTracking APIs uniformly the
last argument with a default value. This will aid in removing the
argument when the time comes, as many callers that previously passed 0
explicitly have now been updated to omit the argument altogether.
We currently pull shuffles through binops and intrinsics, which is an
important canonical form for VectorCombine to be able to scalarize
vector sequences. But while binops can be folded with a constant
operand, intrinsics currently require all operands to be shufflevectors.
This extends intrinsic folding to be in line with regular binops by
reusing the constant "unshuffling" logic.
As far as I can tell, the list of currently folded intrinsics doesn't
require any special UB handling.
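For illustration, a hedged sketch with `llvm.smax` (constants worked out
by hand; the mask <1,0,3,2> happens to be its own inverse):
```
declare <4 x i32> @llvm.smax.v4i32(<4 x i32>, <4 x i32>)

define <4 x i32> @src(<4 x i32> %x) {
  %s = shufflevector <4 x i32> %x, <4 x i32> poison, <4 x i32> <i32 1, i32 0, i32 3, i32 2>
  %r = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %s, <4 x i32> <i32 1, i32 2, i32 3, i32 4>)
  ret <4 x i32> %r
}

; -> the constant is "unshuffled" and the shuffle sinks below the call:
define <4 x i32> @tgt(<4 x i32> %x) {
  %m = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %x, <4 x i32> <i32 2, i32 1, i32 4, i32 3>)
  %r = shufflevector <4 x i32> %m, <4 x i32> poison, <4 x i32> <i32 1, i32 0, i32 3, i32 2>
  ret <4 x i32> %r
}
```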
This change in combination with #138095 and #137823 fixes the following
C:
```c
void max(int *x, int *y, int n) {
  for (int i = 0; i < n; i++)
    x[i] += *y > 42 ? *y : 42;
}
```
Previously this used the splatted vector form on RISC-V with
`-O3 -march=rva23u64`:
```asm
vmv.s.x v8, a4
li a4, 42
vmax.vx v10, v8, a4
vrgather.vi v8, v10, 0
.LBB0_9: # %vector.body
# =>This Inner Loop Header: Depth=1
vl2re32.v v10, (a5)
vadd.vv v10, v10, v8
vs2r.v v10, (a5)
```
whereas it now generates:
```asm
li a6, 42
max a6, a4, a6
.LBB0_9: # %vector.body
# =>This Inner Loop Header: Depth=1
vl2re32.v v8, (a5)
vadd.vx v8, v8, a6
vs2r.v v8, (a5)
```
Migrate their usage to the `AnyMem*Inst` family, and add an isAtomic()
query on the base class for that hierarchy. This matches the idioms we
use for e.g. isAtomic on load, store, etc. instructions, the existing
isVolatile idioms on mem* routines, and allows us to more easily share
code between atomic and non-atomic variants.
As with #138568, the goal here is to simplify the class hierarchy and
make it easier to reason about. I'm moving from easiest to hardest, and
will stop at some point when I hit "good enough". Longer term, I'd sorta
like to merge or reverse the naming on the plain Mem*Inst and the
AnyMem*Inst, but that's a much larger and more risky change. Not sure
I'm going to actually do that.
Use an existing helper function. Remove the use of a local Changed
variable which doesn't seem to interact with surrounding transforms in
any meaningful way. (Both memcpy and memmove are MemTransfer
instructions, so switching from one to the other doesn't change
results.)
Posted for review mostly for a sanity check that I'm not missing
something with the logic around the Changed flag.
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
This PR introduces the following transformations:
- If C0 is not 0:
  `umax(nuw_shl(x, C0), x + 1) -> x == 0 ? 1 : nuw_shl(x, C0)`
- If C0 is not 0 or 1:
  `umax(nuw_mul(x, C0), x + 1) -> x == 0 ? 1 : nuw_mul(x, C0)`
Fixes #122388.
Alive2 proof: https://alive2.llvm.org/ce/z/rkp_8U
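A minimal IR sketch of the shl case with C0 = 3 (names illustrative):
```
declare i8 @llvm.umax.i8(i8, i8)

define i8 @src(i8 %x) {
  %shl = shl nuw i8 %x, 3
  %add = add i8 %x, 1
  %r = call i8 @llvm.umax.i8(i8 %shl, i8 %add)
  ret i8 %r
}

; -> for %x != 0 the nuw flag guarantees %shl >= %x + 1, and for
;    %x == 0 the umax evaluates to 1:
define i8 @tgt(i8 %x) {
  %shl = shl nuw i8 %x, 3
  %is0 = icmp eq i8 %x, 0
  %r = select i1 %is0, i8 1, i8 %shl
  ret i8 %r
}
```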
- **[InstCombine] Add tests for folding `(ct{t,l}z Pow2)`; NFC**
- **[InstCombine] Fold `(ct{t,l}z Pow2)` -> `Log2(Pow2)`**
We do this so we can find `Log2(Pow2)` for "free" with `takeLog2`:
https://alive2.llvm.org/ce/z/CL77fo
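For illustration, with a power of two of the form `1 << %x`:
```
declare i8 @llvm.cttz.i8(i8, i1 immarg)

define i8 @src(i8 %x) {
  %p = shl nuw i8 1, %x              ; %p is a power of two
  %r = call i8 @llvm.cttz.i8(i8 %p, i1 true)
  ret i8 %r
}

; -> cttz of a power of two is its log2, which takeLog2 recovers as %x:
define i8 @tgt(i8 %x) {
  ret i8 %x
}
```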
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.
The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr associated with it, that will
cause an error.
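A rough sketch of the failure mode (values are hypothetical; the `range`
syntax follows the LangRef parameter attribute):
```
; before: the i32 value operand carries a range attribute
%r = call ptr @__memset_chk(ptr %p, i32 range(i32 0, 100) %c, i64 %n, i64 -1)

; optimizeMemSetChk lowers this to a memset on a truncated i8 value;
; carrying the i32-typed range attribute over to the new i8 argument
; would be invalid IR, so such attributes must be dropped or adjusted:
%t = trunc i32 %c to i8
call void @llvm.memset.p0.i64(ptr %p, i8 %t, i64 %n, i1 false)
```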
Fixes #112633
Factor out and unify common code from InstSimplify and InstCombine that
partially guard against cross-lane vector operations into
llvm::isNotCrossLaneOperation in ValueTracking.
Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation for
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In the future, the ArrayRef(std::nullopt_t)
constructor could be deprecated or removed.