Add a default-off option to the inline cost calculation to always inline
all viable calls, regardless of the cost/benefit and cost/threshold
calculations.
For performance reasons, some users require that all calls be inlined.
Rather than forcing them to adjust the inlining threshold to an
arbitrarily high value, offer an option to inline all calls.
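A minimal sketch of what such an option could look like, assuming a
hypothetical flag name (not necessarily the patch's actual spelling):

static cl::opt<bool> InlineAllViableCalls(
    "inline-all-viable-calls", cl::init(false), cl::Hidden,
    cl::desc("Inline every viable call site, ignoring the cost/benefit "
             "and cost/threshold calculations"));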
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
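For example, a declaration that previously relied on the redirection is
now spelled directly:

SmallSet<Instruction *, 8> Visited;    // before: resolves to SmallPtrSet
SmallPtrSet<Instruction *, 8> Visited; // after: says what it is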
Similarly to https://github.com/llvm/llvm-project/pull/131538, we can
also try to check whether a predicate is known to be false (i.e. the
AddRec is known to wrap) given the backedge-taken count.
For now, this just checks directly when we try to create predicated
AddRecs. This both helps to avoid spending compile-time on optimizations
where we know the predicate is false, and can also help to allow
additional vectorization (e.g. by deciding to scalarize memory accesses
when otherwise we would try to create a predicated AddRec with a
predicate that's always false).
The initial version is quite restricted, but can be extended in
follow-ups to cover more cases.
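An illustrative shape of such a case (a sketch, not one of the patch's
tests): with a backedge-taken count of 999, a no-wrap predicate on the
truncated i8 offset below is provably false, so creating a predicated
AddRec for it can be rejected up front:

void store_loop(short *Dst) {
  for (int I = 0; I < 1000; ++I)
    Dst[(unsigned char)I] = 0; // the i8 offset wraps after 256 iterations
}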
PR: https://github.com/llvm/llvm-project/pull/151134
In #149619, for the test of `@dot_follow_modulo_spec_2`, constant
folding the addition of two i32 values of 1073741824 causes an overflow
from 2^31 to -2^31 = -2147483648, which triggers the UB sanitizer. This
PR reapplies the previous PR, explicitly casting the addition operands
to int64_t first before performing the addition and then producing an
int32 number via
`Constant *C = get(cast<IntegerType>(Ty->getScalarType()), V, isSigned)`
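A minimal sketch of the widened fold, assuming the quoted getter is
ConstantInt::get and that LHS and RHS hold the two i32 operand values:

// Widen to int64_t so the addition cannot overflow i32; the getter then
// truncates the result back to the scalar integer type.
int64_t V = static_cast<int64_t>(LHS) + static_cast<int64_t>(RHS);
Constant *C = ConstantInt::get(cast<IntegerType>(Ty->getScalarType()), V, isSigned);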
As part of the Root Signature Spec, we need to validate that Root
Signatures do not define overlapping ranges.
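A hedged sketch of the overlap test itself (the struct and field names
are illustrative, not the actual FrontendHLSL API): two inclusive
register ranges overlap exactly when neither ends before the other
begins:

struct RangeInfo {
  uint32_t LowerBound, UpperBound; // inclusive register bounds
};
bool rangesOverlap(const RangeInfo &A, const RangeInfo &B) {
  return A.LowerBound <= B.UpperBound && B.LowerBound <= A.UpperBound;
}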
Closes: https://github.com/llvm/llvm-project/issues/126645
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
Emit safety guards for pointer accesses when cross-partition loads exist
that have a corresponding store to the same address in a different
partition; this emits the necessary runtime pointer checks for these
accesses.
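An illustrative shape of the problem (a sketch, not the actual SuperTest
input): after distribution, the store and the load below land in
different partitions but may touch the same memory, so a runtime pointer
check must guard them:

void f(int *A, int *B, int *C, long Off, long N) {
  for (long I = 0; I < N; ++I) {
    A[I] = B[I] * 2;       // partition 1: store to A
    C[I] = A[I + Off] + 1; // partition 2: load from A at another offset
  }
}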
The test case was obtained from SuperTest, which SiFive runs regularly.
We enabled LoopDistribution by default in our downstream compiler; this
change was part of that enablement.
This patch adds a cost kind to `getAddressComputationCost()` for #149955.
Note that this patch also removes all the default values in `getAddressComputationCost()`.
This patch refactors `exactSIVtest` and `exactRDIVtest` by consolidating
duplicated logic into a single function. As in #152688, the main goal
is to improve code maintainability, since extra validation logic (as
noted in the TODO comments) may be necessary.
This patch refactors `gcdMIVtest` by consolidating duplicated logic into
a single function. The main goal of this change is to improve code
maintainability rather than readability, especially since we may need to
revise this logic for correctness (as noted in the added TODO comments).
I hope this patch is NFC, but I've also added several new assertions,
which may cause some previously passing cases to fail.
In some places we were passing the type of the value being accessed; in
other cases we were passing the type of the pointer for the access.
The most "involved" user is
LoopVectorizationCostModel::getMemInstScalarizationCost, which is the
only call site that passes in the SCEV, and it passes along the pointer
type.
This changes call sites to consistently pass the pointer type, and
renames the arguments to clarify this.
No target actually inspects the type passed beyond checking whether it
is a vector, so this change should have no functional effect.
For the attached test: before the loop-idiom pass, we have a store in
the inner loop that is considered simple and free of side effects on the
loop. After the loop-idiom pass, we get a memset in the outer loop that
is considered to introduce side effects on the loop. This changes the
backedge-taken count before and after the pass and hence causes the
crash with verify-scev.
To fix the issue, we treat non-volatile memory intrinsics as having no
side effects relevant to forward progress.
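An illustrative nest with this shape (a sketch, not the attached test):
loop-idiom turns the inner store loop into a memset in the outer loop:

void zero_rows(char *P, long N, long M) {
  for (long I = 0; I < N; ++I)   // outer loop
    for (long J = 0; J < M; ++J) // inner loop becomes memset(P + I*M, 0, M)
      P[I * M + J] = 0;
}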
Fixes #149377
A coroutine function may be split into a ramp function and a resume
function, which have different stack frames. A pointer to a stack object
may therefore have a different address depending on where it is used, so
it is not loop-invariant.
This temporarily fixes https://github.com/llvm/llvm-project/issues/149604.
Unlike ptrtoint, ptrtoaddr does not capture provenance, only the address.
Note: As defined by the LangRef, we always treat `ptrtoaddr` as a
location-independent address capture since it is a direct inspection of the
pointer address.
Reviewed By: nikic
Pull Request: https://github.com/llvm/llvm-project/pull/152221
This introduces a new `ptrtoaddr` instruction which is similar to
`ptrtoint` but has two differences:
1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance
2) `ptrtoaddr` only extracts (and then extends/truncates) the low
index-width bits of the pointer
For most architectures, difference 2) does not matter since the index
(address) width and the pointer representation width are the same, but
it does make a difference for architectures whose pointers aren't just
plain integer addresses, such as AMDGPU fat pointers or CHERI
capabilities.
This commit introduces textual and bitcode IR support as well as basic code
generation, but optimization passes do not handle the new instruction yet
so it may result in worse code than using ptrtoint. Follow-up changes will
update capture tracking, etc. for the new instruction.
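A hedged illustration of difference 2) in plain C++ rather than IR: on a
target whose pointer representation is 128 bits but whose index
(address) width is 64 bits, `ptrtoaddr` yields only the low 64 address
bits, whereas `ptrtoint` exposes the full representation:

#include <cstdint>
// Not compiler code: models extracting the index-width address bits
// from a 128-bit pointer representation.
uint64_t addressBits(unsigned __int128 PtrRepr) {
  return static_cast<uint64_t>(PtrRepr); // keep only the low 64 bits
}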
RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54
Reviewed By: nikic
Pull Request: https://github.com/llvm/llvm-project/pull/139357
When InstTy is a type like IntrinsicInst, which can have a variable
number of arguments, we can encounter a case where Operation has fewer
than two arguments and errors at the getOperand() calls.
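A hedged sketch of the guard (the patch's exact predicate may differ):
bail out before the `getOperand()` calls when the matched operation does
not carry at least two operands:

// Variadic InstTys such as IntrinsicInst may supply fewer operands than
// the pattern expects, so check before calling getOperand(0)/(1).
if (Operation->getNumOperands() < 2)
  return false;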
Fixes: https://github.com/llvm/llvm-project/issues/152725.
The existing functions `getIndexExpressionsFromGEP` and
`tryDelinearizeFixedSizeImpl` provide functionality to delinearize
memory accesses for fixed-size arrays. They use the GEP source element
type in their optimization heuristics. However, driving optimization
heuristics based on GEP type information is not allowed.
This patch introduces new functions `findFixedSizeArrayDimensions` and
`delinearizeFixedSizeArray` to delinearize a fixed size array without
using the type information in GEP. The new function
`findFixedSizeArrayDimensions` infers the size of each dimension of the
array based on the value to be added to the address as induction
variables are incremented. `delinearizeFixedSizeArray` attempts to
restore the subscripts of each dimension based on the estimated array
size.
This is an initial implementation that may not cover all cases, but is
intended to replace the existing function in the future.
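An illustrative access pattern (a sketch, not one of the patch's tests):
the inner dimension size of 100 must be inferred from the 100-element
stride added per step of `I`, rather than read from the GEP source
element type:

void zero2d(int A[][100], int N) {
  for (int I = 0; I < N; ++I)
    for (int J = 0; J < 100; ++J)
      A[I][J] = 0; // linear offset I*100 + J; recover subscripts {I, J}
}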
Related:
- https://discourse.llvm.org/t/enabling-loop-interchange/82589/4
- https://github.com/llvm/llvm-project/pull/124911#issuecomment-2962499501
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
Delinearization provides two values: the size of the array, and the
subscript of the access. DA checks their validity (`0 <= subscript <
size`), with some special handling. In particular, to ensure `subscript
< size`, it calculates the maximum value of `subscript - size` and
checks that it is negative. There was an issue in this process: when
`subscript - size` is expressed in an affine form like `init + step *
i`, the value in the last iteration (`init + step * (num_iterations -
1)`) was assumed to be the maximum value. This assumption is incorrect
in the following cases:
- When `step` is negative (the expression then decreases, so its
  maximum is at the first iteration, not the last)
- When the AddRec wraps
This patch introduces extra checks on the sign of `step` and verifies
the existence of nsw/nuw flags.
Also, `isKnownNonNegative(S - smax(1, Size))` was used as a regular
check, which is incorrect when `Size` is negative. This patch replaces
it with `isKnownNonNegative(S - Size)`, although it's still unclear
whether using `isKnownNonNegative` is appropriate in the first place.
Fixes #150604
Changes: The original patch, landed as 1336675, was reverted due to a
bug in LoopVectorize resulting in a crash. The bug has now been fixed by
95c32bf ([VPlan] Return invalid cost if any skeleton block has invalid
costs), and this reland is identical to the original patch.
Split out from #150248:
Since #150944 the size passed to lifetime.start/end is considered
meaningless. The lifetime always applies to the whole alloca.
This adjusts MemoryLocation to determine the MemoryLocation size from
the alloca size, instead of using the argument.
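A minimal sketch of the new behavior, assuming the location's pointer
argument is the alloca itself (not the patch's exact code):

// For lifetime.start/end, derive the location size from the alloca's
// allocation size instead of the intrinsic's (now meaningless) size
// operand.
if (auto *AI = dyn_cast<AllocaInst>(Arg))
  if (std::optional<TypeSize> Size = AI->getAllocationSize(DL))
    return MemoryLocation(AI, LocationSize::precise(*Size));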
Split out from #150248:
Since #150944 the size passed to lifetime.start/end is considered
meaningless. The lifetime always applies to the whole alloca.
Accordingly remove handling for size mismatch in the StackLifetime
analysis.
There is a larger problem here in that we should not be performing
arbitrary pointer replacements for assumes. This is handled for
branches, but assume goes through a different code path.
Fixes https://github.com/llvm/llvm-project/issues/151785.
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.
It's worth noting that this does not require any conservative
assumptions; lifetimes with poison arguments can simply be skipped.
Fixes https://github.com/llvm/llvm-project/issues/151119.
The `nvvm_round` intrinsic should round to the nearest even number in
the case of ties. It lowers to PTX `cvt.rni`, which will "round to
nearest integer, choosing even integer if source is equidistant between
two integers", so it matches the semantics of `rint` (and not `round` as
the name suggests).
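A quick check of the tie-breaking difference (standard C++, assuming the
default round-to-nearest FP environment):

#include <cassert>
#include <cmath>
int main() {
  // rint: ties to even, matching PTX cvt.rni.
  assert(std::rint(2.5) == 2.0 && std::rint(3.5) == 4.0);
  // round: ties away from zero, as the intrinsic's name would suggest.
  assert(std::round(2.5) == 3.0 && std::round(3.5) == 4.0);
}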
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.
An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
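For reference, the usual estimate derived from latch `branch_weights` (a
sketch of the standard formula, not necessarily the pass's exact code):

#include <cstdint>
// With latch weights (BackedgeWeight, ExitWeight), each loop entry runs
// about BackedgeWeight/ExitWeight backedges plus one exiting iteration.
unsigned estimateTripCount(uint64_t BackedgeWeight, uint64_t ExitWeight) {
  if (ExitWeight == 0)
    return 0; // no estimate possible
  return static_cast<unsigned>(BackedgeWeight / ExitWeight) + 1;
}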
We extract the binding logic out of the DXILResource analysis passes into the
FrontendHLSL library. This will allow us to use this logic for resource and
root signature bindings in both the DirectX backend and the HLSL frontend.
Using GEP to index into a vector is not disallowed, but not recommended.
The SPIR-V backend needs to generate structured access into types, which
is impossible with an untyped GEP instruction unless we add more info to
the IR. Finding a solution is a work in progress, but in the meantime,
we'd like to reduce the number of failures.
Preventing this optimization from rewriting extract/insert instructions
into a GEP helps us lower more code to SPIR-V. This change should be OK,
as it is only active when targeting SPIR-V and only disables a
non-recommended transformation.
Related to #145002
Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not
unsigned-wrap.
For example, with i32 A = 8 and i8 B where B >= 8, `8 + zext(-8 + B)`
simplifies to `zext(B)`.
The actual code supports inner adds with an arbitrary number of
operands, hence the getAddExpr in the return.
This helps SCEV reasoning in some cases, commonly when adding an offset
to a zero-extended SCEV that subtracts the same offset.
Note that this is restricted to cases where we can fold away an operand
of the inner Add. This is needed to avoid bad interactions with the
patterns used when forming ZExts, which try to push the ZExt to the
add's operands.
https://alive2.llvm.org/ce/z/q7d303
PR: https://github.com/llvm/llvm-project/pull/151227
This is a follow-on to
https://github.com/llvm/llvm-project/pull/115407, which introduced code
that bypasses the splat handling for scalable vectors. To maintain
existing tests, I have moved the early return until after the splat
handling so that all vector types are treated equally.
This PR adds a new interface to IRBuilder called CreateVectorInterleave,
which can be used to create vector.interleave intrinsics of factors 2-8.
For convenience I have also moved getInterleaveIntrinsicID and
getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp, where
they can be used by IRBuilder.
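A hedged usage sketch (the exact signature may differ from the landed
interface):

// Interleave two vectors of the same type via the new helper; the
// interleave factor is implied by the number of operands (2-8).
IRBuilder<> Builder(InsertPt);
Value *Interleaved = Builder.CreateVectorInterleave({V0, V1});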