llvm-project

Author	SHA1	Message	Date
Shoreshen	00ee53cc7b	[Attributor] Propagate alignment through ptrmask (#150158 ) Propagate alignment through ptrmask based on potential constant values of mask and align of ptr. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>	2025-11-04 12:26:17 +08:00
Kazu Hirata	707bab651f	[llvm] Remove redundant typename (NFC) (#166087 ) Identified with readability-redundant-typename.	2025-11-02 13:15:16 -08:00
Kazu Hirata	042ac912b1	[llvm] Add "override" where appropriate (NFC) (#165168 ) Note that "override" makes "virtual" redundant. Identified with modernize-use-override.	2025-10-26 13:34:32 -07:00
Kazu Hirata	ae78957112	[Support] Rename CTLog2 to ConstantLog2 in MathExtras.h (#158006 ) This patch renames CTLog2 to ConstantLog2 for readability. This patch provides a forwarder under LLVM_DEPRECATED because CTLog2 is used downstream.	2025-09-11 07:54:27 -07:00
Philip Reames	e6b4a21849	[IR] Add utilities for manipulating length of MemIntrinsic [nfc] (#153856 ) Goal is simply to reduce direct usage of getLength and setLength so that if we end up moving memset.pattern (whose length is in elements) there are fewer places to audit.	2025-08-20 13:50:11 -07:00
Kazu Hirata	228e96b28a	[llvm] Use std::make_optional (NFC) (#151627 ) std::make_optional<T> is a lot like std::make_unique<T> in that it performs perfect forwarding of arguments for T's constructor. As a result, we don't have to repeat type names twice.	2025-08-01 00:24:40 -07:00
Jeremy Morse	57a5f9c47e	[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383 ) There are no longer debug-info instructions, thus we don't need this skipping. Hooray!	2025-07-15 15:34:10 +01:00
Shoreshen	181b014c06	Attributor: Infer noalias.addrspace metadata for memory instructions (#136553 ) Add noalias.addrspace metadata for store, load and atomic instruction in AMDGPU backend.	2025-07-08 09:50:31 +08:00
Andreas Jonson	0a067dc107	[Attributor] Swap range metadata to attribute for calls. (#108835 )	2025-07-05 16:47:03 +02:00
zGoldthorpe	f393211454	[Reland][IPO] Added attributor for identifying invariant loads (#146584 ) Patched and tested the `AAInvariantLoadPointer` attributor from #141800, which identifies pointers whose loads are eligible to be marked as `!invariant.load`. The bug in the attributor was due to `AAMemoryBehavior` always identifying pointers obtained from `alloca`s as having no writes. I'm not entirely sure why `AAMemoryBehavior` behaves this way, but it seems to be because it identifies the scope of an `alloca` to be limited to only that instruction (and, certainly, no memory writes occur within the `alloca` instruction). This patch just adds a check to disallow all loads from `alloca` pointers from being marked `!invariant.load` (since any well-defined program will have to write to stack pointers at some point).	2025-07-01 17:46:19 -04:00
zGoldthorpe	00ae89a1cb	Revert "[IPO] Added attributor for identifying invariant loads" (#144808 ) Reverts llvm/llvm-project#141800 The implementation critically misunderstands the `AAMemoryBehavior` attributor, which it relies on heavily. @shiltian, since I do not have commit permissions.	2025-06-18 18:35:01 -04:00
zGoldthorpe	25dcd231bf	[IPO] Added attributor for identifying invariant loads (#141800 ) The attributor conservatively marks pointers whose loads are eligible to be marked as `!invariant.load`. It does so by identifying: 1. Pointers marked `noalias` and `readonly` 2. Pointers whose underlying objects are all eligible for invariant loads. The attributor then manifests this attribute at non-atomic non-volatile load instructions.	2025-06-16 11:16:47 -05:00
Shilei Tian	f32b75658f	[Attributor] Use known non-flat AS before `getAssumedAddrSpace` (#143221 ) If the underlying object already has a non-flat address space, we simply use that before calling `getAssumedAddrSpace`. Partially fixes SWDEV-536263.	2025-06-09 10:11:34 -04:00
Kazu Hirata	54d836a080	[llvm] Use *Set::insert_range (NFC) (#138237 )	2025-06-02 19:48:13 -07:00
Shilei Tian	4d48673562	Reapply "Reapply "[AMDGPU] Make `getAssumedAddrSpace` return AS1 for pointer kernel arguments (#137488 )"" This reverts commit 37ea3b32cdcb6c0dcecbcc4bf844f5190c7378dd.	2025-05-30 22:11:22 -04:00
Shilei Tian	37ea3b32cd	Revert "Reapply "[AMDGPU] Make `getAssumedAddrSpace` return AS1 for pointer kernel arguments (#137488 )"" This reverts commit 4efc13f8ff1eaf4f9fb1fcea8d4552b3eca052ca.	2025-05-30 22:06:16 -04:00
Shilei Tian	4efc13f8ff	Reapply "[AMDGPU] Make `getAssumedAddrSpace` return AS1 for pointer kernel arguments (#137488 )" This reverts commit 3c6211c183885afb5d89259a53c4f4f46a6bf399.	2025-05-30 21:56:24 -04:00
Shilei Tian	3c6211c183	Revert "[AMDGPU] Make `getAssumedAddrSpace` return AS1 for pointer kernel arguments (#137488 )" This reverts commit 9bf6b2a8cb0467b62173659306e43a0346f063a2.	2025-05-30 21:15:25 -04:00
Shilei Tian	9bf6b2a8cb	[AMDGPU] Make `getAssumedAddrSpace` return AS1 for pointer kernel arguments (#137488 )	2025-05-30 17:30:42 -04:00
Alex MacLean	3a84a4e55d	Reland "[NVPTX] Unify and extend barrier{.cta} intrinsic support" (#141143 ) Note: This relands #140615 adding a ".count" suffix to the non-".all" variants. Our current intrinsic support for barrier intrinsics is confusing and incomplete, with multiple intrinsics mapping to the same instruction and intrinsic names not clearly conveying intrinsic semantics. Further, we lack support for some variants. This change unifies the IR representation to a single consistently named set of intrinsics. - llvm.nvvm.barrier.cta.sync.aligned.all(i32) - llvm.nvvm.barrier.cta.sync.aligned.count(i32, i32) - llvm.nvvm.barrier.cta.arrive.aligned.count(i32, i32) - llvm.nvvm.barrier.cta.sync.all(i32) - llvm.nvvm.barrier.cta.sync.count(i32, i32) - llvm.nvvm.barrier.cta.arrive.count(i32, i32) The following Auto-Upgrade rules are used to maintain compatibility with IR using the legacy intrinsics: * llvm.nvvm.barrier0 --> llvm.nvvm.barrier.cta.sync.aligned.all(0) * llvm.nvvm.barrier.n --> llvm.nvvm.barrier.cta.sync.aligned.all(x) * llvm.nvvm.bar.sync --> llvm.nvvm.barrier.cta.sync.aligned.all(x) * llvm.nvvm.barrier --> llvm.nvvm.barrier.cta.sync.aligned.count(x, y) * llvm.nvvm.barrier.sync --> llvm.nvvm.barrier.cta.sync.all(x) * llvm.nvvm.barrier.sync.cnt --> llvm.nvvm.barrier.cta.sync.count(x, y)	2025-05-22 19:38:10 -07:00
Alex Maclean	e72d8b2553	Revert "[NVPTX] Unify and extend barrier{.cta} intrinsic support (#140615 )" This reverts commit 735209c0688b10a66c24750422b35d8c2ad01bb5.	2025-05-22 17:28:43 +00:00
Alex MacLean	735209c068	[NVPTX] Unify and extend barrier{.cta} intrinsic support (#140615 ) Our current intrinsic support for barrier intrinsics is confusing and incomplete, with multiple intrinsics mapping to the same instruction and intrinsic names not clearly conveying intrinsic semantics. Further, we lack support for some variants. This change unifies the IR representation to a single consistently named set of intrinsics. - llvm.nvvm.barrier.cta.sync.aligned.all(i32) - llvm.nvvm.barrier.cta.sync.aligned(i32, i32) - llvm.nvvm.barrier.cta.arrive.aligned(i32, i32) - llvm.nvvm.barrier.cta.sync.all(i32) - llvm.nvvm.barrier.cta.sync(i32, i32) - llvm.nvvm.barrier.cta.arrive(i32, i32) The following Auto-Upgrade rules are used to maintain compatibility with IR using the legacy intrinsics: * llvm.nvvm.barrier0 --> llvm.nvvm.barrier.cta.sync.aligned.all(0) * llvm.nvvm.barrier.n --> llvm.nvvm.barrier.cta.sync.aligned.all(x) * llvm.nvvm.bar.sync --> llvm.nvvm.barrier.cta.sync.aligned.all(x) * llvm.nvvm.barrier --> llvm.nvvm.barrier.cta.sync.aligned(x, y) * llvm.nvvm.barrier.sync --> llvm.nvvm.barrier.cta.sync.all(x) * llvm.nvvm.barrier.sync.cnt --> llvm.nvvm.barrier.cta.sync(x, y)	2025-05-21 08:14:15 -07:00
Kazu Hirata	1ecba5bd62	[llvm] Use std::tie to implement operator< (NFC) (#139487 )	2025-05-11 21:28:47 -07:00
Kazu Hirata	2e230f5685	[llvm] Use llvm::interleaved (NFC) (#137496 )	2025-04-26 23:28:46 -07:00
Matt Arsenault	37b135cc8f	Attributor: Don't rely on use_empty for constants (#137218 ) This allows inferring noalias on a null argument parameter. This avoids a non-NFC diff in a future change.	2025-04-24 21:41:55 +02:00
Nikita Popov	d69ee885cc	[CaptureTracking] Remove dereferenceable_or_null special case (#135613 ) Remove the special case where comparing a dereferenceable_or_null pointer with null results in captures(none) instead of captures(address_is_null). This special case is not entirely correct. Let's say we have an allocated object of size 2 at address 1 and have a pointer `%p` pointing either to address 1 or 2. Then passing `gep p, -1` to a `dereferenceable_or_null(1)` function is well-defined, and allows us to distinguish between the two possible pointers, capturing information about the address. Now that we ignore address captures in alias analysis, I think we're ready to drop this special case. Additionally, if there are regressions in other places, the fact that this is inferred as address_is_null should allow us to easily address them if necessary.	2025-04-17 12:44:57 +02:00
Matt Arsenault	34e8f00066	Attributor: Propagate align to cmpxchg instructions (#134838 ) Fixes #134480	2025-04-08 22:15:50 +07:00
Matt Arsenault	66f0343609	Attributor: Propagate align to atomicrmw instructions (#134837 ) Partially fixes #134480	2025-04-08 22:12:20 +07:00
Matt Arsenault	783201b184	Attributor: Don't follow uses of ConstantData (#134573 ) These should not really have uselists, and it's not worth the compile time of looking at all uses of trivial constants. The main observable change of this is it no longer adds align attributes on constant null uses, but those are not useful. Some of these cases should potentially be more aggressive and not look at any Constant users.	2025-04-07 23:59:53 +07:00
Tim Gymnich	049f179606	[Analysis][NFC] Extract KnownFPClass (#133457 ) - extract KnownFPClass for future use inside of GISelKnownBits --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-03-28 18:10:02 +01:00
Kazu Hirata	0dcc201ac4	[Transforms] Use *Set::insert_range (NFC) (#132056 ) DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces: Dest.insert(Src.begin(), Src.end()); with: Dest.insert_range(Src); This patch does not touch custom begin like succ_begin for now.	2025-03-19 15:35:01 -07:00
Kazu Hirata	8789c0083d	[Transforms] Avoid repeated hash lookups (NFC) (#131554 )	2025-03-17 07:42:21 -07:00
Johannes Doerfert	9f28621fae	[Attributor][NFC] Clang format (#129163 )	2025-02-27 23:59:08 -05:00
Nikita Popov	e56a6a2683	Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880 ) (#128020 ) Relative to the previous attempt this includes two fixes: * Adjust callCapturesBefore() to not skip captures(ret: address, provenance) arguments, as these will not count as a capture at the call-site. * When visiting uses during stack slot optimization, don't skip the ModRef check for passthru captures. Calls can both modref and be passthru for captures. ------ This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups. The key API changes here are: * DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures. * The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of: Stop: stop traversal. ContinueIgnoringReturn: continue traversal but don't follow the instruction return value. Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents. For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it. This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.	2025-02-27 09:38:29 +01:00
Nico Weber	e2ba1b6ffd	Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880 )" This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729. Seems to break LTO builds of clang on Windows, see comments on https://github.com/llvm/llvm-project/pull/125880	2025-02-19 11:32:57 -05:00
Nikita Popov	7e3735d1a1	Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880 ) Relative to the previous attempt, this adjusts isEscapeSource() to not treat calls with captures(ret: address, provenance) or similar arguments as escape sources. This addresses the miscompile reported at: https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577 The implementation uses a helper function on CallBase to make this check a bit more efficient (e.g. by skipping the byval checks) as checking attributes on all arguments if fairly expensive. ------ This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups. The key API changes here are: * DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures. * The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of: Stop: stop traversal. ContinueIgnoringReturn: continue traversal but don't follow the instruction return value. Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents. For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it. This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.	2025-02-14 12:38:04 +01:00
Nikita Popov	1e64ea9914	Revert "[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880 )" This reverts commit ee655ca27aad466bcc54f6eba03f7e564940ad5a. A miscompilation has been reported at: https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577	2025-02-13 14:56:12 +01:00
Nikita Popov	ee655ca27a	[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880 ) This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups. The key API changes here are: * DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures. * The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of: Stop: stop traversal. ContinueIgnoringReturn: continue traversal but don't follow the instruction return value. Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents. For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it. This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.	2025-02-13 09:36:35 +01:00
Nikita Popov	8a43d0e873	[Attributor] Check correct IRPosition in AANoCapture::isImpliedByIR() This case is intended to check the callee argument, not the call-site. Fixes an issue introduced in #123181.	2025-01-29 17:34:10 +01:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Jeremy Morse	8e70273509	[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583 ) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety however, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).	2025-01-24 10:53:11 +00:00
Mats Jun Larsen	416f1c465d	[IR] Replace of PointerType::get(Type) with opaque version (NFC) (#123617 ) In accordance with https://github.com/llvm/llvm-project/issues/123569 In order to keep the patch at reasonable size, this PR only covers for the llvm subproject, unittests excluded.	2025-01-21 00:32:56 +09:00
macurtis-amd	d1a6eaa478	[Attributor][NFC] Performance improvements (#122923 ) ` forallInterferingAccesses` is a hotspot and for large modules these changes make a measurable improvement in compilation time. For LTO kernel compilation of 519.clvleaf (SPEChpc 2021) I measured the following: ``` \| Measured times (s) \| Average \| speedup --------------------+------------------------+---------+--------- Baseline \| 33.268 33.332 33.275 \| 33.292 \| 0% Cache "kernel" \| 30.543 30.339 30.607 \| 30.496 \| 9.2% templatize callback \| 30.981 30.97 30.964 \| 30.972 \| 7.5% Both changes \| 29.284 29.201 29.053 \| 29.179 \| 14.1% ```	2025-01-14 12:51:25 -06:00
Jay Foad	f8559751fc	[llvm-project] Fix typo "propogate" (#114795 )	2024-11-04 15:33:19 +00:00
Kazu Hirata	98ea1a81a2	[IPO] Remove unused includes (NFC) (#114716 ) Identified with misc-include-cleaner.	2024-11-03 13:48:55 -08:00
Shilei Tian	5a74a4a667	[Attributor] Take the address space from addrspacecast directly (#108258 ) Currently `AAAddressSpace` relies on identifying the address spaces of all underlying objects. However, it might infer sub-optimal address space when the underlying object is a function argument. In `AMDGPUPromoteKernelArgumentsPass`, the promotion of a pointer kernel argument is by adding a series of `addrspacecast` instructions (as shown below), and hoping `InferAddressSpacePass` can pick it up and do the rewriting accordingly. Before promotion: ``` define amdgpu_kernel void @kernel(ptr %to_be_promoted) { %val = load i32, ptr %to_be_promoted ... ret void } ``` After promotion: ``` define amdgpu_kernel void @kernel(ptr %to_be_promoted) { %ptr.cast.0 = addrspacecast ptr %to_be_promoted to ptr addrspace(1) %ptr.cast.1 = addrspacecast ptr addrspace(1) %ptr.cast.0 to ptr # all the use of %to_be_promoted will use %ptr.cast.1 %val = load i32, ptr %ptr.cast.1 ... ret void } ``` When `AAAddressSpace` analyzes the code after promotion, it will take `%to_be_promoted` as the underlying object of `%ptr.cast.1`, and use its address space (which is 0) as its final address space, thus simply doing nothing in `manifest`. The attributor framework will then eliminate the address space cast from 0 to 1 and back to 0, and replace `%ptr.cast.1` with `%to_be_promoted`, which basically reverts all changes by `AMDGPUPromoteKernelArgumentsPass`. IMHO I'm not sure if `AMDGPUPromoteKernelArgumentsPass` promotes the argument in a proper way. To improve the handling of this case, this PR adds an extra handling when iterating over all underlying objects. If an underlying object is a function argument, it means it reaches a terminal such that we can't further deduce its underlying object. In this case, we check all uses of the argument. If they are all `addrspacecast` instructions and their destination address spaces are the same, we take the destination address space. Fixes: SWDEV-482640.	2024-10-09 22:51:07 -04:00
Johannes Doerfert	335e137267	[Attributor][FIX] Track returned pointer offsets (#110534 ) If the pointer returned by a function is not "the base pointer" but has an offset, we need to track the offset such that users can apply it to their offset chain when they create accesses. This was reported by @ye-luo and reduced test cases are included. The OffsetInfo was moved and the container was replaced with a set to avoid excessive growth. Otherwise, the patch just replaces the "returns pointer" flag with the "returned offsets", and deals with the applying to offsets at the call site. --------- Co-authored-by: Johannes Doerfert <jdoerfert@llnl.gov>	2024-10-01 12:41:15 -05:00
Jeremy Morse	96f37ae453	[NFC] Use initial-stack-allocations for more data structures (#110544 ) This replaces some of the most frequent offenders of using a DenseMap that cause a malloc, where the typical element-count is small enough to fit in an initial stack allocation. Most of these are fairly obvious, one to highlight is the collectOffset method of GEP instructions: if there's a GEP, of course it's going to have at least one offset, but every time we've called collectOffset we end up calling malloc as well for the DenseMap in the MapVector.	2024-09-30 23:15:18 +01:00
Shilei Tian	0b7a18bd4a	[Attributor] Use more appropriate approach to check flat address space (#108713 )	2024-09-27 18:26:55 -04:00
macurtis-amd	72fd35b85b	[Attributor] Report change when updating ReachesReturn (#108965 )	2024-09-19 11:10:18 -05:00

638 Commits