llvm-project

Author	SHA1	Message	Date
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248 ) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Nikita Popov	2c6eec219d	[Tests] Avoid lifetime intrinsics on non-allocas (NFC) Don't rely on auto-upgrade, instead either remove unnecessary casts or remove no longer applicable tests.	2025-07-23 15:05:43 +02:00
Nikita Popov	92c55a315e	[IR] Only allow lifetime.start/end on allocas (#149310 ) lifetime.start and lifetime.end are primarily intended for use on allocas, to enable stack coloring and other liveness optimizations. This is necessary because all (static) allocas are hoisted into the entry block, so lifetime markers are the only way to convey the actual lifetimes. However, lifetime.start and lifetime.end are currently allowed to be used on non-alloca pointers. We don't actually do this in practice, but just the mere fact that this is possible breaks the core purpose of the lifetime markers, which is stack coloring of allocas. Stack coloring can only work correctly if all lifetime markers for an alloca are analyzable. * If a lifetime marker may operate on multiple allocas via a select/phi, we don't know which lifetime actually starts/ends and handle it incorrectly (https://github.com/llvm/llvm-project/issues/104776). * Stack coloring operates on the assumption that all lifetime markers are visible, and not, for example, hidden behind a function call or escaped pointer. It's not possible to change this, as part of the purpose of lifetime markers is that they work even in the presence of escaped pointers, where simple use analysis is insufficient. I don't think there is any way to have coherent semantics for lifetime markers on allocas, while also permitting them on arbitrary pointer values. This PR restricts lifetimes to operate on allocas only. As a followup, I will also drop the size argument, which is superfluous if we always operate on an alloca. (This change also renders various code handling lifetime markers on non-alloca dead. I plan to clean up that kind of code after dropping the size argument as well.) In practice, I've only found a few places that currently produce lifetimes on non-allocas: * CoroEarly replaces the promise alloca with the result of an intrinsic, which will later be replaced back with an alloca. I think this is the only place where there is some legitimate loss of functionality, but I don't think this is particularly important (I don't think we'd expect the promise in a coroutine to admit useful lifetime optimization.) * SafeStack moves unsafe allocas onto a separate frame. We can safely drop lifetimes here, as SafeStack performs its own stack coloring. * Similar for AddressSanitizer, it also moves allocas into separate memory. * LSR sometimes replaces the lifetime argument with a GEP chain of the alloca (where the offsets ultimately cancel out). This is just unnecessary. (Fixed separately in https://github.com/llvm/llvm-project/pull/149492.) * InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast of an alloca. I don't think this is necessary.	2025-07-21 15:04:50 +02:00
Andreas Jonson	0a067dc107	[Attributor] Swap range metadata to attribute for calls. (#108835 )	2025-07-05 16:47:03 +02:00
zGoldthorpe	f393211454	[Reland][IPO] Added attributor for identifying invariant loads (#146584 ) Patched and tested the `AAInvariantLoadPointer` attributor from #141800, which identifies pointers whose loads are eligible to be marked as `!invariant.load`. The bug in the attributor was due to `AAMemoryBehavior` always identifying pointers obtained from `alloca`s as having no writes. I'm not entirely sure why `AAMemoryBehavior` behaves this way, but it seems to be because it identifies the scope of an `alloca` to be limited to only that instruction (and, certainly, no memory writes occur within the `alloca` instruction). This patch just adds a check to disallow all loads from `alloca` pointers from being marked `!invariant.load` (since any well-defined program will have to write to stack pointers at some point).	2025-07-01 17:46:19 -04:00
Wenju He	9d570d568b	[ValueTracking] Return true for AddrSpaceCast in canCreateUndefOrPoison (#144686 ) In our downstream GPU target, following IR is valid before instcombine although the second addrspacecast causes UB. define i1 @test(ptr addrspace(1) noundef %v) { %0 = addrspacecast ptr addrspace(1) %v to ptr addrspace(4) %1 = call i32 @llvm.xxxx.isaddr.shared(ptr addrspace(4) %0) %2 = icmp eq i32 %1, 0 %3 = addrspacecast ptr addrspace(4) %0 to ptr addrspace(3) %4 = select i1 %2, ptr addrspace(3) null, ptr addrspace(3) %3 %5 = icmp eq ptr addrspace(3) %4, null ret i1 %5 } We have a custom optimization that replaces invalid addrspacecast with poison, and IR is still valid since `select` stops poison propagation. However, instcombine pass optimizes `select` to `or`: %0 = addrspacecast ptr addrspace(1) %v to ptr addrspace(4) %1 = call i32 @llvm.xxxx.isaddr.shared(ptr addrspace(4) %0) %2 = icmp eq i32 %1, 0 %3 = addrspacecast ptr addrspace(1) %v to ptr addrspace(3) %4 = icmp eq ptr addrspace(3) %3, null %5 = or i1 %2, %4 ret i1 %5 The transform is invalid for our target. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2025-06-24 08:43:47 +08:00
zGoldthorpe	00ae89a1cb	Revert "[IPO] Added attributor for identifying invariant loads" (#144808 ) Reverts llvm/llvm-project#141800 The implementation critically misunderstands the `AAMemoryBehavior` attributor, which it relies on heavily. @shiltian, since I do not have commit permissions.	2025-06-18 18:35:01 -04:00
zGoldthorpe	25dcd231bf	[IPO] Added attributor for identifying invariant loads (#141800 ) The attributor conservatively marks pointers whose loads are eligible to be marked as `!invariant.load`. It does so by identifying: 1. Pointers marked `noalias` and `readonly` 2. Pointers whose underlying objects are all eligible for invariant loads. The attributor then manifests this attribute at non-atomic non-volatile load instructions.	2025-06-16 11:16:47 -05:00
Craig Topper	b8a4a3b99c	[ValueTracking] Support scalable vector splats of ConstantInt/ConstantFP in isGuaranteedNotToBeUndefOrPoison. (#142894 ) Scalable vectors use insertelt+shufflevector ConstantExpr to represent a splat.	2025-06-05 22:08:03 -07:00
Craig Topper	d5d6f60632	[ValueTracking] Support scalable vectors for ExtractElement in computeKnownFPClass. (#143051 ) We can support scalable vectors by setting the demanded mask to APInt(1, 1) to demand the whole vector.	2025-06-05 20:48:07 -07:00
Shilei Tian	d2992423e3	[Attributor] Don't replace `addrspacecast (ptr null to ptr addrspace(x))` with `ptr addrspace(x) null` (#126779 ) `ConstantPointerNull` represents a pointer with value 0, but it doesn’t necessarily mean a `nullptr`. `ptr addrspace(x) null` is not the same as `addrspacecast (ptr null to ptr addrspace(x))` if the `nullptr` in AS X is not zero. Therefore, we can't simply replace it. Fixes #115083.	2025-05-20 18:08:42 -04:00
Matt Arsenault	609a8331a0	ValueTracking: Handle minimumnum and maximumnum in computeKnownFPClass (#138737 ) For now use the same treatment as minnum/maxnum, but these should diverge. alive2 seems happy with this, except for some preexisting bugs with weird denormal modes.	2025-05-07 08:02:24 +02:00
Matt Arsenault	03f3f15690	ValueTracking: Add baseline tests for minimumnum/maximumnum (#138736 ) Mostly copied from existing min/max tests, with a few additions.	2025-05-07 07:59:25 +02:00
Matt Arsenault	37b135cc8f	Attributor: Don't rely on use_empty for constants (#137218 ) This allows inferring noalias on a null argument parameter. This avoids a non-NFC diff in a future change.	2025-04-24 21:41:55 +02:00
Nikita Popov	d69ee885cc	[CaptureTracking] Remove dereferenceable_or_null special case (#135613 ) Remove the special case where comparing a dereferenceable_or_null pointer with null results in captures(none) instead of captures(address_is_null). This special case is not entirely correct. Let's say we have an allocated object of size 2 at address 1 and have a pointer `%p` pointing either to address 1 or 2. Then passing `gep p, -1` to a `dereferenceable_or_null(1)` function is well-defined, and allows us to distinguish between the two possible pointers, capturing information about the address. Now that we ignore address captures in alias analysis, I think we're ready to drop this special case. Additionally, if there are regressions in other places, the fact that this is inferred as address_is_null should allow us to easily address them if necessary.	2025-04-17 12:44:57 +02:00
Matt Arsenault	34e8f00066	Attributor: Propagate align to cmpxchg instructions (#134838 ) Fixes #134480	2025-04-08 22:15:50 +07:00
Matt Arsenault	66f0343609	Attributor: Propagate align to atomicrmw instructions (#134837 ) Partially fixes #134480	2025-04-08 22:12:20 +07:00
Matt Arsenault	2cf4254466	Attributor: Add baseline tests for propagating align to atomics (#134836 )	2025-04-08 22:08:11 +07:00
Matt Arsenault	783201b184	Attributor: Don't follow uses of ConstantData (#134573 ) These should not really have uselists, and it's not worth the compile time of looking at all uses of trivial constants. The main observable change of this is it no longer adds align attributes on constant null uses, but those are not useful. Some of these cases should potentially be more aggressive and not look at any Constant users.	2025-04-07 23:59:53 +07:00
Jeremy Morse	792a6f8119	[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298 ) These date back to when the non-intrinsic format of variable locations was still being tested and was behind a compile-time flag, so not all builds / bots would correctly run them. The solution at the time, to get at least some test coverage, was to have tests opt-in to non-intrinsic debug-info if it was built into LLVM. Nowadays, non-intrinsic format is the default and has been on for more than a year, so there's no need for this flag to exist. (I've downgraded the flag from "try" to explicitly requesting non-intrinsic format in some places, so that we can deal with tests that are explicitly about non-intrinsic format in their own commit).	2025-03-14 15:50:49 +00:00
Pierre van Houtryve	5470dffda2	[Attributor] Do not optimize away externally_initialized loads. (#128170 ) Fixes SWDEV-515029	2025-03-03 14:58:47 +01:00
Nikita Popov	abd97d9685	[CaptureTracking] Take non-willreturn calls into account We can leak one bit of information about the address by either diverging or not. Part of https://github.com/llvm/llvm-project/issues/129090.	2025-02-28 11:15:28 +01:00
Yeaseen	951ba3e9fa	[llvm] Remove undef from some `llvm/test/Transforms` tests (#125460 ) This PR replaces some instances of `undef` with function argument value or poison or concrete values in several tests under `llvm/test/Transforms/` directory.	2025-02-03 08:18:23 +00:00
Nikita Popov	8a43d0e873	[Attributor] Check correct IRPosition in AANoCapture::isImpliedByIR() This case is intended to check the callee argument, not the call-site. Fixes an issue introduced in #123181.	2025-01-29 17:34:10 +01:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Paul Walker	56c091ea71	[LLVM][IR] Use splat syntax when printing ConstantExpr based splats. (#116856 ) This brings the printing of scalable vector constant splats in line with their fixed length counterparts.	2024-11-21 11:21:12 +00:00
Lee Wei	58ca7078ce	[llvm] Remove `br i1 undef` from some regression tests [NFC] (#115688 ) This PR aims to remove undefined behavior from tests.	2024-11-12 08:41:27 +00:00
Paul Walker	38fffa630e	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548 )	2024-11-06 11:53:33 +00:00
David Green	0f919444ad	[ValueTracking] Handle recursive phis in knownFPClass (#114008 ) As a follow-on to 113686, this breaks the recursion between phi nodes that have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be calculated from the classes of p1 and p2.	2024-11-01 13:38:29 +00:00
David Green	9735c05186	[ValueTracking] Compute KnownFP state from recursive select/phi. (#113686 ) Given a recursive phi with select: %p = phi [ 0, entry ], [ %sel, loop] %sel = select %c, %other, %p The fp state can be calculated using the knowledge that the select/phi pair can only be the initial state (0 here) or from %other. This adds a short-cut into computeKnownFPClass for PHI to detect that the select is recursive back to the phi, and if so use the state from the other operand. This helps to address a regression from #83200.	2024-10-31 07:50:44 +00:00
David Green	f358422268	[Attributor] Add nofpclass test for phi+select recurrences. NFC	2024-10-30 08:10:35 +00:00
Yingwei Zheng	8d8bb4032b	[Verifier] Verify attribute `denormal-fp-math[-f32]` (#112310 ) Some typos are also fixed. Address https://github.com/llvm/llvm-project/pull/112067#pullrequestreview-2363722447.	2024-10-15 17:32:16 +08:00
Johannes Doerfert	335e137267	[Attributor][FIX] Track returned pointer offsets (#110534 ) If the pointer returned by a function is not "the base pointer" but has an offset, we need to track the offset such that users can apply it to their offset chain when they create accesses. This was reported by @ye-luo and reduced test cases are included. The OffsetInfo was moved and the container was replaced with a set to avoid excessive growth. Otherwise, the patch just replaces the "returns pointer" flag with the "returned offsets", and deals with the applying to offsets at the call site. --------- Co-authored-by: Johannes Doerfert <jdoerfert@llnl.gov>	2024-10-01 12:41:15 -05:00
Shilei Tian	0b7a18bd4a	[Attributor] Use more appropriate approach to check flat address space (#108713 )	2024-09-27 18:26:55 -04:00
Yonghong Song	becc02ce93	Revert "[Transforms][IPO] Add func suffix in ArgumentPromotion and DeadArgume… (#105742 )" This reverts commit 959448fbd6bc6f74fb3f9655b1387d0e8a272ab8. Reverting because multiple test failures e.g. https://lab.llvm.org/buildbot/#/builders/187/builds/1290 https://lab.llvm.org/buildbot/#/builders/153/builds/9389 and maybe a few others.	2024-09-19 03:54:13 -07:00
yonghong-song	959448fbd6	[Transforms][IPO] Add func suffix in ArgumentPromotion and DeadArgume… (#105742 ) …ntElimination ArgumentPromotion and DeadArgumentElimination passes could change function signatures but the function name remains the same as before the transformation. This makes it hard for tracing with bpf programs where users tend to use the function signature in the source. See discussion [1] for details. This patch adds a suffix to functions whose signatures are changed. The suffix lets users know that the function signature has changed and that they need to inspect the IR or binary to find the modified signature before tracing those functions. The suffix for ArgumentPromotion is ".argprom" and the suffixes for DeadArgumentElimination are ".argelim" and ".retelim". The suffix also gives users hints about what kind of transformation has been done. With this patch, I built a recent linux kernel with full LTO enabled. I got 4 functions with only argpromotion like ``` set_track_update.argelim.argprom pmd_trans_huge_lock.argprom ... ``` I got 1058 functions with only deadargelim like ``` process_bit0.argelim pci_io_ecs_init.argelim ... ``` I got 3 functions with both argpromotion and deadargelim ``` set_track_update.argelim.argprom zero_pud_populate.argelim.argprom zero_pmd_populate.argelim.argprom ``` [1] https://github.com/llvm/llvm-project/issues/104678	2024-09-19 10:21:58 +02:00
Johannes Doerfert	56a033462e	[Attributor] Keep track of reached returns in AAPointerInfo (#107479 ) Instead of visiting call sites in Attribute::checkForAllUses, we now keep track of returns in AAPointerInfo and use the call site return information as required. This way, the user of AAPointerInfo(CallSite)Argument can determine if the call return should be visited. We do not collect them as "may accesses" in the AAPointerInfo(CallSite)Argument itself in case a return user is found.	2024-09-10 08:13:21 -07:00
Johannes Doerfert	84bf0da34d	[Attributor][FIX] Ensure to always translate call site arguments (#107323 ) When we propagate call site arguments we always need to translate them, this is important as we ended up picking the function argument for a recursive call not the call site argument. `@recBad` and `@recGood` in `returned.ll` show the problem as they used to transform them the same way. The restructuring cleans the code up and helps derive more "returned" arguments and better information in the presence of recursive calls. The "dropped" attributes are simply dropped because we do not query them anymore, not because we cannot derive them.	2024-09-05 13:37:21 -07:00
Johannes Doerfert	e6dece9f69	[Attributor][FIX] Mark "may" accesses through call sites as such (#107439 ) Before, we kept the call site access kind (may/must) when we translated the access. However, the pointer we access it through (by passing it to the callee) might not be the underlying object. We have similar logic when we add store and load accesses.	2024-09-05 13:33:58 -07:00
Johannes Doerfert	3726f9c575	[Attributor][NFC] Pre-commits for #107439 (#107457 )	2024-09-05 13:10:37 -07:00
Alex MacLean	369d8148e0	[ValueTracking] use KnownBits to compute fpclass from bitcast (#97762 ) When we encounter a bitcast from an integer type we can use the information from `KnownBits` to glean some information about the fpclass: - If the sign bit is known, we can transfer this information over. - If the float is IEEE format and enough of the bits are known, we may be able to prove or rule out some fpclasses such as NaN, Zero, or Inf.	2024-08-30 07:34:49 -07:00
Johannes Doerfert	8266d47cd1	[Attributor] Improve AAUnderlyingObjects (#104835 ) - Allocas and GlobalValues cannot be simplified, so we should not try. - If we never used any assumed state, the AAUnderlyingObjects doesn't require an additional update. - If we have seen an object (or its underlying object) before, we do not need to inspect it anymore. The original logic for "SeenObjects" was flawed and caused us to add intermediate values to the underlying object list if a PHI or select instruction referenced the same underlying object twice. The test changes are all instances of this situation and we now correctly derive `memory(none)` for the functions that only access stack memory. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>	2024-08-20 12:05:20 -07:00
Nikita Popov	472c79ca52	[IR] Check that arguments of naked function are not used (#104757 ) Verify that the arguments of a naked function are not used. They can only be referenced via registers/stack in inline asm, not as IR values. Doing so will result in assertion failures in the backend. There's probably more that we should verify, though I'm not completely sure what the constraints are (would it be correct to require that naked functions are exactly an inline asm call + unreachable, or is more allowed?) Fixes https://github.com/llvm/llvm-project/issues/104718.	2024-08-20 09:29:05 +02:00
Shilei Tian	907c7eb311	[Attributor] Enable `AAAddressSpace` in `OpenMPOpt` (#104363 ) This reverts commit e592c2dcf5b7d2da6c2564f5d9990aa34079bad4. We can finally reland the PR since the issue that caused the PR to be reverted has been resolved in https://github.com/llvm/llvm-project/pull/104051.	2024-08-16 13:33:48 -04:00
Johannes Doerfert	7156bcf286	[Attributor][FIX] Ensure we do not use stale references (#104495 ) When copying map entries, we might run into resizing and invalidate the RHS of the assignment. We dealt with this before and now use the proper helper to avoid the problem in another place. Fixes: https://github.com/llvm/llvm-project/issues/104397	2024-08-15 18:45:36 -04:00
Matt Arsenault	f9060f1b7e	AMDGPU: Fix using wrong alloca address space in test (#102108 )	2024-08-07 00:19:22 +04:00
Shilei Tian	9373a43218	[Attributor] Indicate optimistic fixed point if an instruction already has non-zero address space (#101589 )	2024-08-01 22:55:09 -04:00
Vidush Singhal	c7633ddb28	[Attributor]: Ensure cycle info is not null when handling PHI in AAPointerInfo (#97321 ) Ensure cycle info object is not null for simple PHI case for the test: `llvm/test/Transforms/Attributor/phi_bug_pointer_info.ll` Debug info Before the change: ``` Accesses by bin after update: [8-12] : 1 - 9 - store i32 %0, ptr %field2, align 4 - c: %0 = load i32, ptr %val, align 4 [32-36] : 1 - 9 - store i32 %1, ptr %field8, align 4 - c: %1 = load i32, ptr %val2, align 4 [2147483647-4294967294] : 1 - 6 - %ret = load i32, ptr %x, align 4 - c: <unknown> ``` Debug info After the change: ``` Accesses by bin after update: [8-12] : 2 - 9 - store i32 %0, ptr %field2, align 4 - c: %0 = load i32, ptr %val, align 4 - 6 - %ret = load i32, ptr %x, align 4 - c: <unknown> [32-36] : 2 - 9 - store i32 %1, ptr %field8, align 4 - c: %1 = load i32, ptr %val2, align 4 - 6 - %ret = load i32, ptr %x, align 4 - c: <unknown> ``` Co-authored-by: Vidush Singhal <singhal2@ruby964.llnl.gov>	2024-07-01 17:20:34 -07:00
Fangrui Song	89e8e63f47	[Attributor] Stabilize llvm.assume output Don't rely on the iteration order of DenseSet<StringRef>, which is not guaranteed to be deterministic.	2024-06-19 15:36:46 -07:00
Ethan Luis McDonough	b629d4b912	[Attributor] Prevent infinite loop in AAGlobalValueInfoFloating (#94941 ) Global variables that reference themselves alongside a function that is called indirectly can cause an infinite loop in `AAGlobalValueInfoFloating`. The recursive reference is continually pushed back into the workload, causing the attributor to hang indefinitely.	2024-06-18 09:36:42 -07:00


773 Commits