llvm-project

Author	SHA1	Message	Date
Ramkumar Ramachandra	45817aa726	LICM: hoist BO assoc for and, or, xor (#111146 ) Trivially lift the Opcode limitation on hoistBOAssociation to also hoist and, or, and xor. Alive2 proofs: https://alive2.llvm.org/ce/z/rVNP2X	2024-10-04 19:13:51 +01:00
Ramkumar Ramachandra	6fe723441b	LICM: hoist BO assoc for FAdd and FMul (#108415 ) Extend hoistBOAssociation to the FAdd and FMul cases, noting that we copy an intersection of the fast-math flags present in both instructions.	2024-09-27 11:05:30 +01:00
Nikita Popov	13b4d1bfea	[SimplifyCFG][LICM] Add additional speculation tests These are related to https://github.com/llvm/llvm-project/issues/108854.	2024-09-18 14:48:58 +02:00
Ramkumar Ramachandra	16900d3b98	LICM: hoist BO assoc when BinOp is in RHS (#107072 ) Extend hoistBOAssociation smoothly to handle the case when the inner BinaryOperator is in the RHS of the outer BinaryOperator. This completes the generalization of hoistBOAssociation, and the only limitation after this patch is the fact that only Add and Mul are hoisted.	2024-09-04 22:01:04 +01:00
Ramkumar Ramachandra	5818337765	LICM: hoist BO assoc when (C1 op LV) op C2 (#106999 ) Extend hoistBOAssociation to handle the "(C1 op LV) op C2" case, when op is a commutative operator.	2024-09-04 11:47:37 +01:00
Ramkumar Ramachandra	f1ef67ded5	LICM: extend hoist BO assoc to mul case (#106991 ) Trivially extend hoistBOAssociation to also handle the BinaryOperator Mul. Alive2 proofs: https://alive2.llvm.org/ce/z/zjtR5g	2024-09-03 17:08:11 +01:00
Ramkumar Ramachandra	3b6e255c83	LICM/test: regen a test with UTC (NFC) (#107117 )	2024-09-03 16:00:44 +01:00
Ramkumar Ramachandra	05f5a91d00	LICM: use IRBuilder in hoist BO assoc (#106978 ) Use IRBuilder when creating the new invariant instruction, so that the constant-folder has an opportunity to constant-fold the new Instruction that we desire to create.	2024-09-03 15:27:03 +01:00
Ramkumar Ramachandra	2a8fda443e	LICM: extend hoistAddSub to unsigned case (#106373 ) Trivially extend dd0cf23 ([LICM] Reassociate & hoist sub expressions) to handle unsigned predicates as well. Alive2 proofs: https://alive2.llvm.org/ce/z/GdDBtT.	2024-08-30 14:12:52 +01:00
Nikita Popov	37a94b7edd	[LICM][MustExec] Make must-exec logic for IV condition commutative (#93150 ) MustExec has special logic to determine whether the first loop iteration will always be executed, by simplifying the IV comparison with the start value. Currently, this code assumes that the IV is on the LHS of the comparison, but this is not guaranteed. Make sure it handles the commuted variant as well. The changed PhaseOrdering test previously performed peeling to make the loads dereferenceable -- as a side effect, this also reduced the exit count by one, avoiding the awkward <= MAX case. Now we know up-front that the loads are dereferenceable and can be simply hoisted. As such, we retain the original exit count and now have to handle it by widening the exit count calculation to i128. This is a regression, but at least it preserves the vectorization, which was the original goal. I'm not sure what else can be done about that test.	2024-08-08 16:31:20 +02:00
Ricardo Jesus	fc157522c5	[LICM] Prevent fold and hoist of binary ops with over 2 uses (#102114 ) This limits folding and hoisting associative binary ops to cases where the intermediate op has at most two uses. The more uses the intermediate op has, the more new ops we have to create to potentially reduce the loop's critical path. We keep the limit to two uses to minimise undesirable increases in code size.	2024-08-07 09:52:30 +01:00
Ricardo Jesus	25da8e5a97	Reapply "[LICM] Fold associative binary ops to promote code hoisting (#81608 )" (#100377 ) This reapplies a more strict version of `f2ccf80136`. Perform the transformation "(LV op C1) op C2" ==> "LV op (C1 op C2)" where op is an associative binary op, LV is a loop variant, and C1 and C2 are loop invariants, and hoist (C1 op C2) into the preheader. For now this fold is restricted to ADDs.	2024-07-26 10:12:25 +01:00
Nikita Popov	b48819dbcd	Revert " [LICM] Fold associative binary ops to promote code hoisting (#81608 )" This reverts commit f2ccf80136a01ca69f766becafb329db6c54c0c8. The flag propagation code is incorrect.	2024-07-23 12:01:22 +02:00
Ricardo Jesus	f2ccf80136	[LICM] Fold associative binary ops to promote code hoisting (#81608 ) Perform the transformation "(LV op C1) op C2" ==> "LV op (C1 op C2)" where op is an associative binary op, LV is a loop variant, and C1 and C2 are loop invariants to hoist. Similar patterns could be folded (left in comment) but this one seems to be the most impactful.	2024-07-23 10:03:26 +01:00
Tim Gymnich	0dd43774a6	[LICM] Fix dropped metadata (#95221 ) LICM drops metadata for call instructions when cloning instructions. This patch just adds the missing `copyMetadata`. Fixes #91919.	2024-06-19 10:22:52 +02:00
Ruiling, Song	a6eddf9a79	[Loads] Pass DominatorTree if available (#95752 ) For better dominance check inside the function.	2024-06-18 15:53:28 +08:00
Stephen Tozer	094572701d	[RemoveDIs] Print IR with debug records by default (#91724 ) This patch makes the final major change of the RemoveDIs project, changing the default IR output from debug intrinsics to debug records. This is expected to break a large number of tests: every single one that tests for uses or declarations of debug intrinsics and does not explicitly disable writing records. If this patch has broken your downstream tests (or upstream tests on a configuration I wasn't able to run): 1. If you need to immediately unblock a build, pass `--write-experimental-debuginfo=false` to LLVM's option processing for all failing tests (remember to use `-mllvm` for clang/flang to forward arguments to LLVM). 2. For most test failures, the changes are trivial and mechanical, enough that they can be done by script; see the migration guide for a guide on how to do this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates 3. If any tests fail for reasons other than FileCheck check lines that need updating, such as assertion failures, that is most likely a real bug with this patch and should be reported as such. For more information, see the recent PSA: https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578	2024-06-14 15:07:27 +01:00
Antonio Frighetto	70091dc943	[LICM] Invalidate cached SCEV results in `hoistMulAddAssociation` While reassociating expressions, LICM is required to invalidate SCEV results, as otherwise subsequent passes in the pipeline that leverage LICM foldings (e.g. IndVars), may reason on invalid expressions; thus miscompiling. This is achieved by rewriting the reassociable instruction from scratch. Fixes: https://github.com/llvm/llvm-project/issues/91957.	2024-05-29 08:44:45 +02:00
Antonio Frighetto	c2a9a974ca	[LICM] Introduce test for PR92655 (NFC)	2024-05-29 08:44:45 +02:00
Alex Voicu	10edb4991c	[Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (#88182 ) At the moment, Clang is rather liberal in assuming that 0 (and by extension unqualified) is always a safe default. This does not work for targets that actually use a different value for the default / generic AS (for example, the SPIRV that obtains from HIPSPV or SYCL). This patch is a first, fairly safe step towards trying to clear things up by querying a module's default AS from the target, rather than assuming it's 0, alongside fixing a few places where things break / we encode the 0 == DefaultAS assumption. A bunch of existing tests are extended to check for non-zero default AS usage.	2024-05-19 14:59:03 +01:00
Nikita Popov	b7adba8e78	[LICM] Add must exec hoisting test with commuted operands (NFC)	2024-05-14 12:29:20 +09:00
Nikita Popov	3a25e358e2	[LICM] Generate test checks (NFC)	2024-05-14 12:26:15 +09:00
Shan Huang	cdd782183d	[DebugInfo][LICM] Fix missing debug location updates (#91729 )	2024-05-11 16:26:04 +01:00
Craig Topper	1261c02be4	[LICM] Drop nsw/nuw flags on affected instructions in hoistMulAddAssociation. (#85486 ) Since we are introducing new multiplies earlier in the arithmetic, the nsw/nuw flags on later instructions are no longer accurate. Fixes #85457.	2024-03-18 11:46:25 -07:00
Craig Topper	2dd5204681	Recommit "[LICM] Support integer mul/add in hoistFPAssociation. (#67736 )" With a fix for build bot failure. I was accessing the type of a deleted Instruction. Original message: The reassociation this is trying to repair can happen for integer types too. This patch adds support for integer mul/add to hoistFPAssociation. The function has been renamed to hoistMulAddAssociation. I've used separate statistics and limits for integer to allow tuning flexibility.	2024-02-12 20:33:28 -08:00
Fangrui Song	3d18c8cd26	[test] Replace aarch64-*-{eabi,gnueabi}{,hf} with aarch64 Similar to d39b4ce3ce8a3c256e01bdec2b140777a332a633 Using "eabi" or "gnueabi" for aarch64 targets is a common mistake and warned by Clang Driver. We want to avoid them elsewhere as well. Just use the common "aarch64" without other triple components.	2024-02-12 18:29:55 -08:00
Craig Topper	ecd63afafd	Revert "[LICM] Support integer mul/add in hoistFPAssociation. (#67736 )" This reverts commit 7ff5dfbaa0c971048da0f37ec6f05f5395562c21. Causing crashes on Mac build bots.	2024-02-12 16:41:29 -08:00
Craig Topper	7ff5dfbaa0	[LICM] Support integer mul/add in hoistFPAssociation. (#67736 ) The reassociation this is trying to repair can happen for integer types too. This patch adds support for integer mul/add to hoistFPAssociation. The function has been renamed to hoistMulAddAssociation. I've used separate statistics and limits for integer to allow tuning flexibility.	2024-02-12 14:59:49 -08:00
paperchalice	e390c229a4	[Pass] Add hyphen to some pass names (#74287 ) Here is the list of the renamed passes: - `callbrprepare` -> `callbr-prepare` - `dwarfehprepare` -> `dwarf-eh-prepare` - `flattencfg` -> `flatten-cfg` - `loweratomic` -> `lower-atomic` - `lowerinvoke` -> `lower-invoke` - `lowerswitch` -> `lower-switch` - `winehprepare` -> `win-eh-prepare` - `targetir` -> `target-ir` - `targetlibinfo` -> `target-lib-info` Legacy passes are not affected.	2024-01-25 16:05:54 +08:00
Bruno De Fraine	656bf13004	[AST] Don't merge memory locations in AliasSetTracker (#65731 ) This changes the AliasSetTracker to track memory locations instead of pointers in its alias sets. The motivation for this is outlined in an RFC posted on LLVM discourse: https://discourse.llvm.org/t/rfc-dont-merge-memory-locations-in-aliassettracker/73336 In the data structures of the AST implementation, I made the choice to replace the linked list of `PointerRec` entries (that had to go anyway) with a simple flat vector of `MemoryLocation` objects, but for the `AliasSet` objects referenced from a lookup table, I retained the mechanism of a linked list, reference counting, forwarding, etc. The data structures could be revised in a follow-up change.	2024-01-17 15:59:13 +01:00
Nikita Popov	bf5d96c96c	[IR] Add dead_on_unwind attribute (#74289 ) Add the `dead_on_unwind` attribute, which states that the caller will not read from this argument if the call unwinds. This allows eliding stores that could otherwise be visible on the unwind path, for example: ``` declare void @may_unwind() define void @src(ptr noalias dead_on_unwind %out) { store i32 0, ptr %out call void @may_unwind() store i32 1, ptr %out ret void } define void @tgt(ptr noalias dead_on_unwind %out) { call void @may_unwind() store i32 1, ptr %out ret void } ``` The optimization is not valid without `dead_on_unwind`, because the `i32 0` value might be read if `@may_unwind` unwinds. This attribute is primarily intended to be used on sret arguments. In fact, I previously wanted to change the semantics of sret to include this "no read after unwind" property (see D116998), but based on the feedback there it is better to keep these attributes orthogonal (sret is an ABI attribute, dead_on_unwind is an optimization attribute). This is a reboot of that change with a separate attribute.	2023-12-14 09:58:14 +01:00
Bruno De Fraine	358e765680	[LICM] Add test show missed promotion due to AAInfo merging (NFC)	2023-12-13 12:01:30 +01:00
Nikita Popov	753c51bf88	[AST] Fix size merging for MustAlias sets (#73820 ) AST checks aliasing with MustAlias sets by only checking the representative pointer (getSomePointer). This is only correct if the Size and AATags information of that pointer also includes the Size/AATags of all other pointers in the set. When we add a new pointer to the AliasSet, we do perform this update (see the code in AliasSet::addPointer). However, if a pointer already in the MustAlias set is used with a new size, we currently do not update the representative pointer, resulting in miscompilations. Fix this by adding the missing update. This is a targeted fix using the current representation. There are a couple of alternatives: * For MustAlias sets, don't store per-pointer Size/AATags at all. This would make it clear that there is only one set of common Size/AATags for all pointers. * Check against all pointers in the set even for MustAlias. This is what https://github.com/llvm/llvm-project/pull/65731 proposes to do as part of a larger change to AST representation. Fixes https://github.com/llvm/llvm-project/issues/64897.	2023-12-07 10:45:48 +01:00
Jeremy Morse	5ba5211a47	[DebugInfo][RemoveDIs] Have LICM insert at iterator positions (#73671 ) Because we're storing some extra debug-info information in the iterator class, we need to insert new LICM-created stores using such iterators. Switch LICM to storing iterators instead of pointers when it promotes variables in loops, add a test for the desired behaviour, and enable RemoveDIs instrumentation on a variety of other LICM tests for good measure. (This would appear to be the only pass in LLVM that needs to store iterators on the heap).	2023-11-30 13:00:26 +00:00
Nikita Popov	672b3d0974	[LICM] Add test for #64897 (NFC)	2023-11-29 16:56:07 +01:00
Jeremy Morse	d2d9dc8eb4	[DebugInfo][RemoveDIs] Make debugify pass convert to/from RemoveDIs mode (#73251 ) Debugify is extremely useful as a testing and debugging tool, and a good number of LLVM-IR transform tests use it. We need it to support "new" non-instruction debug-info to get test coverage, but it's not important enough to completely convert right now (and it'd be a large undertaking). Thus: convert to/from dbg.value/DPValue mode on entry and exit of the pass, which gives us the functionality without any further work. The cost is compile-time, but again this is only happening during tests. Tested by: the large set of debugify tests enabled here. Note the InstCombine test (cast-mul-select.ll) that hasn't been fully enabled: this is because there's a debug-info sinking piece of code there that hasn't been instrumented.	2023-11-29 13:19:50 +00:00
Nikita Popov	6b8ed78719	[IR] Add writable attribute This adds a writable attribute, which in conjunction with dereferenceable(N) states that a spurious store of N bytes is introduced on function entry. This implies that this many bytes are writable without trapping or introducing data races. See https://llvm.org/docs/Atomics.html#optimization-outside-atomic for why the second point is important. This attribute can be added to sret arguments. I believe Rust will also be able to use it for by-value (moved) arguments. Rust likely won't be able to use it for &mut arguments (tree borrows does not appear to allow spurious stores). In this patch the new attribute is only used by LICM scalar promotion. However, the actual motivation for this is to fix a correctness issue in call slot optimization, which needs this attribute to avoid optimization regressions. Followup to the discussion on D157499. Differential Revision: https://reviews.llvm.org/D158081	2023-11-01 10:46:31 +01:00
Matthias Braun	e3cf80c5c1	BlockFrequencyInfoImpl: Avoid big numbers, increase precision for small spreads BlockFrequencyInfo calculates block frequencies as Scaled64 numbers but as a last step converts them to unsigned 64bit integers (`BlockFrequency`). This improves the factors picked for this conversion so that: * Avoid big numbers close to UINT64_MAX to avoid users overflowing/saturating when adding multiple frequencies together or when multiplying with integers. This leaves the topmost 10 bits unused to allow for some room. * Spread the difference between hottest/coldest block as much as possible to increase precision. * If the hot/cold spread cannot be represented, lose precision at the lower end, but keep the frequencies at the upper end for hot blocks differentiable.	2023-10-24 20:27:39 -07:00
Anton Korobeynikov	51d5d7bbae	Extend `retcon.once` coroutines lowering to optionally produce a normal result (#66333 ) One of the main users of this kind of coroutine is Swift. There, yield-once (`retcon.once`) coroutines are used to temporarily "expose" pointers to internal fields of various objects, creating borrow scopes. However, in some cases it might also be useful to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume coroutines, where C++ `co_return` is transformed to a member / callback call on the promise object). The extension is simple: we allow the continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that would essentially forward them as normal results.	2023-09-15 09:54:38 -07:00
Nikita Popov	4eafc9b6ff	[IR] Treat callbr as special terminator (PR64215) isLegalToHoistInto() currently returns true for callbr instructions. That means that a callbr with one successor will be considered a proper loop preheader, which may result in instructions that use the callbr return value being hoisted past it. Fix this by adding callbr to isExceptionTerminator (with a rename to isSpecialTerminator), which also fixes similar assumptions in other places. Fixes https://github.com/llvm/llvm-project/issues/64215. Differential Revision: https://reviews.llvm.org/D158609	2023-08-25 09:20:18 +02:00
Craig Topper	dc02070d69	[LICM] Check hasNoSignedZeros in hoistFPAssociation. This matches the check done by the Reassociate pass that we're trying to reverse. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D158042	2023-08-23 14:05:34 -07:00
Nikita Popov	69bd66b3ce	[Tests] Remove some and/or constant expressions in tests (NFC) In preparation for their removal in D158081.	2023-08-21 12:05:32 +02:00
Nikita Popov	cc488b80ad	[DSE][LICM] Regenerate test checks (NFC) Avoid spurious variable name changes in future patch.	2023-08-09 14:49:15 +02:00
Paul Osmialowski	8698d56d99	[Transforms][LICM] Add the ability to undo unprofitable reassociation Consider the following piece of code: ``` void innermost_loop(int i, double d1, double d2, double delta, int n, double cells[n]) { int j; const double d1d = d1 * delta; const double d2d = d2 * delta; for (j = 0; j <= i; j++) cells[j] = d1d * cells[j + 1] + d2d * cells[j]; } ``` When compiling at -Ofast level, after the "Reassociate expressions" pass, this code is transformed into an equivalent of: ``` int j; for (j = 0; j <= i; j++) cells[j] = (d1 * cells[j + 1] + d2 * cells[j]) * delta; ``` Effectively, the computation of those loop invariants isn't done before the loop anymore, we have one extra multiplication on each loop iteration instead. Sadly, this results in a significant performance hit. Similarly, specifically crafted user code will also experience inability to hoist those invariants. This patch is solving this issue by adding the ability to undo such reassociation into the LICM pass. Note that for doing such transformation this pass requires the same conditions as the "Reassociate expressions" pass, namely, the involved binary operators must have the reassociations allowed (e.g. by specifying the `fast` attribute) and they must have single use only. Some parts of this patch were suggested by Nikita Popov. Reviewed By: huntergr, nikic, paulwalker-arm Differential Revision: https://reviews.llvm.org/D152281	2023-08-01 16:42:01 +01:00
Paul Osmialowski	89e25a3d65	[Transforms][LICM] A test case for the upcoming fix D152281 for the issue with reassociation profitability This commit introduces a test for the upcoming change addressing the following issue: https://github.com/llvm/llvm-project/issues/62736 Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D152282	2023-08-01 16:42:00 +01:00
Wenlei He	9a868a902c	[LoopSink] Allow sinking to PHI-use (2nd attempt) This change allows sinking defs from loop preheader with PHI-use into loop body. Loop sink can now see through PHI-use and select incoming blocks of value being used as candidate sink destination. It makes loop sink more effective so more LICM can be undone if proven unprofitable with profile info. It addresses the motivating case in D87551, without resorting to profile guided LICM which breaks canonicalization. This is the 2nd attempt after D152772.	2023-06-23 09:52:03 -07:00
Alexander Kornienko	c96c85aba2	Revert "[LoopSink] Allow sinking to PHI-use" This reverts commit 54711a6a5872d5f97da4c0a1bd7e58d0546ca701. The commit is causing a clang crash: https://reviews.llvm.org/D152772#4437254	2023-06-21 17:37:11 +02:00
Carlos Alberto Enciso	c0a986a60f	[LICM] Sunk instructions with invalid source location. Building the given test case with 'clang -O2 -g' the call to 'getInOrder' is sunk out of the loop by LICM, but the source location is not dropped. Reviewed By: aprantl, fdeazeve Differential Revision: https://reviews.llvm.org/D152691	2023-06-16 06:25:27 +01:00
Wenlei He	54711a6a58	[LoopSink] Allow sinking to PHI-use This change allows sinking defs from loop preheader with PHI-use into loop body. Loop sink can now see through PHI-use and select incoming blocks of value being used as candidate sink destination. It makes loop sink more effective so more LICM can be undone if proven unprofitable with profile info. It addresses the motivating case in D87551, without resorting to profile guided LICM which breaks canonicalization. Differential Revision: https://reviews.llvm.org/D152772	2023-06-13 13:06:57 -07:00
Chuanqi Xu	84c033d9ba	[LICM] [Coroutines] Don't hoist threadlocals within presplit coroutines Close https://github.com/llvm/llvm-project/issues/63022 This is a follow-up to https://reviews.llvm.org/D135550, which is discussed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. I think we could fix the issue fundamentally once we introduce a new memory kind for the thread id, but I am not sure that can be done in time. Correctness matters most, so it should not be bad to land this given that it is harmless. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151774	2023-06-07 10:25:47 +08:00
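
Note on the hoistBOAssociation entries above (f2ccf80136, 25da8e5a97 and the follow-ups): the fold "(LV op C1) op C2" ==> "LV op (C1 op C2)" can be pictured at the source level. What follows is only a sketch in C, not code from any of these patches. The function names `fold_before` and `fold_after` are hypothetical, and unsigned 32-bit addition is assumed so that wrap-around is well defined and no nsw/nuw-style flags are involved.

```c
#include <stdint.h>

/* Original shape: the loop-variant i is combined with c1 first,
   so both additions depend on i and stay inside the loop. */
uint32_t fold_before(uint32_t n, uint32_t c1, uint32_t c2) {
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++)
        acc ^= (i + c1) + c2;      /* (LV op C1) op C2 */
    return acc;
}

/* Reassociated shape: c1 + c2 is loop-invariant, so it is computed
   once before the loop (the preheader, at the IR level). */
uint32_t fold_after(uint32_t n, uint32_t c1, uint32_t c2) {
    const uint32_t inv = c1 + c2;  /* hoisted (C1 op C2) */
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++)
        acc ^= i + inv;            /* LV op (C1 op C2) */
    return acc;
}
```

Both functions should agree for all inputs, because unsigned addition is associative modulo 2^32. The pass itself works on LLVM IR, where flag handling needs extra care (see b48819dbcd and 1261c02be4 in the log); this sketch deliberately sidesteps that.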


573 Commits
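
Note on 2a8fda443e (hoistAddSub, unsigned case): the log does not spell out the rewrite, so the C sketch below rests on an assumption about its shape, namely that a comparison `LV + C1 < C2` becomes `LV < C2 - C1`, with `C2 - C1` computed outside the loop. The function names `count_before` and `count_after` are hypothetical. The sketch also assumes that `x + c1` never wraps for the values of `x` visited, and it falls back to the original form when `c2 - c1` would wrap.

```c
#include <stdint.h>

/* Original shape: the addition is redone on every iteration.
   Assumes x + c1 does not wrap for any x in [0, n). */
uint32_t count_before(uint32_t n, uint32_t c1, uint32_t c2) {
    uint32_t hits = 0;
    for (uint32_t x = 0; x < n; x++)
        if (x + c1 < c2)
            hits++;
    return hits;
}

/* Rewritten shape: the invariant c2 - c1 is computed once up front. */
uint32_t count_after(uint32_t n, uint32_t c1, uint32_t c2) {
    if (c2 < c1)                   /* c2 - c1 would wrap: keep original form */
        return count_before(n, c1, c2);
    const uint32_t bound = c2 - c1;
    uint32_t hits = 0;
    for (uint32_t x = 0; x < n; x++)
        if (x < bound)             /* same truth value as x + c1 < c2 */
            hits++;
    return hits;
}
```

The no-wrap preconditions are what make the two comparisons equivalent; without them, `x + c1` could wrap around and compare small even though `x` is large.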