llvm-project

Author	SHA1	Message	Date
Florian Hahn	977c0a6d29	[LAA] Add tests with non-constant strides & distances. Add a number of LAA test cases with both forward and backward dependences with non-constant strides and dependence distances. This includes test coverage for https://github.com/llvm/llvm-project/issues/87336 Also includes a LoopLoadElimination test to make sure the pass does not crash on non-constant dependence distances.	2024-04-08 19:18:38 +01:00
Florian Hahn	a3ad5faa32	[LAA] Fix typo IndidrectUnsafe -> IndirectUnsafe. Fix typo in textual analysis output.	2024-03-12 14:44:04 +00:00
Florian Hahn	b274b23665	[ValueTracking] Treat phi as underlying obj when not decomposing further (#84339 ) At the moment, getUnderlyingObjects simply continues for phis that do not refer to the same underlying object in loops, without adding them to the list of underlying objects, effectively ignoring those phis. Instead of ignoring those phis, add them to the list of underlying objects. This fixes a miscompile where LoopAccessAnalysis fails to identify a memory dependence, because no underlying objects can be found for a set of memory accesses. Fixes https://github.com/llvm/llvm-project/issues/82665. PR: https://github.com/llvm/llvm-project/pull/84339	2024-03-12 08:55:03 +00:00
Florian Hahn	4cfd4a7896	[LAA] Add test case for #82665 . Test case for https://github.com/llvm/llvm-project/issues/82665.	2024-03-07 13:53:03 +00:00
Fangrui Song	3d18c8cd26	[test] Replace aarch64-*-{eabi,gnueabi}{,hf} with aarch64 Similar to d39b4ce3ce8a3c256e01bdec2b140777a332a633 Using "eabi" or "gnueabi" for aarch64 targets is a common mistake and warned by Clang Driver. We want to avoid them elsewhere as well. Just use the common "aarch64" without other triple components.	2024-02-12 18:29:55 -08:00
Nikita Popov	1aee1e1f4c	[Analysis] Convert tests to opaque pointers (NFC)	2024-02-05 12:04:39 +01:00
Nikita Popov	cd7ea4ea65	[LAA] Drop alias scope metadata that is not valid across iterations (#79161 ) LAA currently adds memory locations with their original AATags to AST. However, scoped alias AATags may be valid only within one loop iteration, while LAA reasons across iterations. Fix this by determining which alias scopes are defined inside the loop, and drop AATags that reference these scopes. Fixes https://github.com/llvm/llvm-project/issues/79137.	2024-01-24 11:20:16 +01:00
Nikita Popov	0c02b2e0e0	[LAA] Add test for #79137 (NFC)	2024-01-23 16:54:25 +01:00
Florian Hahn	184290e579	[LAA] Add tests with dependencies that may prevent st-to-ld forwarding. Add test cases with varying distances between stores and loads that may prevent store-to-load forwarding.	2023-12-10 13:56:53 +00:00
Florian Hahn	cd4067af36	[LAA] Remove duplicated test. depend_diff_types.ll already covers the same tests after it has been converted to opaque pointers, so remove the redundant depend_diff_types_opaque_ptr.ll.	2023-12-09 21:27:42 +00:00
Alexandros Lamprineas	3ad6d1cbe5	[LAA] Fix incorrect dependency classification. (#70819 ) As shown in #70473, the following loop was not considered safe to vectorize. When determining the memory access dependencies in a loop which has negative iteration step, we invert the source and sink of the dependence. Perhaps we should just invert the operands to getMinusSCEV(). This way the dependency is not regarded to be true, since the users of the `IsWrite` variables, which correspond to each of the memory accesses, rely on program order and therefore should not be swapped. void vectorizable_Read_Write(int *A) { for (unsigned i = 1022; i >= 0; i--) A[i+1] = A[i] + 1; }	2023-12-05 15:27:30 +00:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
Florian Hahn	17139f38e5	[LAA] Check HasSameSize before couldPreventStoreLoadForward. After 9645267, TypeByteSize is 0 if both accesses do not have the same size (i.e. HasSameSize will be false). This can cause an infinite loop in couldPreventStoreLoadForward, if HasSameSize is not checked first. So check HasSameSize first instead of after couldPreventStoreLoadForward. Checking HasSameSize first is also cheaper.	2023-11-27 10:10:41 +00:00
Florian Hahn	2fda8ca6da	[LAA] Auto-generate checks for forward-loop-carried.ll Auto-generate checks for forward-loop-carried.ll to make it easier to update in a follow-on patch. As this test only checks the dependence, mark pointers as noalias to avoid also checking various runtime pointer check groups.	2023-11-27 10:06:17 +00:00
Florian Hahn	5d353423c9	[LAA] Add extra test for #70819 showing incorrect Forward dep. Add an additional test case where we currently incorrectly identify a dependence as Forward instead of ForwardButPreventsForwarding. Also cleans up the names in the tests a bit to improve readability.	2023-11-20 11:18:13 +00:00
Florian Hahn	1dbcaf2777	[LAA] Check if dependencies access loop-varying underlying objects. This patch adds a new dependence kind UnsafeIndirect, for cases where at least one of the memory access instructions may access a loop varying object, e.g. the address of underlying object is loaded inside the loop, like A[B[i]]. We cannot determine direction or distance in those cases, and also are unable to generate any runtime checks. This fixes a miscompile, if we attempt to generate runtime checks for unknown dependencies. Note that in most cases we do not attempt to generate runtime checks for unknown dependences, except if FoundNonConstantDistanceDependence is true. Fixes https://github.com/llvm/llvm-project/issues/69744.	2023-11-15 21:58:57 +00:00
Florian Hahn	c491c93365	[LAA] Refine tests added in 9c535a3c2ef. Refine FIXMEs in added tests; the problematic case only materializes if there is both a read and a write from an indirect address.	2023-11-13 19:19:57 +00:00
Florian Hahn	24839c3253	[UTC] Escape multiple {{ or }} in input for check lines. (#71790 ) SCEV expressions may contain multiple {{ or }} in the debug output, which needs escaping. See llvm/test/Analysis/LoopAccessAnalysis/loops-with-indirect-reads-and-writes.ll for a test that needs escaping.	2023-11-09 17:18:11 +00:00
Florian Hahn	9c535a3c2e	[LAA] Add tests for #69744 . Note that both loops in the tests are needed to incorrectly determine that the loops are safe with runtime checks via FoundNonConstantDistanceDependence handling code in LAA.	2023-11-09 09:59:48 +00:00
Alexandros Lamprineas	7d21d7395c	[LAA] Add a test case to show incorrect dependency classification (NFC). (#70473 ) Currently the loop access analysis classifies this loop as unsafe to vectorize because the memory dependencies are 'ForwardButPreventsForwarding'. However, the access pattern is 'write-after-read' with no subsequent read accessing the written memory locations. I can't see how store-to-load forwarding is applicable here. void vectorizable_Read_Write(int *A) { for (unsigned i = 1022; i >= 0; i--) A[i+1] = A[i] + 1; }	2023-10-31 15:01:28 +00:00
Ramkumar Ramachandra	4c01a58008	update_analyze_test_checks: support output from LAA (#67584 ) update_analyze_test_checks.py is an invaluable tool in updating tests. Unfortunately, it only supports output from the CostModel, ScalarEvolution, and LoopVectorize analyses. Many LoopAccessAnalysis tests use hand-crafted CHECK lines, and it is moreover tedious to generate these CHECK lines, as the output from the analysis is not stable, and requires the test-writer to hand-craft FileCheck matches. Alleviate this pain, and support output from: $ opt -passes='print<loop-accesses>' This patch includes several non-trivial changes including: - Preserving whitespace at the beginning of the line, so that the LAA output can be properly indented. - Regexes matching the unstable output, which is basically a pointer address hex. - Separating is_analyze from preserve_names clearly, as the former was formerly used as an overload for the latter. To demonstrate the utility of this patch, several tests in LoopAccessAnalysis have been auto-generated by update_analyze_test_checks.py.	2023-10-31 14:33:53 +00:00
Allen	46cb7e4eea	[LoopDist] Update the pragma info of loop distribute, NFC (#69825 ) Based on D19403, the exact pragma for loop distribution is `#pragma clang loop distribute`.	2023-10-28 17:47:46 +08:00
Allen	48caa0723c	[LAA] Analyze pointers forked by a phi (#65834 ) Given a function like the following: https://godbolt.org/z/T9c99fr88 ```c 1161_noReadWrite(int *Preds) { for (int i = 0; i < LEN_1D-1; ++i) { if (Preds[i] != 0) b[i] = c[i] + 1; else a[i] = i * i; } } ``` LLVM will optimize the IR to a single store by a phi instruction: ```llvm %1 = load ptr, ptr @a, align 64 %2 = load ptr, ptr @b, align 64 ... for.inc: %.sink = phi ptr [ %1, %if.then ], [ %2, %if.else ] %add.sink = phi double [ %add, %if.then ], [ %conv8, %if.else ] %arrayidx7 = getelementptr inbounds double, ptr %.sink, i64 %indvars.iv store double %add.sink, ptr %arrayidx7, align 8 ``` LAA is currently unable to analyze such IR, since ScalarEvolution will return a SCEVUnknown for the forked pointer operand of the store. This patch adds initial optional support for analyzing both possibilities for the pointer and allowing LAA to generate runtime checks for the bounds if required. It follows D108699, but addresses the phi node case. Fixes https://github.com/llvm/llvm-project/issues/64888 Reviewed By: huntergr-arm, fhahn Differential Revision: https://reviews.llvm.org/D158965	2023-09-19 09:16:47 +08:00
Nikita Popov	efe4e7a026	[SCEV] Fix incorrect nsw inference for multiply of addrec (#66500 ) SCEV currently preserves the nsw flag when performing an nsw multiply of an nsw addrec. While this is legal for nuw, this is not generally the case for nsw. This is because nsw mul does not distribute over nsw add: https://alive2.llvm.org/ce/z/mergCt Instead, we need either both nuw and nsw to be set (https://alive2.llvm.org/ce/z/7wpgGc) or explicitly prove that the distributed multiplications are also nsw (https://alive2.llvm.org/ce/z/wef9su). Fixes https://github.com/llvm/llvm-project/issues/66066.	2023-09-18 08:23:10 +02:00
Allen	76c6a8bd36	[LAA] Improve the output remark for LoopVectorize (#65832 ) Don't report 'Use #pragma loop distribute(enable) to allow loop distribution' when #pragma clang loop distribute(enable) has already been added. Fixes https://github.com/llvm/llvm-project/issues/64637	2023-09-16 12:51:44 +08:00
Michael Maitland	87ddd3a191	[LAA] Rename and fix semantics of MaxSafeDepDistBytes to MinDepDistBytes `MaxSafeDepDistBytes` was not correct based on its name and semantics in instances when there was a non-unit stride loop. For example, ``` for (int k = 0; k < len; k+=3) { a[k] = a[k+4]; a[k+2] = a[k+6]; } ``` Here, the smallest dependence distance is 24 bytes, but only vectorizing 8 bytes is safe. `MaxSafeVectorWidthInBits` reported the correct number of bits that could be vectorized as 64 bits. The semantics of `MaxSafeDepDistBytes` should be: The smallest dependence distance in bytes in the loop. This may not be the same as the maximum number of bytes that are safe to operate on simultaneously. The name of this variable should reflect those semantics and its docstring should be updated accordingly, `MinDepDistBytes`. A debug message that used `MaxSafeDepDistBytes` to signify to the user how many bytes could be accessed in parallel is updated to use `MaxSafeVectorWidthInBits` instead. That way, the same message is communicated to the user, just in different units. This patch makes sure that when `MinDepDistBytes` is modified in a way that should impact `MaxSafeVectorWidthInBits`, that we update the latter accordingly. This patch also clarifies why `MaxSafeVectorWidthInBits` does not need to be updated when `MinDepDistBytes` is (i.e. in the case of a forward dependency). Differential Revision: https://reviews.llvm.org/D156158	2023-08-16 09:53:35 -07:00
Nikita Popov	edb2fc6dab	[llvm] Remove explicit -opaque-pointers flag from tests (NFC) Opaque pointers mode is enabled by default, no need to explicitly enable it.	2023-07-12 14:35:55 +02:00
Michael Maitland	aef6d4610f	[LAA] Add test that shows MaxSafeDepDistBytes is incorrect. NFC. This precommit patch shows MaxSafeDepDistBytes is 24 when it should be 8. Differential Revision: https://reviews.llvm.org/D154173	2023-07-06 14:37:30 -07:00
Philip Reames	78ae870f11	[tests] Rerun autogen to reduce a diff [nfc]	2023-03-31 12:47:08 -07:00
Bjorn Pettersson	81d6310da1	[LAA] Fix transitive analysis invalidation bug by implementing LoopAccessInfoManager::invalidate The default invalidate method for analysis results is just looking at the preserved state of the pass itself. It does not consider if the analysis has an internal state that depends on other analyses. Thus, we need to implement LoopAccessInfoManager::invalidate in order to catch if LoopAccessAnalysis needs to be invalidated due to transitive analyses such as AAManager being invalidated. Otherwise we might end up having references to an AAManager that is stale. Fixes https://github.com/llvm/llvm-project/issues/61324 Differential Revision: https://reviews.llvm.org/D146206	2023-03-17 09:33:16 +01:00
Nikita Popov	12aef5df0c	[LAA] Convert test to opaque pointers (NFC) When converting this test to opaque pointers (and dropping bitcast), we get improved memory checks. Per fhahn: > It looks like the difference is due to the logic that determines > pointer strides in LAA not handling bitcasts. Without the > bitcasts, the logic now triggers successfully. Differential Revision: https://reviews.llvm.org/D140204	2022-12-19 16:58:33 +01:00
Nikita Popov	05dc149c87	[LAA] Convert tests to opaque pointers (NFC)	2022-12-16 12:45:59 +01:00
Nikita Popov	07e1c9978d	[LAA] Name instructions in test (NFC) And regenerate test checks.	2022-12-16 12:32:58 +01:00
Roman Lebedev	543f1aa603	[NFC] Port all Analysis/LoopAccessAnalysis tests to `-passes=` syntax	2022-12-09 01:04:45 +03:00
Roman Lebedev	b1a9584818	[opt] Disincentivize new tests from using old pass syntax Over the past day or so, I've taken a large swing at our tests, and reduced the number of tests that were still using the old syntax from ~1800 to just 200. Left to handle: (as it is seen in this patch) * Transforms/LSR * Transforms/CGP * Transforms/TypePromotion * Transforms/HardwareLoops * Analysis/* * some misc. I think this is the right point to start actively refusing to honor the old syntax, except for the old tests, to prevent the old syntax from creeping back in. Thus, let's add a temporary default-off flag, and if it is not passed refuse to accept old syntax. The tests that still need porting are annotated with this flag. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D139647	2022-12-08 23:54:03 +03:00
Nikita Popov	4de3184f07	[LAA] Use cross-iteration alias analysis LAA analyzes cross-iteration memory dependencies, as such AA should not make assumptions about equality of values inside the loop, as they may come from different iterations. Fix this by exposing the MayBeCrossIteration AA flag and enabling it for LAA. Differential Revision: https://reviews.llvm.org/D137958	2022-12-05 09:27:13 +01:00
Graham Hunter	3c74ed9ee3	[LAA] Fix ICE with scAddExpr in forked pointers The IR from https://github.com/llvm/llvm-project/issues/57368 results in an assert firing when trying to create a runtime check for the forked pointer. One of the forks is fine since it's loop invariant, but the other is a scAddExpr (containing a scAddRecExpr, so not invariant) when RtCheck::insert expects a scAddRecExpr. This is a simple fix to just avoid forks which aren't AddRec or loop invariant. We can allow it as a forked pointer later with more work. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D133020	2022-09-21 10:27:06 +01:00
Florian Hahn	555e09c2b0	[LAA] Rename printing pass to print<access-info>. This updates the naming for the LAA printing pass to be in line with most other analysis printing passes. The old name has come up as confusing multiple times already, e.g. in D131924.	2022-08-26 11:00:09 +01:00
Florian Hahn	494b6c46d6	[LAA] Add test cases where BTC can be used to rule out dependences. Test cases for using the backedge-taken-count to rule out dependencies between an invariant and strided accesses.	2022-08-22 13:11:26 +01:00
Graham Hunter	70d35443dc	[LAA] Handle forked pointers with add/sub instructions Handle cases where a forked pointer has an add or sub instruction before reaching a select. Reviewed By: fhahn Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D130278	2022-08-17 09:51:13 +01:00
Max Kazantsev	8e9e27ae90	[Test] Fix block name in test	2022-07-28 13:42:14 +07:00
Max Kazantsev	2d1c6e0b44	[LAA] Remove block order sensitivity in LAA algorithm. PR56672 As the test in PR56672 shows, LAA produces different results which lead to either positive or negative vectorization decisions depending on the order of blocks in the loop. The exact reason for this is not clear to me, however this makes investigation of related bugs extremely complex. The current order of blocks in the loop is arbitrary. It may change, for example, if loop info analysis is dropped and recomputed. It seems that this interferes with LAA's logic. This patch chooses a fixed traversal order of blocks in loops, making it RPOT. Note: this is not a fix for the bug with the incorrect analysis result. It just makes the answer more robust to make the investigation easier. Differential Revision: https://reviews.llvm.org/D130482 Reviewed By: aeubanks, fhahn	2022-07-28 13:36:56 +07:00
Graham Hunter	0a715c1146	[LAA] Precommit add/sub tests for forked pointers Adds new tests for add and sub instructions before reaching a select. Also adds tests using different bit widths for memory, including non-power-of-two integers.	2022-07-21 15:16:15 +01:00
Graham Hunter	db8fcb2c25	[LAA] Add recursive IR walker for forked pointers This builds on the previous forked pointers patch, which only accepted a single select as the pointer to check. A recursive function to walk through IR has been added, which searches for either a loop-invariant or addrec SCEV. This will only handle a single fork at present, so selects of selects or a GEP with a select for both the base and offset will be rejected. There is also a recursion limit with a cli option to change it. Reviewed By: fhahn, david-arm Differential Revision: https://reviews.llvm.org/D108699	2022-07-18 12:06:17 +01:00
Graham Hunter	a19cf47da0	[LAA] Precommit some extra tests for forked pointers * Converted tests to use opaque pointers * Added suggested test for inbounds GEP * Added a test for forks on both the base and offset terms of a GEP * Added a test for a select of a select * Added a test for a GEP with >2 operands * Added a test for vector GEPs	2022-07-13 10:32:35 +01:00
Florian Hahn	e9cced2739	Recommit "[LAA] Initial support for runtime checks with pointer selects." This reverts commit 7aa8a678826dea86ff3e6c7df9d2a8a6ef868f5d. This version includes fixes to address issues uncovered after the commit landed and discussed at D11448. Those include: * Limit select-traversal to selects inside the loop. * Freeze pointers resulting from looking through selects to avoid branch-on-poison.	2022-06-17 21:06:26 +02:00
Alexander Kornienko	7aa8a67882	Revert "[LAA] Initial support for runtime checks with pointer selects." This reverts commit 5890b30105999a137e72e42f3760bebfd77001ca as per discussion on the review thread: https://reviews.llvm.org/D114487#3547560.	2022-06-01 15:24:27 +02:00
Florian Hahn	5890b30105	[LAA] Initial support for runtime checks with pointer selects. Scaffolding support for generating runtime checks for multiple SCEV expressions per pointer. The initial version just adds support for looking through a single pointer select. The more sophisticated logic for analyzing forks is in D108699 Reviewed By: huntergr Differential Revision: https://reviews.llvm.org/D114487	2022-05-12 19:33:48 +01:00
Florian Hahn	3c14836093	[LAA] Add test with simpler load of pointer select. Add a simpler test for D114487/D108699.	2022-04-10 23:54:41 +02:00
Arthur Eubanks	f72b76cde5	[test] Replace/remove some 'opt -analyze' RUN lines	2022-02-09 15:49:53 -08:00

148 Commits