llvm-project

Author	SHA1	Message	Date
Philip Reames	30cdb2ac7e	[LAA] Add command line flag to disable unit stride speculation This is purely so that we can expose and work through downstream codegen issues. My intention is to see if we can get this disabled by default, but that requires fixing a bunch of downstream issues first.	2023-05-01 10:49:51 -07:00
Philip Reames	4343534a67	[LAA] Rework overflow checking in getPtrStride [nfc] The previous code structure and comments were exceedingly confusing. I have multiple times looked at this code and suspected a bug. This time, I decided to take the time to reflow the code and comment out why it is correct. The only suspect (to me) case left is that an underaligned access with a unit stride (in terms of the access type) might miss the undefined null pointer when wrapping. This is unlikely to be an issue for C/C++ code with real page sizes, so I'm not bothering to fully convince myself whether that case is correct or not.	2023-05-01 10:21:02 -07:00
Philip Reames	89a44b0fae	[LAA] Use early return [nfc]	2023-05-01 08:35:56 -07:00
Paul Osmialowski	9cf1881f8f	[SCEV] Do not plant SCEV checks unnecessarily The vectorisation analysis collects strides for loop invariant pointers, which is wrong because they are not strided. We don't need to generate SCEV checks (which are costly performancewise) for such pointers, we just need to do the appropriate aliasing checks. This patch fixes the problem by changing getStrideFromPointer() to treat loop invariant pointers as having no stride. Originally proposed by David Sherwood with further suggestions from Florian Hahn. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D146958	2023-04-25 21:47:14 +01:00
Philip Reames	0437f88b77	[LAA] Cleanup casting in replaceSymbolicStrideSCEV [nfc]	2023-04-06 09:13:55 -07:00
Philip Reames	2d79b71366	[LAA] Continue moving utilities to sole use to isolate symbolic stride reasoning [nfc]	2023-04-06 08:27:57 -07:00
Philip Reames	800a99c4f4	[LAA] Group implementation of stride speculation into one file [nfc] These utilities are only used in one place, so move them there and make them static.	2023-04-05 20:39:08 -07:00
Bjorn Pettersson	951a980dc7	[Analysis] Make order of analysis executions more stable When debugging and using debug-pass-manager (e.g. in regression tests) we prefer a consistent order in which analysis passes are executed. But when for example doing return MyClass(AM.getResult<LoopAnalysis>(F), AM.getResult<DominatorTreeAnalysis>(F)); then the order in which LoopAnalysis and DominatorTreeAnalysis are executed isn't guaranteed, and might for example depend on which compiler is used when building LLVM. I've not scanned the full source tree, but this fixes some occurrences of the above pattern found in lib/Analysis. This problem was discussed briefly in review for D146206.	2023-03-17 09:33:16 +01:00
Bjorn Pettersson	81d6310da1	[LAA] Fix transitive analysis invalidation bug by implementing LoopAccessInfoManager::invalidate The default invalidate method for analysis results just looks at the preserved state of the pass itself. It does not consider whether the analysis has internal state that depends on other analyses. Thus, we need to implement LoopAccessInfoManager::invalidate in order to catch cases where LoopAccessAnalysis needs to be invalidated because transitive analyses such as AAManager are being invalidated. Otherwise we might end up having references to an AAManager that is stale. Fixes https://github.com/llvm/llvm-project/issues/61324 Differential Revision: https://reviews.llvm.org/D146206	2023-03-17 09:33:16 +01:00
Guillaume Chatelet	8fd5558b29	[NFC] Use TypeSize::getFixedValue() instead of TypeSize::getFixedSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.	2023-01-11 16:49:38 +00:00
Fangrui Song	d4b6fcb32e	[Analysis] llvm::Optional => std::optional	2022-12-14 07:32:24 +00:00
Nikita Popov	4de3184f07	[LAA] Use cross-iteration alias analysis LAA analyzes cross-iteration memory dependencies, as such AA should not make assumptions about equality of values inside the loop, as they may come from different iterations. Fix this by exposing the MayBeCrossIteration AA flag and enabling it for LAA. Differential Revision: https://reviews.llvm.org/D137958	2022-12-05 09:27:13 +01:00
Nikita Popov	e95ca5bb05	[AST] Make AliasSetTracker work on BatchAA D138014 restricted AST to work on immutable IR. This means it is also safe to use a single BatchAA instance for the entire AST lifetime, instead of only batching parts of individual queries. The primary motivation for this is not compile-time, but rather having a central place to control cross-iteration AA, which will be used by D137958. Differential Revision: https://reviews.llvm.org/D137955	2022-12-05 08:12:26 +01:00
Benjamin Kramer	856f7937c7	Compress a few pairs using PointerIntPairs Use the uniform structured bindings interface where possible. NFCI.	2022-12-04 16:55:16 +01:00
Kazu Hirata	19aff0f37d	[Analysis] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 19:43:04 -08:00
Florian Hahn	db720dc17c	[LAA] Use LoopAccessInfoManager in legacy pass. Simplify LoopAccessLegacyAnalysis by using LoopAccessInfoManager from D134606. As a side-effect this also removes printing support from LoopAccessLegacyAnalysis. Depends on D134606. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134608	2022-10-04 08:37:11 +01:00
Florian Hahn	7c0ff64b0f	[LAA] Change to function analysis for new PM. At the moment, LoopAccessAnalysis is a loop analysis for the new pass manager. The issue with that is that LAI caches SCEV expressions and modifications in a loop may impact SCEV expressions in other loops, but we do not have a convenient way to invalidate LAI for other loops within a loop pipeline. To avoid this issue, turn it into a function analysis which returns a manager object that keeps track of the individual LAI objects per loop. Fixes #50940. Fixes #51669. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134606	2022-10-01 15:44:27 +01:00
Philip Reames	f6d110e26f	[LAA] Make getPtrStride return Option instead of overloading zero as error value [nfc] This is purely NFC restructure in advance of a change which actually exposes zero strides. This is mostly because I find this interface confusing each time I look at it.	2022-09-27 15:55:44 -07:00
Graham Hunter	3c74ed9ee3	[LAA] Fix ICE with scAddExpr in forked pointers The IR from https://github.com/llvm/llvm-project/issues/57368 results in an assert firing when trying to create a runtime check for the forked pointer. One of the forks is fine since it's loop invariant, but the other is a scAddExpr (containing a scAddRecExpr, so not invariant) when RtCheck::insert expects a scAddRecExpr. This is a simple fix to just avoid forks which aren't AddRec or loop invariant. We can allow it as a forked pointer later with more work. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D133020	2022-09-21 10:27:06 +01:00
Kazu Hirata	ce9f007c7c	[llvm] Use llvm::find_if (NFC)	2022-08-28 10:41:48 -07:00
Florian Hahn	9405af1c85	[LAA] Require AddRecs to be in the innermost loop for diff-checks. The simpler diff-checks require pointers with add-recs from the same innermost loop, but this property wasn't checked completely. Add the missing check to ensure both addrecs are in the innermost loop. Fixes #57315.	2022-08-26 20:39:52 +01:00
Philip Reames	86b67a310d	[LAA] Prune dependencies with distance larger than access implied by trip count When we have a dependency with a dependence distance which can only be hit on an iteration beyond the actual trip count of the loop, we can ignore that dependency when analyzing said loop. We already had this code, but had restricted it solely to unknown dependence distances. This change applies it to all dependence distances. Without this code, we relied on the vectorizer reducing VF such that our infeasible dependence was respected. This usually worked out to about the same result, but not always. For fixed length vectorization, this could mean a smaller VF than optimal being chosen or additional runtime checks. For scalable vectorization - where the bounds on access implied by VF are broader - we could often not find a feasible VF at all. Differential Revision: https://reviews.llvm.org/D131924	2022-08-25 14:24:13 -07:00
Florian Hahn	c035efc814	[LAA] Cache PSE.getSE() in variable (NFC). Preparation for follow-up patches that will introduce additional uses of SE.	2022-08-25 21:40:22 +01:00
Aditya Kumar	0af3ab02fd	[NFC] LoopAccess: Move expressions close to usage Avoids useless evaluation of these expressions. Reviewed By: michaelmaitland, fhahn Differential Revision: https://reviews.llvm.org/D132337	2022-08-23 07:08:42 -07:00
Michael Maitland	f29401fcdf	[LoopVectorize][LoopAccessAnalysis] add newline to debug message A debug message in `LoopAccessAnalysis` did not have a newline in it, causing printed debug messages to be formatted incorrectly. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D132172	2022-08-18 13:44:05 -07:00
Graham Hunter	70d35443dc	[LAA] Handle forked pointers with add/sub instructions Handle cases where a forked pointer has an add or sub instruction before reaching a select. Reviewed By: fhahn Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D130278	2022-08-17 09:51:13 +01:00
Max Kazantsev	2d1c6e0b44	[LAA] Remove block order sensitivity in LAA algorithm. PR56672 As test in PR56672 shows, LAA produces different results which lead to either positive or negative vectorization decisions depending on the order of blocks in loop. The exact reason of this is not clear to me, however this makes investigation of related bugs extremely complex. Current order of blocks in the loop is arbitrary. It may change, for example, if loop info analysis is dropped and recomputed. Seems that it interferes with LAA's logic. This patch chooses fixed traversal order of blocks in loops, making it RPOT. Note: this is not a fix for bug with incorrect analysis result. It just makes the answer more robust to make the investigation easier. Differential Revision: https://reviews.llvm.org/D130482 Reviewed By: aeubanks, fhahn	2022-07-28 13:36:56 +07:00
Kazu Hirata	acf648b5e9	Use llvm::less_first and llvm::less_second (NFC)	2022-07-24 16:21:29 -07:00
Kazu Hirata	97718180d7	[Analysis] Remove a redundant return statement (NFC) Identified with readability-redundant-control-flow.	2022-07-23 11:35:19 -07:00
Arthur Eubanks	04d398db46	[LoopAccessAnalysis] Simplify D119047 No need to add checks for every type per pointer that we couldn't create a check for the first time around, just the types that weren't successful. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D119376	2022-07-21 12:16:02 -07:00
Philip Reames	f494f89b2a	[LAA] Fix latent missing check bug when mixing scalable and non-scalable strides Noticed via inspection; to my knowledge, impossible to hit today. In theory, we could have a fixed stride check be analyzed, then a scalable one. With the old code, the scalable one would be silently dropped, and the runtime guard would go ahead with only the fixed one. This would be a miscompile.	2022-07-20 11:56:45 -07:00
Benjamin Kramer	4bd072c56b	[LAA] Fix the build with older versions of Clang llvm/lib/Analysis/LoopAccessAnalysis.cpp:916:12: error: no viable conversion from returned value of type 'SmallVector<[...], 2>' to function return type 'SmallVector<[...], (default) CalculateSmallVectorDefaultInlinedElements<T>::value aka 3>' return Scevs; ^~~~~	2022-07-18 14:01:47 +02:00
Graham Hunter	db8fcb2c25	[LAA] Add recursive IR walker for forked pointers This builds on the previous forked pointers patch, which only accepted a single select as the pointer to check. A recursive function to walk through IR has been added, which searches for either a loop-invariant or addrec SCEV. This will only handle a single fork at present, so selects of selects or a GEP with a select for both the base and offset will be rejected. There is also a recursion limit with a cli option to change it. Reviewed By: fhahn, david-arm Differential Revision: https://reviews.llvm.org/D108699	2022-07-18 12:06:17 +01:00
Kazu Hirata	601b3a13de	[Analysis] Qualify auto variables in for loops (NFC)	2022-07-16 23:26:34 -07:00
Florian Hahn	e9cced2739	Recommit "[LAA] Initial support for runtime checks with pointer selects." This reverts commit 7aa8a678826dea86ff3e6c7df9d2a8a6ef868f5d. This version includes fixes to address issues uncovered after the commit landed and discussed at D11448. Those include: * Limit select-traversal to selects inside the loop. * Freeze pointers resulting from looking through selects to avoid branch-on-poison.	2022-06-17 21:06:26 +02:00
Alexander Kornienko	7aa8a67882	Revert "[LAA] Initial support for runtime checks with pointer selects." This reverts commit 5890b30105999a137e72e42f3760bebfd77001ca as per discussion on the review thread: https://reviews.llvm.org/D114487#3547560.	2022-06-01 15:24:27 +02:00
Florian Hahn	b7315ffc3c	[LAA,LV] Add initial support for pointer-diff memory checks. This patch adds initial support for a pointer diff based runtime check scheme for vectorization. This scheme requires fewer computations and checks than the existing full overlap checking, if it is applicable. The main idea is to only check if source and sink of a dependency are far enough apart so the accesses won't overlap in the vector loop. To do so, it is sufficient to compute the difference and compare it to `VF * UF * AccessSize`. It is sufficient to check `(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards dependence in the vector loop with the given VF and UF. If Src >=u Sink, there is no dependence preventing vectorization, hence the overflow should not matter and using the ULT should be sufficient. Note that the initial version is restricted in multiple ways: 1. Pointers must only either be read or written, by a single instruction (this allows re-constructing source/sink for dependences with the available information) 2. Source and sink pointers must be add-recs, with matching steps 3. The step must be a constant. 4. abs(step) == AccessSize. Most of those restrictions can be relaxed in the future. See https://github.com/llvm/llvm-project/issues/53590. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D119078	2022-05-16 15:27:22 +01:00
Florian Hahn	5890b30105	[LAA] Initial support for runtime checks with pointer selects. Scaffolding support for generating runtime checks for multiple SCEV expressions per pointer. The initial version just adds support for looking through a single pointer select. The more sophisticated logic for analyzing forks is in D108699 Reviewed By: huntergr Differential Revision: https://reviews.llvm.org/D114487	2022-05-12 19:33:48 +01:00
Igor Kirillov	4e5e042d9a	[LoopVectorize] Support reductions that store intermediary result Adds ability to vectorize loops containing a store to a loop-invariant address as part of a reduction that isn't converted to SSA form due to lack of aliasing info. Runtime checks are generated to ensure the store does not alias any other accesses in the loop. Ordered fadd reductions are not yet supported. Differential Revision: https://reviews.llvm.org/D110235	2022-05-03 10:12:30 +01:00
Chang-Sun Lin Jr	7ee30a0e24	[NFC][LAA] Match-up type sizes for possible extensions, based on actual bit-size rather than rounded-up byte size. Differential Revision: https://reviews.llvm.org/D119200	2022-04-22 23:16:20 -07:00
Kazu Hirata	9aa52ba574	[Analysis] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)	2022-03-20 18:21:40 -07:00
serge-sans-paille	71c3a5519d	Cleanup includes: LLVMAnalysis Number of lines output by preprocessor: before: 1065940348 after: 1065307662 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120659	2022-03-01 18:01:54 +01:00
Malhar Jajoo	9f1c6fbf11	[LAA] Add remarks for unbounded array access Adds new optimization remarks when loop vectorization fails due to the compiler being unable to find the bound of an array access inside a loop. Differential Revision: https://reviews.llvm.org/D115873	2022-02-23 15:57:39 +00:00
Thomas Preud'homme	40f9081958	[LAA] Add missing newline in debug print	2022-02-23 13:25:16 +00:00
Philip Reames	5ba115031d	[PSE] Remove assumption that top level predicate is union from public interface [NFC] Note that this doesn't actually cause the top level predicate to become a non-union just yet. The above comes from a case in the LoopVectorizer where a predicate which is later proven no longer blocks vectorization due to a change from checking if predicates exists to whether the predicate is possibly false.	2022-02-10 16:14:52 -08:00
Arthur Eubanks	ff31020ee6	[OpaquePtr][LoopAccessAnalysis] Support opaque pointers Previously we relied on the pointee type to determine what type we need to do runtime pointer access checks. With opaque pointers, we can access a pointer with more than one type, so now we keep track of all the types we're accessing a pointer's memory with. Also some other minor getPointerElementType() removals. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D119047	2022-02-09 09:11:27 -08:00
Malhar Jajoo	778b455dd6	[LAA] Add Memory dependence remarks. Adds new optimization remarks when vectorization fails. More specifically, new remarks are added for the following 4 cases: - Backward dependency - Backward dependency that prevents Store-to-load forwarding - Forward dependency that prevents Store-to-load forwarding - Unknown dependency It is important to note that only one of the sources of failure (to vectorize) is reported by the remarks. This source of failure may not be first in program order. A regression test has been added to test the following cases: a) Loop can be vectorized: No optimization remark is emitted b) Loop can not be vectorized: In this case an optimization remark will be emitted for one source of failure. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D108371	2022-02-02 12:07:51 +00:00
Kazu Hirata	b752eb887f	[Analysis] Use default member initialization (NFC) Identified with modernize-use-default-member-init.	2022-01-23 20:32:56 -08:00
Florian Hahn	d8276208be	[LAA] Remove overeager assertion for aggregate types. 0a00d64 turned an early exit here into an assertion, but the assertion can be triggered, as PR52920 shows. The later code is agnostic to the accessed type, so just drop the assert. The patch also adds tests for LAA directly and loop-load-elimination to show the behavior is sane.	2022-01-04 15:20:35 +00:00
Jolanta Jensen	77b2bb5567	[LAA] Use type sizes when determining dependence. In the isDependence function the code does not try hard enough to determine the dependence between types. If the types are different it simply gives up, whereas in fact what we really care about are the type sizes. I've changed the code to compare sizes instead of types. Reviewed By: fhahn, sdesmalen Differential Revision: https://reviews.llvm.org/D108763	2021-12-08 15:00:58 +00:00
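
The pointer-diff runtime check described in commit b7315ffc3c above can be illustrated with a minimal C++ sketch. This is not LLVM's implementation: the helper name pointerDiffMayConflict is hypothetical, and plain 64-bit integers stand in for the pointer values. It only mirrors the formula quoted in the commit message, `(Sink - Src) <u VF * UF * AccessSize`, where a true result means the accesses are too close together for the given VF and UF.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): evaluates the unsigned comparison
// (Sink - Src) <u VF * UF * AccessSize from the commit message.
// Returns true when the source and sink are too close together, i.e. the
// vector loop could hit a backwards dependence and must not be taken.
bool pointerDiffMayConflict(std::uint64_t Src, std::uint64_t Sink,
                            std::uint64_t VF, std::uint64_t UF,
                            std::uint64_t AccessSize) {
  // Unsigned subtraction wraps around when Src > Sink, which yields a very
  // large difference. The comparison is then false, matching the note that
  // no dependence prevents vectorization when Src >=u Sink (assuming the
  // threshold itself is small, as it is for realistic VF and UF).
  std::uint64_t Diff = Sink - Src;
  return Diff < VF * UF * AccessSize;
}
```

For example, with VF = 4, UF = 1 and 4-byte accesses the threshold is 16 bytes: a sink 8 bytes past the source is flagged as a conflict, while a sink 16 bytes past the source is not.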


492 Commits