llvm-project

Author	SHA1	Message	Date
Michael Kruse	22c77f2354	[Polly] Use separate DT/LI/SE for outlined subfn. NFC. (#102460 ) DominatorTree, LoopInfo, and ScalarEvolution are function-level analyses that expect to be called only on instructions and basic blocks of the function they were originally created for. When Polly outlined a parallel loop body into a separate function, it reused the same analyses, which seemed to work until the new checks to be added in #101198. This patch creates new analyses for the subfunctions. GenDT, GenLI, and GenSE now refer to the analyses of the current region of code. Outside of an outlined function, they refer to the same analysis as used for the SCoP, but are substituted within an outlined function. In addition to the cross-function queries of DT/LI/SE, we must not create SCEVs that refer to a mix of expressions for old and generated values. Currently, SCEVs themselves do not "remember" which ScalarEvolution analysis they were created for, but mixing them is just as unexpected as using DT/LI across function boundaries. Hence `SCEVLoopAddRecRewriter` was combined into `ScopExpander`. `SCEVLoopAddRecRewriter` only replaced induction variables but left SCEVUnknowns to reference the old function. `SCEVParameterRewriter` would have done so but its job was effectively superseded by `ScopExpander`, and now also `SCEVLoopAddRecRewriter`. Some issues persist but are marked with a FIXME in the code. Changing them would possibly cause this patch to be not NFC anymore.	2024-08-10 14:25:15 +02:00
Karthika Devi C	1e5334bcda	[Polly] Data flow reduction detection to cover more cases (#84901 ) The base concept is same as existing reduction algorithm where we get the list of candidate pairs <store,load>. But the existing algorithm works only if there is single binary operation between the load and store. Example sum += a[i]; This algorithm extends to work with more than single binary operation as well. It is implemented using data flow reduction detection on basic block level. We propagate the loads, the number of times the load is used(flows into instruction) and binary operation performed until we reach a store. Example sum += a[i] + b[i]; ``` sum(Ld) a[i](Ld) \ + / tmp b[i](Ld) \ + / sum(St) ``` In the above case the candidate pairs are formed by associating sum with all of its load inputs which are sum, a[i] and b[i]. Then check functions are used to filter a valid reduction pair ie {sum,sum}. --------- Co-authored-by: Michael Kruse <github@meinersbur.de>	2024-07-30 09:43:24 -07:00
Stephen Tozer	80f881485a	[LLVM] Add InsertPosition union-type to remove overloads of Instruction-creation (#94226 ) This patch simplifies instruction creation by replacing all overloads of instruction constructors/Create methods that are identical other than the Instruction InsertBefore/BasicBlock InsertAtEnd/BasicBlock::iterator InsertBefore argument with a single version that takes an InsertPosition argument. The InsertPosition class can be implicitly constructed from any of the above, internally converting them to the appropriate BasicBlock::iterator value which can then be used to insert the instruction (or to not insert it if an invalid iterator is passed). The upshot of this is that code will be deduplicated, and all callsites will switch to calling the new unified version without any changes needed to make the compiler happy. There is at least one exception to this; the construction of InsertPosition is a user-defined conversion, so any caller that was already relying on a different user-defined conversion won't work. In all of LLVM and Clang this happens exactly once: at clang/lib/CodeGen/CGExpr.cpp:123 we try to construct an alloca with an AssertingVH<Instruction> argument, which must now be cast to an Instruction* by using `&*`. If this is more common elsewhere, it could be fixed by adding an appropriate constructor to InsertPosition.	2024-06-20 10:27:55 +01:00
Karthika Devi C	d33864d5d8	[polly] Fix cppcheck SA comment reported in #91235 (#93505 ) This patch moves the unreachable assert before return statement. Fixes #91235.	2024-05-28 11:41:58 -07:00
Karthika Devi C	601d7eab06	[polly] Add polly-debug flag to print debug info from all parts of polly (#78549 ) This flag enables the user to print debug info from all the passes and helpers inside polly at once. This will also help a novice user work in polly without explicitly having to know which parts of polly have actually kicked in and pass them via -debug-only.	2024-03-26 12:02:27 -07:00
Brad Smith	12ed2c90a1	[llvm][NFC] A start at cleaning up zero byte files that should have been removed (#74404 )	2023-12-05 01:57:14 -05:00
Brad Smith	2fd66e6eb6	Revert "[lldb] A start at cleaning up zero byte files that should have been removed" This reverts commit 3223936dc512c9f4f87a230a4d2931e37186ca22. Committed by accident while mixed in with another commit.	2023-12-05 00:27:11 -05:00
Brad Smith	3223936dc5	[lldb] A start at cleaning up zero byte files that should have been removed	2023-12-04 23:12:54 -05:00
Fangrui Song	678e3ee123	[lldb] Fix duplicate word typos; NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 21:32:24 -07:00
Kazu Hirata	81e149aab9	Replace None with std::nullopt in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-05-12 18:33:26 -07:00
Michael Kruse	19afbfe331	[Polly] Remove Polly-ACC. Polly-ACC is unmaintained and since it has never been ported to the NPM pipeline, since D136621 it is not even accessible anymore without manually specifying the passes on the `opt` command line. Since there is no plan to put it to a maintainable state, remove it from Polly. Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D142580	2023-03-08 17:33:04 -06:00
Michael Kruse	42cd38c01e	[Polly] Remove -polly-vectorizer=polly. Polly's internal vectorizer is not well maintained and is known to not work in some cases such as region ScopStmts. Unlike LLVM's LoopVectorize pass it also does not have a target-dependent cost heuristics, and we recommend using LoopVectorize instead of -polly-vectorizer=polly. In the future we hope that Polly can collaborate better with LoopVectorize, like Polly marking a loop is safe to vectorize with a specific simd width, instead of replicating its functionality. Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D142640	2023-03-08 12:51:42 -06:00
Paul Walker	62d11b2cca	Revert "Revert "[SCEV] Add SCEVType to represent `vscale`."" Relanding after fixing Polly related build error. This reverts commit 7b26dcae9eaf8cdcba7fef032fd83d060dffd4b4.	2023-03-02 13:14:07 +00:00
Florian Hahn	934c82d318	[Polly] Remove CodegenCleanupPass. The pass uses a bunch of deprecated legacy passes and appears unused. Remove it to unblock removing legacy passes. Fixes https://github.com/llvm/llvm-project/issues/60852 Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D144332	2023-02-24 13:39:32 +01:00
Kazu Hirata	ccdc271a08	[polly] Use std::optional instead of llvm::Optional (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-02 19:18:46 -08:00
Nikita Popov	e95ca5bb05	[AST] Make AliasSetTracker work on BatchAA D138014 restricted AST to work on immutable IR. This means it is also safe to use a single BatchAA instance for the entire AST lifetime, instead of only batching parts of individual queries. The primary motivation for this is not compile-time, but rather having a central place to control cross-iteration AA, which will be used by D137958. Differential Revision: https://reviews.llvm.org/D137955	2022-12-05 08:12:26 +01:00
Michael Kruse	b4b7fa234c	[Polly] Ensure -polly-detect-keep-going still eventually rejects invalid regions. Fixes #58484	2022-10-20 13:35:09 -05:00
Gabriel Ravier	ea540bc210	[polly] Fixed a number of typos. NFC I went over the output of the following mess of a command: `(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z \| parallel --xargs -0 cat \| aspell list --mode=none --ignore-case \| grep -E '^[A-Za-z][a-z]*$' \| sort \| uniq -c \| sort -n \| grep -vE '.{25}' \| aspell pipe -W3 \| grep : \| cut -d' ' -f2 \| less)` and proceeded to spend a few days looking at it to find probable typos and fixed a few hundred of them in all of the llvm project (note, the ones I found are not anywhere near all of them, but it seems like a good start). Reviewed By: inclyc Differential Revision: https://reviews.llvm.org/D131167	2022-08-07 22:56:07 +08:00
Roman Gareev	b02c7e2b63	[Polly] Generalize the pattern matching to the case of tensor contractions The pattern matching optimization of Polly detects and optimizes dense general matrix-matrix multiplication. The generated code is close to high performance implementations of matrix-matrix multiplications, which are contained in manually tuned libraries. The described pattern matching optimization is a particular case of tensor contraction optimization, which was introduced in [1]. This patch generalizes the pattern matching to the case of tensor contractions using the form of data dependencies and memory accesses produced by tensor contractions [1]. Optimization of tensor contractions will be added in the next patch. Following the ideas introduced in [2], it will logically represent tensor contraction operands as matrix multiplication operands and use an approach for optimization of matrix-matrix multiplications. [1] - Gareev R., Grosser T., Kruse M. High-Performance Generalized Tensor Operations: A Compiler-Oriented Approach // ACM Transactions on Architecture and Code Optimization (TACO). 2018. Vol. 15, no. 3. P. 34:1–34:27. DOI: 10.1145/3235029. [2] - Matthews D. High-Performance Tensor Contraction without BLAS // SIAM Journal on Scientific Computing. 2018. Vol. 40, no. 1. P. C 1–C 24. DOI: 10.1137/16m108968x. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D114336	2022-08-07 13:10:32 +03:00
Michael Kruse	fe0e5b3e43	[Polly] Insert !dbg metadata for emitted CallInsts. The IR Verifier requires that every call instruction to an inlineable function (among other things, its implementation must be visible in the translation unit) must also have !dbg metadata attached to it. When parallelizing, Polly emits calls to OpenMP runtime function out of thin air, or at least not directly derived from a bounded list of previous instruction. While we could search for instructions in the SCoP that has some debug info attached to it, there is no guarantee that we find any. Our solution is to generate a new DILocation that points to line 0 to represent optimized code. The OpenMP function implementation is usually not available in the user's translation unit, but can become visible in an LTO build. For the bug to appear, libomp must also be built with debug symbols. IMHO, the IR verifier rule is too strict. Runtime functions can also be inserted by other optimization passes, such as LoopIdiomRecognize. When inserting a call to e.g. memset, it uses the DebugLoc from a StoreInst from the unoptimized code. It is not required to have !dbg metadata attached either. Fixes #56692	2022-07-26 19:43:53 -05:00
Kazu Hirata	3f3930a451	Remove redundant virtual specifiers (NFC) Identified with tidy-modernize-use-override.	2022-07-25 23:00:59 -07:00
Kazu Hirata	360c1111e3	Use llvm::is_contained (NFC)	2022-07-20 09:09:19 -07:00
Michael Kruse	6fa65f8a98	[Polly][MatMul] Abandon dependence analysis. The copy statements inserted by the matrix-multiplication optimization introduce new dependencies between the copy statements and other statements. As a result, the DependenceInfo must be recomputed. Not recomputing them caused IslAstInfo to deduce that some loops are parallel but cause race conditions when accessing the packed arrays. As a result, matrix-matrix multiplication currently cannot be parallelized. Also see discussion at https://reviews.llvm.org/D125202	2022-06-29 17:20:05 -05:00
Arthur Eubanks	c80b88ee29	[polly] #include <algorithm> For the usage of std::max in the header. Speculative fix for https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8810806780048763729/overview reported in https://reviews.llvm.org/D125263.	2022-06-21 13:27:55 -07:00
Guillaume Chatelet	6e930503f4	[NFC][polly] Removed dead code	2022-06-13 07:50:35 +00:00
Yang Keao	02f640672e	[Polly] Migrate -polly-mse to the new pass manager. This patch implements the `MaximalStaticExpansion` and its printer in NPM. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D125870	2022-06-01 13:37:58 -05:00
Michael Kruse	bd93df937a	[Polly] Mark classes as final by default. NFC. This makes it obvious that a class was not intended to be derived from. NPM analysis passes unfortunately cannot be marked as final because they are derived from a llvm::Checker<T> template internally by the NPM. Also normalize the use of classes/structs * NPM passes are structs * Legacy passes are classes * structs that have methods and are not a visitor pattern are classes * structs have public inheritance by default, remove "public" keyword * Use typedef'ed type instead of inline forward declaration	2022-05-17 12:05:39 -05:00
Simon Pilgrim	3cc2c7deed	[polly] Remove 'using namespace llvm/polly' from ScopGraphPrinter.h header. As mentioned on D123678 this appears to be causing namespace resolution issues on some versions of gcc.	2022-05-16 16:19:03 +01:00
Michael Kruse	6b3b87376b	[polly] migrate -polly-show to the new pass manager Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D123678	2022-05-09 14:04:29 -05:00
Michael Kruse	5c02808131	[polly] Introduce -polly-print-* passes to replace -analyze. The `opt -analyze` option only works with the legacy pass manager and might be removed in the future, as explained in llvm.org/PR53733. This patch introduced -polly-print-* passes that print what the pass would print with the `-analyze` option and replaces all uses of `-analyze` in the regression tests. There are two exceptions: `CodeGen\single_loop_param_less_equal.ll` and `CodeGen\loop_with_condition_nested.ll` use `-analyze` on the `-loops` pass which is not part of Polly. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D120782	2022-03-14 10:27:15 -05:00
Michael Kruse	ad84c6f657	[polly] Match function definitions and header declarations. NFC. Ensure that function definitions match their declarations in header files, even if they have no effect on linking. This includes 1. Both have the same __isl_* annotations 2. Both use the same type alias 3. Remove unused declarations that have no definition 4. Use explicit polly namespace qualifier for definitions; generally, the .cpp file should use at most an anon namespace region since only symbols declared in the header file can be accessed from other translation units anyway. For definitions that have been declared in the header file, the explicit namespace qualifier ensures that both match.	2022-02-16 12:52:17 -06:00
serge-sans-paille	e188aae406	Cleanup header dependencies in LLVMCore Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden header dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest change below: - llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer includes llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l before: 6400831 after: 6189948 200k lines less to process is not that bad ;-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118652	2022-02-02 06:54:20 +01:00
Roman Lebedev	82fb4f4b22	[SCEV] Sequential/in-order `UMin` expression As discussed in https://github.com/llvm/llvm-project/issues/53020 / https://reviews.llvm.org/D116692, SCEV is forbidden from reasoning about 'backedge taken count' if the branch condition is a poison-safe logical operation, which is conservatively correct, but is severely limiting. Instead, we should have a way to express those poison blocking properties in SCEV expressions. The proposed semantics is: ``` Sequential/in-order min/max SCEV expressions are non-commutative variants of commutative min/max SCEV expressions. If none of their operands are poison, then they are functionally equivalent, otherwise, if the operand that represents the saturation point* of given expression, comes before the first poison operand, then the whole expression is not poison, but is said saturation point. ``` * saturation point - the maximal/minimal possible integer value for the given type The lowering is straight-forward: ``` compare each operand to the saturation point, perform sequential in-order logical-or (poison-safe!) ordered reduction over those checks, and if reduction returned true then return saturation point else return the naive min/max reduction over the operands ``` https://alive2.llvm.org/ce/z/Q7jxvH (2 ops) https://alive2.llvm.org/ce/z/QCRrhk (3 ops) Note that we don't need to check the last operand: https://alive2.llvm.org/ce/z/abvHQS Note that this is not commutative: https://alive2.llvm.org/ce/z/FK9e97 That allows us to handle the patterns in question. Reviewed By: nikic, reames Differential Revision: https://reviews.llvm.org/D116766	2022-01-10 20:51:26 +03:00
Kazu Hirata	b12fd13812	Fix bugprone argument comments. Identified by bugprone-argument-comment.	2022-01-09 12:21:02 -08:00
Kazu Hirata	fb7cf90071	Use nullptr instead of 0 or NULL (NFC) Identified with modernize-use-nullptr.	2022-01-07 10:17:29 -08:00
Michael Kruse	937b00ab2c	[Polly][SchedOpt] Account for prevectorization of multiple statements. A prevectorized loop may contain multiple statements, in which case isl_schedule_node_band_sink will sink the vector band to multiple leaves. Instead of statically assuming a specific tree structure after sinking, add a SIMD marker to all inner bands. Fixes llvm.org/PR52637	2021-12-23 14:06:41 -06:00
Riccardo Mori	44596fe6a9	[Polly][Isl] Use the function unsignedFromIslSize to manage a isl::size object. NFCI This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in lib/External/isl/include/isl/isl-noexceptions.h and the official isl C++ interface. In the official interface the type `isl::size` cannot be cast to an unsigned without previously having checked if it contains a valid value with the function `isl::size::is_error()`. For this reason two helper functions have been added: - `IslAssert`: assert that no errors are present in debug builds and just disables the mandatory error check in non-debug builds - `unsignedFromIslSize`: cast the `isl::size` object to `unsigned` Changes made: - Add the functions `IslAssert` and `unsignedFromIslSize` - Add the utility function `rangeIslSize()` - Retype `MaxDisjunctsInDomain` from `int` to `unsigned` - Retype `RunTimeChecksMaxAccessDisjuncts` from `int` to `unsigned` - Retype `MaxDimensionsInAccessRange` from `int` to `unsigned` - Replaced some usages of `isl_size` to `unsigned` since we aim not to use `isl_size` anymore - `isl-noexceptions.h` has been generated by `e704f73c88` No functional change intended. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D113101	2021-11-05 11:15:22 +01:00
Michael Kruse	19db33c06e	[Polly] Remove support for code generated by gfortran+DragonEgg. DragonEgg is not maintained anymore, hence there is no need for this functionality. Fixes llvm.org/PR52173	2021-10-14 14:12:06 -05:00
Michael Kruse	ec2029f986	[Polly] Do not inline dumpIslObj methods. NFC. Instead of being inline and having a neverCalled() workaround to make it work in the debugger, define it as a regular exported function. Also add overloads for the C API types isl_* so it works with managed as well as unmanaged ISL objects.	2021-10-12 23:52:36 -05:00
Michael Kruse	64489255be	[Polly] Add greedy fusion algorithm. When the option -polly-loopfusion-greedy is set, the ScheduleOptimizer tries to aggressively fuse any band it can, as long as this does not violate any dependences. As part of the implementation, the functionality for copying a band into a new schedule was extracted out of the ScheduleTreeRewriter.	2021-10-08 20:33:30 -05:00
Michael Kruse	027c036663	[Polly] Reject regions entered by an indirectbr/callbr. SplitBlockPredecessors is unable to insert an additional BasicBlock between an indirectbr/callbr terminator and the successor blocks. This is needed by Polly to normalize the control flow before emitting its optimized code. This patch rejects regions entered by an indirectbr/callbr to not fail later at code generation. This fixes llvm.org/PR51964 Recommit with "REQUIRES: asserts" in test that uses statistics.	2021-09-27 18:49:11 -05:00
Haowei Wu	283ed7de32	Revert "[Polly] Reject reject regions entered by an indirectbr/callbr." This reverts commit 91f46bb77e6d56955c3b96e9e844ae6a251c41e9 which causes test failures when assertions are off.	2021-09-27 16:05:33 -07:00
Michael Kruse	91f46bb77e	[Polly] Reject reject regions entered by an indirectbr/callbr. SplitBlockPredecessors is unable to insert an additional BasicBlock between an indirectbr/callbr terminator and the successor blocks. This is needed by Polly to normalize the control flow before emitting its optimized code. This patch rejects regions entered by an indirectbr/callbr to not fail later at code generation. This fixes llvm.org/PR51964	2021-09-26 21:21:50 -05:00
Michael Kruse	d5c87162db	[Polly] Use VirtualUse to determine references. VirtualUse ensures consistency over different sources of values with Polly. In particular, this enables its use of instructions moved between statements. Before the patch, the code wrongly assumed that the BB's instructions are also the ScopStmt's instructions. References are determined for OpenMP outlining and GPGPU kernel extraction. GPGPU CodeGen had some problems. For one, it generated GPU kernel parameters for constants. Second, it emitted GPU-side invariant loads which have already been loaded by the host. This has been partially fixed: it still generates a store for the invariant load result, but using the value that the host has already written. WARNING: I did not test the generated PollyACC code on an actual GPU. The improved consistency will be made use of in the next patch.	2021-09-26 03:26:43 -05:00
Michael Kruse	1cea25eec9	[Polly] Remove isConstCall. The function was intended to catch OpenMP functions such as get_thread_id(). If matched, the call would be considered synthesizable. There were a few problems with this: * get_thread_id() is not 'const' in the sense of how the gcc manual defines it: "do not examine any values except their arguments". get_thread_id() reads OpenCL runtime library global state. What was intended was probably 'speculable'. * isConstCall was implemented using mayReadOrWriteMemory(). 'const' is stricter than that, mayReadOrWriteMemory is e.g. true for malloc(), since it may only read/write addresses that are considered inaccessible from the application. However, malloc is certainly not speculable. * Values that are isConstCall were not handled consistently throughout Polly. In particular, it was not considered for referenced values (OpenMP outlining and PollyACC). Fix by removing special handling for isConstCall entirely.	2021-09-26 03:26:43 -05:00
Michael Kruse	e470f9268a	[Polly] Implement user-directed loop distribution/fission. This is a simple version without the possibility to define distribute points or followup-transformations. However, it is the first transformation that has to check whether the transformation is correct. It interprets the same metadata as the LoopDistribute pass. Re-apply after revert in c7bcd72a38bcf99e03e4651ed5204d1a1f2bf695 with fix: Take isBand out of #ifndef NDEBUG since it now is used unconditionally.	2021-09-23 21:11:01 -05:00
Petr Hosek	c7bcd72a38	Revert "[Polly] Implement user-directed loop distribution/fission." This reverts commit 52c30adc7dfe6334b71adf256d81f70e7b976143 which breaks the build when NDEBUG is defined.	2021-09-23 14:04:25 -07:00
Michael Kruse	52c30adc7d	[Polly] Implement user-directed loop distribution/fission. This is a simple version without the possibility to define distribute points or followup-transformations. However, it is the first transformation that has to check whether the transformation is correct. It interprets the same metadata as the LoopDistribute pass.	2021-09-22 17:28:25 -05:00
Michael Kruse	cad9f98a2a	[Polly] Don't generate inter-iteration noalias metadata. This metadata was intended to mark all accesses within an iteration to be pairwise non-aliasing, in this case because every memory of a base pointer is touched (read or write) at most once. This is typical for 'sweeps' over all data. The stated motivation from D30606 is to ensure that unrolled iterations are considered non-aliasing. The implementation had multiple issues: * The structure of the noalias metadata was malformed. D110026 added a check in the verifier for this metadata, and the tests were failing since then. * This is not true for the outer loops of the BLIS matrix multiplication, where it was being inserted. Each element of A, B, C is accessed multiple times, as often as the loop not used as an index is iterating. * Scopes were added to SecondLevelOtherAliasScopeList (used for the !noalias scope list) on-the-fly when another SCEV was seen. This meant that previously visited instructions would not be updated with alias scopes that are only seen later, missing out those SCEVs they should not be aliasing with. * Since the !noalias scope list would ideally consist of all other SCEVs for this base pointer, we might quickly run into scalability issues. Especially after unrolling there would probably be at least one SCEV per instruction and unroll instance. * The inter-iteration noalias base pointer was not removed after leaving the loop marked with it, effectively marking everything after it to noalias as well. A solution I considered was to mark each instruction as non-aliasing with its own scope. The instruction itself would obviously alias itself, but such construction might also be considered invalid. Duplicating the instruction (e.g. due to speculation) would mark the instruction non-aliasing with its clone. I don't want to go into this territory, especially since the original motivation of determining unrolled instances as noalias based on SCEV is what scev-aa does as well. This effectively reverts D30606 and D35761.	2021-09-20 22:20:17 -05:00
Michael Kruse	c62d9a5ca0	[Polly] Use subtyped isl::schedule_nodes for ScheduleTreeVisitor. NFC. Change pass-by-const-ref to pass-by-value as objects are recreated due to custom up-/down-casting anyway.	2021-08-31 20:54:12 -05:00
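
The sequential `UMin` semantics described in commit 82fb4f4b22 can be illustrated with a small model. This is only a sketch under stated assumptions, not LLVM code: poison is modelled as `None`, operands are non-negative Python integers, and the names `POISON`, `umin` and `umin_seq` are hypothetical helpers introduced here for illustration.

```python
from typing import Optional, Sequence

# Modelling assumption: None stands in for an LLVM poison value.
POISON = None


def umin(operands: Sequence[Optional[int]]) -> Optional[int]:
    """Commutative unsigned min: poison if any operand is poison."""
    if any(op is POISON for op in operands):
        return POISON
    return min(operands)


def umin_seq(operands: Sequence[Optional[int]]) -> Optional[int]:
    """Sequential (in-order) unsigned min over a non-empty operand list.

    Operands are inspected left to right. If the saturation point (0, the
    minimal unsigned value) is seen before the first poison operand, the
    result is 0 and the remaining operands are ignored.
    """
    saturation_point = 0
    for op in operands:
        if op is POISON:
            return POISON  # poison reached before any saturation point
        if op == saturation_point:
            return saturation_point  # later operands cannot change the result
    return min(operands)  # no poison, no saturation: plain min


print(umin_seq([3, 0, POISON]))  # saturation point comes first
print(umin_seq([3, POISON, 0]))  # poison comes first
```

In this model the two calls differ only in operand order and give different results, which reflects the non-commutativity mentioned in the commit message.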


1207 Commits