Replace the DenseMap from blocks to their innermost loop a vector
indexed by block numbers, when possible. Supporting number updates is
not trivial as we don't store a list of basic blocks, so this is not
implemented.
NB: I'm generally not happy with the way loops are stored. As I think
that there's room for improvement, I don't want to touch the
representation at this point.
Pull Request: https://github.com/llvm/llvm-project/pull/103400
There's no point constructing a dominator tree or similar on
known-broken IR. Generally, functions should be able to assume that IR
is valid (i.e., passes the verifier). Users of this "feature" were:
- Verifier, fixed by verifying existence of terminators first.
- FuzzMutate, worked around by temporarily inserting terminators.
- OpenMP to run analyses while building the IR, worked around by
temporarily inserting terminators.
- Polly to work with an empty dominator tree, fixed by temporarily
adding an unreachable inst.
- MergeBlockIntoPredecessor, inadvertently, fixed by adding terminator
before updating MemorySSA.
- Some sloppily written unit tests.
`DT` is always the analysis for the to-be-optimized function while
`GenDT` is the analysis of the function that we currently generate code
for, which can also be an outlined function. Here, we want to check
dominance in the generated code, hence we must use `GenDT`.
#179433 already fixed the same issue for `BlockGenerator`. The same
pattern is used in `RegionGenerator` which is fixed here. A good
argument to avoid code duplication.
Fixes: #185313
Thanks to @jaschiu for the bug report and reproducer
Update isl to include
https://repo.or.cz/isl.git/commit/fc484e004200964f8f18249de1f510393ec924a9
which fixes#180000.
The isl update also fixes#34710 which had the same cause but with an
empty access domain (#180000 has an empty statement domain). Thus we
also revert 163cacb46960be4dd0d8562737bbf0ea97cb14ad which now only adds
unnecessary overhead.
A regression test has been added to isl which is why we do not add a
test in Polly.
Fixes: #180000
Thanks @skimo-openhub for the fix and @thapgua for the bugreport.
`DT` is always the analysis for the to-be-optimized function while
`GenDT` is the analysis of the function that we currently generate code
for which can also be an outlined function. Here, we want to check
dominance in the generated code, hence we must use `GenDT`.
Fixes: #179135
PR #125442 replaces the pass-based Polly architecture with a monolithic
pass consisting of phases. Reasons listed in
https://github.com/llvm/llvm-project/pull/125442.
With this change, the SCoP-passes became redundant problematic versions
of the same functionality and are removed.
Reapply of a22d1c2225543aa9ae7882f6b1a97ee7b2c95574. Using this PR for
pre-merge CI.
Instead of relying on any pass manager to schedule Polly's passes, add
Polly's own pipeline manager which is seen as a monolithic pass in
LLVM's pass manager. Polly's former passes are now phases of the new
PhaseManager component.
Relying on LLVM's pass manager (the legacy as well as the New Pass
Manager) to manage Polly's phases never was a good fit that the
PhaseManager resolves:
* Polly passes were modifying analysis results, in particular RegionInfo
and ScopInfo. This means that there was not just one unique and
"definite" analysis result, the actual result depended on which analyses
ran prior, and the pass manager was not allowed to throw away cached
analyses or prior SCoP optimizations would have been forgotten. The LLVM
pass manger's persistance of analysis results is not contractual but
designed for caching.
* Polly depends on a particular execution order of passes and regions
(e.g. regression tests, invalidation of consecutive SCoPs). LLVM's pass
manager does not guarantee any excecution order.
* Polly does not completely preserve DominatorTree, RegionInfo,
LoopInfo, or ScalarEvolution, but only as-needed for Polly's own uses.
Because the ScopDetection object stores references to those analyses, it
still had to lie to the pass manager that they would be preserved, or
the pass manager would have released and recomputed the invalidated
analysis objects that ScopDetection/ScopInfo was still referencing. To
ensure that no non-Polly pass would see these not-completely-preserved
analyses, all analyses still had to be thrown away after the
ScopPassManager, respectively with a BarrierNoopPass in case of the LPM.
* The NPM's PassInstrumentation wraps the IR unit into an `llvm::Any`
object, but implementations such as PrintIRInstrumentation call
llvm_unreachable on encountering an unknown IR unit, such as SCoPs, with
no extension points to add support. Hence LLVM crashes when dumping IR
between SCoP passes (such as `-print-before-changed` with Polly being
active).
The new PhaseManager uses some command line options that previously
belonged to Polly's legacy passes, such as `-polly-print-detect` (so the
option will continue to work). Hence the LPM support is incompatible
with the new approach and support for it is removed.
When Polly generates a false runtime condition (RTC), the associated
Polly generated loop is never executed and is eventually eliminated. As
a result, the fallback loop becomes the default execution path.
Disabling vectorization for this fallback loop will be
counterproductive. This patch ensures that vectorization is only
disabled when the RTC is not false (no Codegen failure).
Instead of relying on any pass manager to schedule Polly's passes, add
Polly's own pipeline manager which is seen as a monolithic pass in
LLVM's pass manager. Polly's former passes are now phases of the new
PhaseManager component.
Relying on LLVM's pass manager (the legacy as well as the New Pass
Manager) to manage Polly's phases never was a good fit that the
PhaseManager resolves:
* Polly passes were modifying analysis results, in particular RegionInfo
and ScopInfo. This means that there was not just one unique and
"definite" analysis result, the actual result depended on which analyses
ran prior, and the pass manager was not allowed to throw away cached
analyses or prior SCoP optimizations would have been forgotten. The LLVM
pass manger's persistance of analysis results is not contractual but
designed for caching.
* Polly depends on a particular execution order of passes and regions
(e.g. regression tests, invalidation of consecutive SCoPs). LLVM's pass
manager does not guarantee any excecution order.
* Polly does not completely preserve DominatorTree, RegionInfo,
LoopInfo, or ScalarEvolution, but only as-needed for Polly's own uses.
Because the ScopDetection object stores references to those analyses, it
still had to lie to the pass manager that they would be preserved, or
the pass manager would have released and recomputed the invalidated
analysis objects that ScopDetection/ScopInfo was still referencing. To
ensure that no non-Polly pass would see these not-completely-preserved
analyses, all analyses still had to be thrown away after the
ScopPassManager, respectively with a BarrierNoopPass in case of the LPM.
* The NPM's PassInstrumentation wraps the IR unit into an `llvm::Any`
object, but implementations such as PrintIRInstrumentation call
llvm_unreachable on encountering an unknown IR unit, such as SCoPs, with
no extension points to add support. Hence LLVM crashes when dumping IR
between SCoP passes (such as `-print-before-changed` with Polly being
active).
The new PhaseManager uses some command line options that previously
belonged to Polly's legacy passes, such as `-polly-print-detect` (so the
option will continue to work). Hence the LPM support is incompatible
with the new approach and support for it is removed.
Some of the changes in the patch include:
1. Using iterators instead of instruction pointers when applicable.
2. Modifying Polly functions to accept iterators instead of inst
pointers.
3. Updating API usages such as use begin instead of front.
As part of the effort to transition to using Debug Records instead of
Debug intrinsics, some API/argument changes are necessary to achieve the
desired behavior from Debug Records. This particular fix involves
passing iterators instead of instruction pointers to the SetInsertPoint
function. While this is crucial in certain areas, it may be more than
needed in others, but it does not cause any harm.
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.
The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
If the preload condition is a constant, ExprBuilder::create returns an
integer of the native integer while an i1 is expected. Cast the result
to i1 if that happens.
Fixes#123932
Patch created using the following command line:
```bash
codespell polly --skip="*.pdf,polly/lib/External/*" --write-changes \
--ignore-words-list=couter,createor,distribues,doble,identty,indention,indx,olt,ore,padd,sais,te,theses
```
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.
This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
This patch introduces the initial implementation for annotating loops
created by Polly. Polly generates RunTimeChecks (RTCs), which result in
loop versioning. Specifically, the loop created by Polly is executed
when the RTCs pass, otherwise, the original loop is executed.
This patch adds the "llvm.loop.vectorize.enable" metadata, setting it to
true for loops created by Polly. Disabling vectorization for the original
fallback loop is already merged in #119188.
This behavior is controlled by the 'polly-annotate-metadata-vectorize'
flag, and the annotations are applied only when this flag is enabled.
This flag is set to false by default.
NOTE: This commit is initial patch in effort to make polly interact with
Loop Vectorizer via metadata.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
The patch #102460 already implements separate DT/LI/SE for parallel sub
function. Crashes have been reported while region generator tries using
oringinal function's DT while creating new parallel sub function due to
checks in #101198. This patch aims at fixing those cases by switching
the DT/LI while generating parallel function using Region Generator.
Fixes#117877
The patch sets the vectorization metadata to false for Polly's fallback
loops. These are the loops executed when RTCs fail. This minimizes the
multiple loop versioning carried out by Polly and subsequently by the
Loop Vectorizer.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
DominatorTree, LoopInfo, and ScalarEvolution are function-level analyses
that expect to be called only on instructions and basic blocks of the
function they were original created for. When Polly outlined a parallel
loop body into a separate function, it reused the same analyses seemed
to work until new checks to be added in #101198.
This patch creates new analyses for the subfunctions. GenDT, GenLI, and
GenSE now refer to the analyses of the current region of code. Outside
of an outlined function, they refer to the same analysis as used for the
SCoP, but are substituted within an outlined function.
Additionally to the cross-function queries of DT/LI/SE, we must not
create SCEVs that refer to a mix of expressions for old and generated
values. Currently, SCEVs themselves do not "remember" which
ScalarEvolution analysis they were created for, but mixing them is just
as unexpected as using DT/LI across function boundaries. Hence
`SCEVLoopAddRecRewriter` was combined into `ScopExpander`.
`SCEVLoopAddRecRewriter` only replaced induction variables but left
SCEVUnknowns to reference the old function. `SCEVParameterRewriter`
would have done so but its job was effectively superseded by
`ScopExpander`, and now also `SCEVLoopAddRecRewriter`. Some issues
persist put marked with a FIXME in the code. Changing them would
possibly cause this patch to be not NFC anymore.
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
This flag enable the user to print debug Info from all the passes and
helpers inside polly at once. This will help a novice user as well to
work in polly without explicitly having to know which parts of polly has
actually kicked in and pass them via -debug-only.
These are the last remaining "trivial" changes to passes that use
Instruction pointers for insertion. All of this should be NFC, it's just
changing the spelling of how we identify a position.
In one or two locations, I'm also switching uses of getNextNode etc to
using std::next with iterators. This too should be NFC.
---------
Merged by: Stephen Tozer <stephen.tozer@sony.com>
It's becoming potentially unsafe to insert a PHI instruction using a plain
Instruction pointer. Switch all the remaining sites that create and insert
PHIs to use iterators instead. For example, the code in
ComplexDeinterleavingPass.cpp is definitely at-risk of mixing PHIs and
debug-info.
This removes `CreateMalloc` from `CallInst` and adds it to the `IRBuilderBase`
class.
We no longer needed the `Instruction *InsertBefore` and
`BasicBlock *InsertAtEnd` arguments of the `createMalloc` helper
function because we're using `IRBuilder` now. That's why I we also don't
need 4 `CreateMalloc` functions, but only two.
Differential Revision: https://reviews.llvm.org/D158861
The method is marked for deprecation. Delete the method and move all of
its consumers to use the DomTreeUpdater version.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D149428
Polly-ACC is unmaintained and since it has never been ported to the NPM pipeline, since D136621 it is not even accessible anymore without manually specifying the passes on the `opt` command line.
Since there is no plan to put it to a maintainable state, remove it from Polly.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D142580