The patch moves the no-wrap flags out of SCEV's scope so they can be
re-used for SCEVUse.
SCEVUse gets an additional getNoWrapFlags helper that returns the union
of the expression's SCEV flags and the use-specific flags.
SCEVExpander has been updated to use this new helper.
In order to avoid other changes, the original names are exposed via
constexpr in SCEV. Not sure if there's a nicer way. One alternative
would be to define the enum in struct, and have SCEV inherit from it.
The patch also clarifies that the SCEVUse flags encode NUW/NSW, and
hides getInt, setInt, etc. to avoid potential misuse.
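As a rough illustration of the flag union described above, the following
self-contained C++ sketch models it with hypothetical stand-in types
(`Expr` for SCEV, `ExprUse` for SCEVUse, and a `unionFlags` helper). None
of these names or bit values are taken from the patch itself; they are
assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical flag enum at namespace scope, so that both the expression
// type and the use type can share it. Bit values are illustrative only.
enum NoWrapFlags : unsigned {
  FlagAnyWrap = 0,
  FlagNUW = 1u << 0, // no unsigned wrap
  FlagNSW = 1u << 1, // no signed wrap
};

// Bitwise union of two flag sets.
constexpr NoWrapFlags unionFlags(NoWrapFlags A, NoWrapFlags B) {
  return static_cast<NoWrapFlags>(static_cast<unsigned>(A) |
                                  static_cast<unsigned>(B));
}

// Stand-in for the expression class. The original names are re-exposed
// via constexpr members so that existing `Expr::FlagNUW` spellings still
// compile after the enum has moved out of the class.
struct Expr {
  static constexpr NoWrapFlags FlagAnyWrap = ::FlagAnyWrap;
  static constexpr NoWrapFlags FlagNUW = ::FlagNUW;
  static constexpr NoWrapFlags FlagNSW = ::FlagNSW;

  NoWrapFlags Flags = ::FlagAnyWrap; // flags stored on the expression
};

// Stand-in for a use of an expression that carries extra, use-specific
// flags.
struct ExprUse {
  const Expr *E;
  NoWrapFlags UseFlags;

  // Union of the expression's own flags and the use-specific flags.
  NoWrapFlags getNoWrapFlags() const {
    return unionFlags(E->Flags, UseFlags);
  }
};
```

A client that only cares about the combined guarantees would call
`getNoWrapFlags()` on the use instead of reading the expression's flags
directly.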
PR: https://github.com/llvm/llvm-project/pull/190199
#184545 default-enables the IO sandbox in assert-builds. This causes
Clang using Polly to crash (#188568).
The issue is that `PassBuilder` uses `vfs::getRealFileSystem()` by
default, which is considered an IO sandbox violation in the Clang process.
This PR stores the VFS of the `PassBuilder` passed to the original
`registerPollyPasses` call and reuses it when creating other
`PassBuilder` instances.
This PR also adds infrastructure for running Polly in `clang` (in
addition to `opt`). `opt` does not enable the sandbox, so we need
separate tests using Clang.
Closes: #188568
`DT` is always the analysis for the to-be-optimized function while
`GenDT` is the analysis of the function that we currently generate code
for, which can also be an outlined function. Here, we want to check
dominance in the generated code, hence we must use `GenDT`.
#179433 already fixed the same issue for `BlockGenerator`. The same
pattern is used in `RegionGenerator` which is fixed here. A good
argument to avoid code duplication.
Fixes: #185313
Thanks to @jaschiu for the bug report and reproducer
Update isl to include
https://repo.or.cz/isl.git/commit/fc484e004200964f8f18249de1f510393ec924a9
which fixes #180000.
The isl update also fixes #34710 which had the same cause but with an
empty access domain (#180000 has an empty statement domain). Thus we
also revert 163cacb46960be4dd0d8562737bbf0ea97cb14ad which now only adds
unnecessary overhead.
A regression test has been added to isl which is why we do not add a
test in Polly.
Fixes: #180000
Thanks @skimo-openhub for the fix and @thapgua for the bug report.
`DT` is always the analysis for the to-be-optimized function while
`GenDT` is the analysis of the function that we currently generate code
for, which can also be an outlined function. Here, we want to check
dominance in the generated code, hence we must use `GenDT`.
Fixes: #179135
Fixes: #177527
Updated test cases:
* CodeGen/OpenMP/matmul-parallel.ll, ScheduleOptimizer/pattern-matching-based-opts.ll
Before the update, ISL bailed out of the dependency computation after
hitting the max operation limit. The commit
https://repo.or.cz/isl.git/commit/4bdfe2567715c5d1a8287c07d8685eb3db281e32
seems to have reduced the complexity of the dependency
computation, so that some loops are now recognized as parallel.
The tests were checking that the outer loop is not parallel, but some
inner loops can be parallelized, particularly the array packing loops.
* DeLICM/reduction_looprotate_hoisted.ll
changes in how isl generates expressions
* ScheduleOptimizer/pattern-matching-based-opts_5.ll
changes in how isl generates expressions, and AST node changes
PR #125442 replaces the pass-based Polly architecture with a monolithic
pass consisting of phases. Reasons listed in
https://github.com/llvm/llvm-project/pull/125442.
With this change, the SCoP-passes became redundant, problematic versions
of the same functionality and are removed.
Reapply of a22d1c2225543aa9ae7882f6b1a97ee7b2c95574. Using this PR for
pre-merge CI.
Instead of relying on any pass manager to schedule Polly's passes, add
Polly's own pipeline manager which is seen as a monolithic pass in
LLVM's pass manager. Polly's former passes are now phases of the new
PhaseManager component.
Relying on LLVM's pass manager (the legacy as well as the New Pass
Manager) to manage Polly's phases was never a good fit. The
PhaseManager resolves the following problems:
* Polly passes were modifying analysis results, in particular RegionInfo
and ScopInfo. This means that there was not just one unique and
"definite" analysis result, the actual result depended on which analyses
ran prior, and the pass manager was not allowed to throw away cached
analyses or prior SCoP optimizations would have been forgotten. The LLVM
pass manager's persistence of analysis results is not contractual but
designed for caching.
* Polly depends on a particular execution order of passes and regions
(e.g. regression tests, invalidation of consecutive SCoPs). LLVM's pass
manager does not guarantee any execution order.
* Polly does not completely preserve DominatorTree, RegionInfo,
LoopInfo, or ScalarEvolution, but only as-needed for Polly's own uses.
Because the ScopDetection object stores references to those analyses, it
still had to lie to the pass manager that they would be preserved, or
the pass manager would have released and recomputed the invalidated
analysis objects that ScopDetection/ScopInfo was still referencing. To
ensure that no non-Polly pass would see these not-completely-preserved
analyses, all analyses still had to be thrown away after the
ScopPassManager, respectively with a BarrierNoopPass in case of the LPM.
* The NPM's PassInstrumentation wraps the IR unit into an `llvm::Any`
object, but implementations such as PrintIRInstrumentation call
llvm_unreachable on encountering an unknown IR unit, such as SCoPs, with
no extension points to add support. Hence LLVM crashes when dumping IR
between SCoP passes (such as `-print-before-changed` with Polly being
active).
The new PhaseManager uses some command line options that previously
belonged to Polly's legacy passes, such as `-polly-print-detect` (so the
option will continue to work). Hence the LPM support is incompatible
with the new approach and support for it is removed.
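To illustrate the idea of a monolithic pass that runs its phases in a
fixed order over state it owns itself, here is a minimal C++ sketch.
`ScopState`, `Phase`, and `runPhases` are hypothetical names chosen for
this example and are not Polly's actual PhaseManager API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical state shared by all phases. Because the pipeline owns it,
// no outside manager can drop or recompute it between two phases.
struct ScopState {
  std::vector<std::string> Log; // names of the phases that ran, in order
  int DetectedScops = 0;        // written by an early phase
  int Generated = 0;            // written by a later phase
};

// A phase is a named step that reads and updates the shared state.
struct Phase {
  std::string Name;
  std::function<void(ScopState &)> Run;
};

// Runs every phase exactly once, in the order given. From the outside
// this looks like a single pass.
inline ScopState runPhases(const std::vector<Phase> &Phases) {
  ScopState S;
  for (const Phase &P : Phases) {
    S.Log.push_back(P.Name);
    P.Run(S);
  }
  return S;
}
```

Since the order is fixed by the caller of `runPhases`, a later phase can
rely on results written by an earlier one without asking a pass manager
to preserve them.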
When Polly generates a false runtime condition (RTC), the associated
Polly-generated loop is never executed and is eventually eliminated. As
a result, the fallback loop becomes the default execution path.
Disabling vectorization for this fallback loop would be
counterproductive. This patch ensures that vectorization is only
disabled when the RTC is not false (no Codegen failure).
Instead of relying on any pass manager to schedule Polly's passes, add
Polly's own pipeline manager which is seen as a monolithic pass in
LLVM's pass manager. Polly's former passes are now phases of the new
PhaseManager component.
Relying on LLVM's pass manager (the legacy as well as the New Pass
Manager) to manage Polly's phases was never a good fit. The
PhaseManager resolves the following problems:
* Polly passes were modifying analysis results, in particular RegionInfo
and ScopInfo. This means that there was not just one unique and
"definite" analysis result, the actual result depended on which analyses
ran prior, and the pass manager was not allowed to throw away cached
analyses or prior SCoP optimizations would have been forgotten. The LLVM
pass manager's persistence of analysis results is not contractual but
designed for caching.
* Polly depends on a particular execution order of passes and regions
(e.g. regression tests, invalidation of consecutive SCoPs). LLVM's pass
manager does not guarantee any execution order.
* Polly does not completely preserve DominatorTree, RegionInfo,
LoopInfo, or ScalarEvolution, but only as-needed for Polly's own uses.
Because the ScopDetection object stores references to those analyses, it
still had to lie to the pass manager that they would be preserved, or
the pass manager would have released and recomputed the invalidated
analysis objects that ScopDetection/ScopInfo was still referencing. To
ensure that no non-Polly pass would see these not-completely-preserved
analyses, all analyses still had to be thrown away after the
ScopPassManager, respectively with a BarrierNoopPass in case of the LPM.
* The NPM's PassInstrumentation wraps the IR unit into an `llvm::Any`
object, but implementations such as PrintIRInstrumentation call
llvm_unreachable on encountering an unknown IR unit, such as SCoPs, with
no extension points to add support. Hence LLVM crashes when dumping IR
between SCoP passes (such as `-print-before-changed` with Polly being
active).
The new PhaseManager uses some command line options that previously
belonged to Polly's legacy passes, such as `-polly-print-detect` (so the
option will continue to work). Hence the LPM support is incompatible
with the new approach and support for it is removed.
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.
However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
just the mere fact that this is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.
* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776).
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.
I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.
This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders various code handling
lifetime markers on non-allocas dead. I plan to clean up that kind of
code after dropping the size argument as well.)
In practice, I've only found a few places that currently produce
lifetimes on non-allocas:
* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* Similar for AddressSanitizer, it also moves allocas into separate
memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
Patch created using the following command line:
```bash
codespell polly --skip="*.pdf,polly/lib/External/*" --write-changes \
--ignore-words-list=couter,createor,distribues,doble,identty,indention,indx,olt,ore,padd,sais,te,theses
```
This patch introduces the initial implementation for annotating loops
created by Polly. Polly generates RunTimeChecks (RTCs), which result in
loop versioning. Specifically, the loop created by Polly is executed
when the RTCs pass; otherwise, the original loop is executed.
This patch adds the "llvm.loop.vectorize.enable" metadata, setting it to
true for loops created by Polly. Disabling vectorization for the original
fallback loop is already merged in #119188.
This behavior is controlled by the 'polly-annotate-metadata-vectorize'
flag, and the annotations are applied only when this flag is enabled.
This flag is set to false by default.
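The loop versioning described above can be sketched at the source level
as follows. This is only an illustration under stated assumptions: the
runtime check shown here (a pointer-overlap test named `rangesDisjoint`)
and the function `addOne` are hypothetical, the sketch assumes a flat
address space in which pointers can be compared as integers, and Polly
performs the equivalent transformation on LLVM IR rather than on C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical runtime check: true if [A, A+N) and [B, B+N) do not
// overlap. Assumes pointers convert to comparable integer addresses.
inline bool rangesDisjoint(const int *A, const int *B, std::size_t N) {
  auto AddrA = reinterpret_cast<std::uintptr_t>(A);
  auto AddrB = reinterpret_cast<std::uintptr_t>(B);
  std::uintptr_t Bytes = N * sizeof(int);
  return AddrA + Bytes <= AddrB || AddrB + Bytes <= AddrA;
}

// Two versions of the same loop, selected by the runtime check.
// Returns true if the optimized version was executed.
inline bool addOne(int *Dst, const int *Src, std::size_t N) {
  if (rangesDisjoint(Dst, Src, N)) {
    // Optimized version: iterations are independent here, so this is the
    // loop that would be annotated as vectorizable.
    for (std::size_t I = 0; I < N; ++I)
      Dst[I] = Src[I] + 1;
    return true;
  }
  // Fallback version: the original loop, executed when the check fails.
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I] + 1;
  return false;
}
```

Both branches contain the same loop body; the difference lies only in
what the compiler is allowed to assume about each copy, which is what
the metadata communicates to the Loop Vectorizer.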
NOTE: This commit is the initial patch in an effort to make Polly
interact with the Loop Vectorizer via metadata.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
The patch #102460 already implements separate DT/LI/SE for the parallel
subfunction. Crashes have been reported when the region generator tries to
use the original function's DT while creating a new parallel subfunction,
due to the checks in #101198. This patch aims at fixing those cases by
switching the DT/LI while generating the parallel function using the
region generator.
Fixes #117877
The patch sets the vectorization metadata to false for Polly's fallback
loops. These are the loops executed when RTCs fail. This reduces
repeated loop versioning, first by Polly and subsequently by the
Loop Vectorizer.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.
Regression tests are updated using update scripts where possible, and by
find + replace where not.
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records.
If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
Even as the NPM has been in use by Polly for a while now, the majority
of the tests continue using the LPM passes. This patch ports the tests
to use the NPM passes (for example, by replacing a flag such as
-polly-detect with -passes=polly-detect following the NPM syntax for
specifying passes) with some exceptions for some missing features in the
new passes.
Relanding #90632.
Even as the NPM has been in use by Polly for a while now, the
majority of the tests continue using the LPM passes. This patch
ports the tests to use the NPM passes (for example, by replacing
a flag such as -polly-detect with -passes=polly-detect following
the NPM syntax for specifying passes) with some exceptions for
some missing features in the new passes. Additionally, the lit
substitution %loadPolly is replaced by the substitution of what
was %loadNPMPolly and %loadNPMPolly is removed.
zext nneg was recently added to the IR in #67982. Teaching SCEVExpander
to emit nneg when possible is valuable since SCEV may have proved
non-trivial facts about loop bounds which would otherwise be lost when
materializing the value.
Before this patch, we could only use the MaxBECount for an AddRec's range
computation if the bit width of the MaxBECount was <= that of the AddRec.
This patch reasons that if a MaxBECount has a larger bit width, but its
value is <= the max value of the AddRec's bit width, we can still use the
MaxBECount.
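The condition can be written down as a small predicate. The sketch below
is a simplified model on plain 64-bit integers, not the ScalarEvolution
code: the function names are hypothetical, the MaxBECount is treated as
a known constant, and "max value" is interpreted as the unsigned maximum
of the AddRec's bit width, which is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Largest unsigned value representable in `Bits` bits (1 <= Bits <= 64).
inline std::uint64_t maxUnsigned(unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64);
  return Bits == 64 ? UINT64_MAX : ((std::uint64_t{1} << Bits) - 1);
}

// Hypothetical predicate: may a backedge-taken count with value
// `MaxBECount` and type width `BECountBits` be used for an AddRec of
// width `AddRecBits`?
inline bool canUseMaxBECount(std::uint64_t MaxBECount, unsigned BECountBits,
                             unsigned AddRecBits) {
  // Rule before the patch: the count's type is not wider than the AddRec.
  if (BECountBits <= AddRecBits)
    return true;
  // Added rule: the type is wider, but the value still fits.
  return MaxBECount <= maxUnsigned(AddRecBits);
}
```

For example, a 64-bit count of 255 is usable with an 8-bit AddRec under
this model, whereas a count of 256 is not.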
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D151698
Polly's internal vectorizer is not well maintained and is known to not work in some cases such as region ScopStmts. Unlike LLVM's LoopVectorize pass, it also does not have target-dependent cost heuristics, and we recommend using LoopVectorize instead of -polly-vectorizer=polly.
In the future we hope that Polly can collaborate better with LoopVectorize, for example by marking a loop as safe to vectorize with a specific SIMD width, instead of replicating its functionality.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D142640
IR is now always parsed in opaque pointer mode, unless
-opaque-pointers=0 is explicitly given. There is no automatic
detection of typed pointers anymore.
The -opaque-pointers=0 option is added to any remaining IR tests
that haven't been migrated yet.
Differential Revision: https://reviews.llvm.org/D141912
Instcombine prefers this canonical form (see getPreferredVectorIndex),
as does IRBuilder when passing the index as an integer, so we may as
well use the preferred form from creation.
NOTE: All test changes are mechanical with nothing else expected
beyond a change of index type from i32 to i64.
Differential Revision: https://reviews.llvm.org/D140983
I went over the output of the following mess of a command:
`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)`
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Reviewed By: inclyc
Differential Revision: https://reviews.llvm.org/D131167
The IR Verifier requires that every call instruction to an inlineable
function (among other things, its implementation must be visible in the
translation unit) must also have !dbg metadata attached to it. When
parallelizing, Polly emits calls to OpenMP runtime functions out of thin
air, or at least not directly derived from a bounded list of previous
instructions. While we could search for instructions in the SCoP that have
some debug info attached to them, there is no guarantee that we find any.
Our solution is to generate a new DILocation that points to line 0 to
represent optimized code.
The OpenMP function implementation is usually not available in the
user's translation unit, but can become visible in an LTO build. For
the bug to appear, libomp must also be built with debug symbols.
IMHO, the IR verifier rule is too strict. Runtime functions can
also be inserted by other optimization passes, such as
LoopIdiomRecognize. When inserting a call to e.g. memset, it uses the
DebugLoc from a StoreInst from the unoptimized code. It is not
required to have !dbg metadata attached either.
Fixes #56692
The copy statements inserted by the matrix-multiplication optimization
introduce new dependencies between the copy statements and other
statements. As a result, the DependenceInfo must be recomputed.
Not recomputing it caused IslAstInfo to deduce that some loops are
parallel even though they cause race conditions when accessing the
packed arrays.
As a result, matrix-matrix multiplication currently cannot be
parallelized.
Also see discussion at https://reviews.llvm.org/D125202
This enables opaque pointers by default in LLVM. The effect of this
is twofold:
* If IR that contains *neither* explicit ptr nor %T* types is passed
to tools, we will now use opaque pointer mode, unless
-opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
It is possible to opt-out by calling setOpaquePointers(false) on
LLVMContext.
A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.
Differential Revision: https://reviews.llvm.org/D126689
As mentioned in D120782, the loop block order can be different depending
on if LoopInfo is incrementally updated or freshly computed.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D122195
The `opt -analyze` option only works with the legacy pass manager and might be removed in the future, as explained in llvm.org/PR53733. This patch introduces -polly-print-* passes that print what the pass would print with the `-analyze` option and replaces all uses of `-analyze` in the regression tests.
There are two exceptions: `CodeGen\single_loop_param_less_equal.ll` and `CodeGen\loop_with_condition_nested.ll` use `-analyze` on the `-loops` pass, which is not part of Polly.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D120782