This extends the work from 7755c26 to all of the different backend
taken count kinds that we print for the scev analysis printer. As
before, the goal is to cut down on confusion as i4 -1 is a very
different (unsigned) value from i32 -1.
When printing the result of SCEV's analysis, we can avoid printing
the predicated backedge taken count and the predicates if the predicates
are empty and no new information is provided. This helps to reduce the
verbosity of the output.
When printing the result of the analysis, i8 -1 and i64 -1 are quite
different in terms of analysis quality. In a recent conversion with
a new contributor, we ran into exactly this confusion.
Adding the type for constant scevs more globally seems worthwhile, but
introduces a much larger test diff. I'm splitting this off first since
it addresses the immediate need, and then going to do some further
changes to clarify a few related bits of analysis result output.
When a SCEVCallbackVH is RAUWed, we currently do a def-use walk and
remove dependent instructions from the ValueExprMap. However, unlike
SCEVs usual invalidation, this does not forget memoized values.
The end result is that we might end up removing a SCEVUnknown from the
map, while that expression still has users. Due to that, we may later
fail to invalide those expressions. In particular, invalidation of loop
dispositions only does something if there is an expression for the
value, which would not be the case here.
Fix this by using the standard forgetValue() API, instead of rolling a
custom variant.
Fixes https://github.com/llvm/llvm-project/issues/68285.
This adds code to the loop rotation transformation to ensure that the
computed block execution counts for the loop bodies are the same before
and after the transformation. This isn't always true in practice, but I
believe this is because of numeric inaccuracies in the BlockFrequency
computation.
The invariants this is modeled on and heuristic choice of 0-trip loop
amount is explained in a lenghty comment in the new
`updateBranchWeights()` function.
Differential Revision: https://reviews.llvm.org/D157462
Set phi inputs to poison whenever we find a dead edge (either
during initial worklist population or the main InstCombine run),
instead of only doing this for successors of dead blocks.
This means that the phi operand is set to poison even if for
critical edges without an intermediate block.
There are quite a few test changes, because the pattern is fairly
common in vectorizer output, for cases where we know the vectorized
loop will be entered.
Fixes a bug preventing moving the loop's metadata to an outer loop's header,
which happens if the loop's exit is also the header of an outer loop.
Adjusts test for above.
Fixes#55416.
Differential Revision: https://reviews.llvm.org/D125574
This is a patch that disables the poison-unsafe select -> and/or i1 folding.
It has been blocking D72396 and also has been the source of a few miscompilations
described in llvm.org/pr49688 .
D99674 conditionally blocked this folding and successfully fixed the latter one.
The former one was still blocked, and this patch addresses it.
Note that a few test functions that has `_logical` suffix are now deoptimized.
These are created by @nikic to check the impact of disabling this optimization
by copying existing original functions and replacing and/or with select.
I can see that most of these are poison-unsafe; they can be revived by introducing
freeze instruction. I left comments at fcmp + select optimizations (or-fcmp.ll, and-fcmp.ll)
because I think they are good targets for freeze fix.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101191
This is a small patch to make FoldBranchToCommonDest poison-safe by default.
After fc3f0c9c, only two syntactic changes are needed to fix unit tests.
This does not cause any assembly difference in testsuite as well (-O3, X86-64 Manjaro).
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D99452
2nd try (original: 27ae17a6b014) with fix/test for crash. We must make
sure that TTI is available before trying to use it because it is not
required (might be another bug).
Original commit message:
This is one step towards solving:
https://llvm.org/PR49336
In that example, we disregard the recommended usage of builtin_expect,
so an expensive (unpredictable) branch is folded into another branch
that is guarding it.
Here, we read the profile metadata to see if the 1st (predecessor)
condition is likely to cause execution to bypass the 2nd (successor)
condition before merging conditions by using logic ops.
Differential Revision: https://reviews.llvm.org/D98898
This commit copies existing tests at llvm/Transforms containing
'shufflevector X, undef' and replaces them with 'shufflevector X, poison'.
The new copied tests have *-inseltpoison.ll suffix at its file name
(as db7a2f347f132b3920415013d62d1adfb18d8d58 did)
See https://reviews.llvm.org/D93793
Test files listed using
grep -R -E "^[^;]*shufflevector <.*> .*, <.*> undef" | cut -d":" -f1 | uniq
Test files copied & updated using
file_org=llvm/test/Transforms/$1
if [[ "$file_org" = *-inseltpoison.ll ]]; then
file=$file_org
else
file=${file_org%.ll}-inseltpoison.ll
if [ ! -f $file ]; then
cp $file_org $file
fi
fi
sed -i -E 's/^([^;]*)shufflevector <(.*)> (.*), <(.*)> undef/\1shufflevector <\2> \3, <\4> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
echo "$file : should be manually updated"
# The test is manually updated
exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
... so just ensure that we pass DomTreeUpdater it into it.
Fixes DomTree preservation for a large number of tests,
all of which are marked as such so that they do not regress.
... so just ensure that we pass DomTreeUpdater it into it.
Fixes DomTree preservation for a large number of tests,
all of which are marked as such so that they do not regress.
Alignment attributes need to be dropped for non-pointer values.
This also introduces a check into the verifier to ensure you don't use
`align` on anything but a pointer. Test needed to be adjusted
accordingly.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D87304
Summary:
This patch resolves an issue where the metadata of a loop is not added to the
new loop latch, and not removed from the old loop latch. This issue occurs in
the SplitBlockPredecessors function, which adds a new block in a loop, and
in the case that the block passed into this function is the header of the loop,
the loop can be modified such that the latch of the loop is replaced.
This patch applies to the Loop Simplify pass since it ensures that each loop
has exit blocks which only have predecessors that are inside of the loop. In
the case that this is not true, the pass will create a new exit block for the
loop. This guarantees that the loop preheader/header will dominate the exit blocks.
Author: sidbav (Sidharth Baveja)
Reviewers: asbirlea (Alina Sbirlea), chandlerc (Chandler Carruth), Whitney (Whitney Tsang), bmahjour (Bardia Mahjour)
Reviewed By: asbirlea (Alina Sbirlea)
Subscribers: hiraditya (Aditya Kumar), llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D83869
Summary:
When a loop has multiple backedges, loop simplification attempts to
separate them out into nested loops. This results in incorrect control
flow in the presence of some functions like a GPU barrier. This change
skips the transformation when such "convergent" function calls are
present in the loop body.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D80078
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024
The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:
A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.
In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
I have set up a separate review D61933 for a fix which is required for this patch.
Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse
Reviewed By: hfinkel, jmorse
Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D60831
> llvm-svn: 363046
llvm-svn: 363786
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024
The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:
A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.
In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
I have set up a separate review D61933 for a fix which is required for this patch.
Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse
Reviewed By: hfinkel, jmorse
Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D60831
llvm-svn: 363046
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024
The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:
A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.
In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel
Reviewed By: hfinkel
Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D60831
llvm-svn: 360162
As it's causing some bot failures (and per request from kbarton).
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
llvm-svn: 358546
LoopUtils.cpp contains a utility that splits an loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block which is inserted on this path, is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.)
Patch by Chang Lin (clin1)
Differential Revision: https://reviews.llvm.org/D53876
llvm-svn: 346810
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is
!DILabel(scope: !1, name: "foo", file: !2, line: 3)
We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is
llvm.dbg.label(metadata !1)
It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.
We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.
Differential Revision: https://reviews.llvm.org/D45024
Patch by Hsiangkai Wang.
llvm-svn: 331841
This patch was temporarily reverted because it has exposed bug 37229 on
PowerPC platform. The bug is unrelated to the patch and was just a general
bug in the optimization done for PowerPC platform only. The bug was fixed
by the patch rL331410.
This patch returns the disabled commit since the bug was fixed.
llvm-svn: 331427
This reverts commit 023c8be90980e0180766196cba86f81608b35d38.
This patch triggers miscompile of zlib on PowerPC platform. Most likely it is
caused by some pre-backend PPC-specific pass, but we don't clearly know the
reason yet. So we temporally revert this patch with intention to return it
once the problem is resolved. See bug 37229 for details.
llvm-svn: 330893
In the function `simplifyOneLoop` we optimistically assume that changes in the
inner loop only affect this very loop and have no impact on its parents. In fact,
after rL329047 has been merged, we can now calculate exit counts for outer
loops which may depend on inner loops. Thus, we need to invalidate all parents
when we do something to a loop.
There is an evidence of incorrect behavior of `simplifyOneLoop`: when we insert
`SE->verify()` check in the end of this funciton, it fails on a bunch of existing
test, in particular:
LLVM :: Transforms/LoopUnroll/peel-loop-not-forced.ll
LLVM :: Transforms/LoopUnroll/peel-loop-pgo.ll
LLVM :: Transforms/LoopUnroll/peel-loop.ll
LLVM :: Transforms/LoopUnroll/peel-loop2.ll
Note that previously we have fixed issues of this variety, see rL328483.
This patch makes this function invalidate the outermost loop properly.
Differential Revision: https://reviews.llvm.org/D45937
Reviewed By: chandlerc
llvm-svn: 330576