We have now seen two miscompiles because of widening widenable
conditions at incorrect IR points and thereby changing a branch's loop
invariant condition to a loop-varying one (see PR60234 and PR61963).
This patch adds asserts in common guard utilities that we use for
widening to proactively catch these bugs in future.
Note that these asserts will not fire if we were to sink a widenable
condition from out of a loop into a loop (that's also incorrect for the
same reason as above).
Tested this without the fix for PR60234 (guard widening miscompile) and
confirmed the assert fires.
WARNING: Sometimes, the assert can fire if we failed to hoist the
invariant condition out of the loop. This is a pass-ordering issue or a
limitation in LICM, which would need an investigation. See details in
review.
Differential Revision: https://reviews.llvm.org/D147752
Re-write the code to avoid iteration over users of
constants and global values.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D147450
When we replace poison with freeze poison it might appear
that user of poison is a constant (for example vector constant).
In this case we will get that constant will get non-constant operand.
Moreover replacing poison and GlobalValue everywhere in module seems
to be overkill. So the solution will be just make a replacement
only in instructions we visited (contributing to hoisted condition).
Moreover if user of posion is constant, this constant also should need
a freeze and it does not make sense to replace poison with frozen version,
just freeze another constant.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D147429
Non-determenism is fixed.
Guard widening optimization is able to move the condition from one
guard to the previous one. As a result if the condition is poison
and orginal second guard is never executed but the first one does,
we introduce undefined behavior which was not observed in original
program.
To resolve the issue we must freeze the condition we are moving.
However optimization itself does not know how to work with freeze.
Additionally optimization is written in incremental way.
For example we have three guards
G1(base + 8 < L)
G2(base + 16 < L)
G3(base + 24 < L)
On the first step GW will combine G1 and G2 as
G1(base + 8 < L && freeze(base + 16 < L))
G2(true)
G3(base + 24 < L)
while combining G1 and G3 base appears to be different.
To keep optimization enabled after freezing the moving condition, the
freeze instruction is pushed as much as possible and later all uses
of freezed values are replaced with frozen version.
This is similar what instruction combining does but more aggressevely.
This reverts commit f4b2360cecd4c92e85bccb1443f2ef425fc6a77b.
The patch has no specific order in adding freeze instruction in the
entry basic block. It causes failure of CHECK like unit tests.
Guard widening optimization is able to move the condition from one
guard to the previous one. As a result if the condition is poison
and orginal second guard is never executed but the first one does,
we introduce undefined behavior which was not observed in original
program.
To resolve the issue we must freeze the condition we are moving.
However optimization itself does not know how to work with freeze.
Additionally optimization is written in incremental way.
For example we have three guards
G1(base + 8 < L)
G2(base + 16 < L)
G3(base + 24 < L)
On the first step GW will combine G1 and G2 as
G1(base + 8 < L && freeze(base + 16 < L))
G2(true)
G3(base + 24 < L)
while combining G1 and G3 base appears to be different.
To keep optimization enabled after freezing the moving condition, the
freeze instruction is pushed as much as possible and later all uses
of freezed values are replaced with frozen version.
This is similar what instruction combining does but more aggressevely.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D146699
The initial version was reverted because it looped infinitely if the likely successor
isn't properly dominated by the predecessor. In practice it means that we went up the
CFG through backedge and looped infinitely.
I also added some paranoid assertion checks to make sure that every other invariant
holds. I also found a hypothetical situation when we may go past the dominated block
while following the likely successors (it means that in fact the dominated block is
dynamically not reachable from dominating block) and explicitly prohibited this, though
I don't have a motivating test showing that it's a real problem.
https://reviews.llvm.org/D146276
This reverts commit 8d2885c2ef98b81927c1f816691ec4e77cfc7f3f.
I accidentally introduced an infinite loop in this patch, will return when this is fixed.
There is a piece of logic in GuardWidening which is very limited, and it happens
to ignore implicit control flow, therefore it "works fine" with guards expressed as
intrinsic calls. However, when the guards are represented as branches, its limitations
create a lot of trouble.
The intent here is to make sure that we do not widen code across complex CFG,
so that it can end up being in hotter code than it used to be. The old logic was limited
to unique immediate successor and that's it.
This patch changes the logic there to work the following way: when we need to check
if we can widen from `DominatedBlock` into `DominatingBlock`, we first try to find the
lowest (by CFG) transitive successor of `DominatingBlock` which is guaranteed to not
be significantly colder than the `DominatingBlock`. It means that every time we move
to either:
- Unique successor of the current block, if it only has one successor;
- The only taken successor, if the current block ends with `br(const)`;
- If one of successors ends with deopt, another one is taken;
If the lowest block we can find this way is the `DominatedBlock`, then it is safe to assume
that this widening won't move the code into a hotter location.
I did not touch the existing check with PDT. It looks fishy to me (post-dominance doesn't
really guarantee anything about hotness), but let's keep it as is and maybe fix later.
With this patch, Guard Widening can widen explicitly expressed branches across more than one
dominating guard if it's profitable.
Differential Revision: https://reviews.llvm.org/D146276
Reviewed By: skatkov
This fixes a possible issue when we could hoist an instruction up to a widenable
condition intrinsic call without making sure its operands are available at
hoisiting point.
Recently insertion point finding algorithm changed a bit, so this availability
check became necessary.
Verifier would crash after we handled the following special case:
L >u C0 && L >u C1 -> L >u max(C0, C1),
Previously we would insert the new condition right before the widenable condition
branch where all L operands were available.
Now we may choose the widenable condition intrinsic call as insertion point and it may
happen so that the L operands are computed after the call, so we have to make sure that
L operands are available at the point we want to insert it.
Differential Revision: https://reviews.llvm.org/D144944
There was a crash because there was inconsistency between 'isAvailableAt'
and 'makeAvailableAt' queries. 'makeAvailableAt' is called on conditions
of both guards (dominating and dominated) and 'isAvailableAt' is called
only for dominated guard's condition. Before this patch, it didn't matter
because insertion point always matched the dominating guard. Now, because
they are different, this inconsistency leads to incorrect transforms which
are caught by assert.
The fix is to check 'isAvailableAt' for both conditions.
Differential Revision: https://reviews.llvm.org/D142693
When guards are represented as widenable branches, there is a tricky
situation when the branch stays in loop but widenable condition doesn't.
It means that the widenable condition is loop-invariant, and some other
optimizations could have done changes using this fact.
If widening is allowed to create widened condition inside this loop,
and join the loop-invariant wc with some non-invariant facts, it can
cause miscompile. See example of this at https://github.com/llvm/llvm-project/issues/60234.
The solution is to adjust the point of generationg the wide condition,
and therefore of hoisting all related parts there. It should not be before
the branch, but before the widenable_condition call. The fact that `wc()`
and the wide condition are in the same block guarantees that they will
not violate the invariance property for any loop.
Differential Revision: https://reviews.llvm.org/D142693
Reviewed By: apilipenko
c2eccc6 introduced a call to etHasNoUnsignedWrap which implicitly assumes that Inst is a OverflowingBinaryOperator. This is frequently untrue, but was not caught because cast<Ty>(X) has been broken, see https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033 for context.
I considered reverting this, but since doing so re-introduces a nasty miscompile of its own, I decided to fix forward instead.
I'll note that this is a particularly nasty form of the cast<Ty>(X) issue. Because the cast was succeeding unexpected, we were writing data to instructions which weren't OBOs. This could result in near arbitrary data or memory corruption. I'm a bit shocked that the sanitizers didn't find this TBH.
When we hoist instructions over guard we must clear flags due to these flags
might be implied using this guard, so they make sense only after the guard.
As an example of the bug due to current behavior.
L is known to be in range say [0, 100)
c1 = x u< L
guard (c1)
x1 = add x, 1
c2 = x1 u< L
guard(c2)
basing on guard(c1) we can say that x1 = add nuw nsw x, 1
after guard widening we get
c1 = x u< L
x1 = add nuw nsw x, 1
c2 = x1 u< L
c = and c1, c2
guard(c)
now, basing on fact that x + 1 < L and x >= 0 due to x + 1 is nuw
we can prove that x + 1 u< L implies that x u< L, so we can just remove c1
x1 = add nuw nsw x, 1
c2 = x1 u< L
guard(c2)
But that is not correct due to we will pass x == -1 value.
Reviewed By: mkazantsev
Subscribers: llvm-commits, nikic
Differential Revision: https://reviews.llvm.org/D126354
For some optimizations on comparisons it's necessary that the
union/intersect is exact and not a superset. Add methods that
return Optional<ConstantRange> only if the result is exact.
For the sake of simplicity this is implemented by comparing
the subset and superset approximations for now, but it should be
possible to do this more directly, as unionWith() and intersectWith()
already distinguish the cases where the result is imprecise for the
preferred range type functionality.
As reported on https://bugs.llvm.org/show_bug.cgi?id=51020, the
guard widening pass doesn't preserve MemorySSA, so it can no
longer be scheduled in the same loop pass manager as LICM. However,
the loop-schedule.ll test indicates that this is supposed to work.
Fix this by preserving MemorySSA if available, as this seems to be
trivial in this case (we only need to drop the memory access for
the removed guards).
Differential Revision: https://reviews.llvm.org/D108386
As a reminder, a "widenable branch" is the pattern "br i1 (and i1 X, WC()), label %taken, label %untaken" where "WC" is the widenable condition intrinsics. The semantics of such a branch (derived from the semantics of WC) is that a new condition can be added into the condition arbitrarily without violating legality.
Broaden the definition in two ways:
Allow swapped operands to the br (and X, WC()) form
Allow widenable branch w/trivial condition (i.e. true) which takes form of br i1 WC()
The former is just general robustness (e.g. for X = non-instruction this is what instcombine produces). The later is specifically important as partial unswitching of a widenable range check produces exactly this form above the loop.
Differential Revision: https://reviews.llvm.org/D70502
This code has never been enabled. While it is tested, it's complicating some refactoring. If we decide to re-implement this, doing it in SimplifyCFG would probably make more sense anyways.
With the widenable condition construct, we have the ability to reason about branches which can be 'widened' (i.e. made to fail more often). We've got a couple o transforms which leverage this. This patch just cleans up the API a bit.
This is prep work for generalizing our definition of a widenable branch slightly. At the moment "br i1 (and A, wc()), ..." is considered widenable, but oddly, neither "br i1 (and wc(), B), ..." or "br i1 wc(), ..." is. That clearly needs addressed, so first, let's centralize the code in one place.
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
We had a subtle, but nasty bug in our definition of a widenable branch, and thus in the transforms which used that utility. Specifically, we returned true for any branch which included a widenable condition within it's condition, regardless of whether that widenable condition also had other uses.
The problem is that the result of the WC() call is defined to be one particular value. As such, all users must agree as to what that value is. If we widen a branch without also updating *all other users* of the WC in the same way, we have broken the required semantics.
Most of the textual diff is updating existing transforms not to leave dead uses hanging around. They're largely NFC as the dead instructions would be immediately deleted by other passes. The reason to make these changes is so that the transforms preserve the widenable branch form.
In practice, we don't get bitten by this only because it isn't profitable to CSE WC() calls and the lowering pass from guards uses distinct WC calls per branch.
Differential Revision: https://reviews.llvm.org/D69916
This patch adds support of guards expressed in explicit form via
`widenable_condition` in Guard Widening pass.
Differential Revision: https://reviews.llvm.org/D56075
Reviewed By: reames
llvm-svn: 353932
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
rL340921 has been reverted by rL340923 due to linkage dependency
from Transform/Utils to Analysis which is not allowed. In this patch
this has been fixed, a new utility function moved to Analysis.
Differential Revision: https://reviews.llvm.org/D51152
llvm-svn: 341014
We have multiple places in code where we try to identify whether or not
some instruction is a guard. This patch factors out this logic into a separate
utility function which works uniformly in all places.
Differential Revision: https://reviews.llvm.org/D51152
Reviewed By: fedor.sergeev
llvm-svn: 340921
Guard widening should not spend efforts on dealing with guards with trivial true/false conditions.
Such guards can easily be eliminated by any further cleanup pass like instcombine. However we
should not unconditionally delete them because it may be profitable to widen other conditions
into such guards.
Differential Revision: https://reviews.llvm.org/D50247
Reviewed By: fedor.sergeev
llvm-svn: 340381
This is a second part of D49974 that handles widening of conditional branches that
have very likely `false` branch.
Differential Revision: https://reviews.llvm.org/D50040
Reviewed By: reames
llvm-svn: 339537