Conservatively return unknown in this degenerate case. This is
hard to hit in practice, because such phis are usually optimized
away before they reach a getObjectSize() call.
Fixes https://github.com/llvm/llvm-project/issues/63013.
I removed the conflict check from computeKnownBitsFromShiftOperator()
in D150648 assuming that this is now handled on the KnownBits side.
However, the nsw handling is still inside ValueTracking, so we
still need to handle conflicts there. Restore the check closer to
where it is relevant.
Fixes https://github.com/llvm/llvm-project/issues/62908.
Since D141386 has changed the return value of !range from IUB to poison,
metadata !range shouldn't be preserved even if K dominates J.
If this patch was accepted, I plan to adjust metadata !nonnull as well.
BTW, I found that metadata !noundef is not handled in combineMetadata,
is this intentional?
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D142687
Currently, JT creates and updates local instances of BPI\BFI. As a result global ones have to be invalidated if JT made any changes.
In fact, JT doesn't use any information from BPI/BFI for the sake of the transformation itself. It only creates BPI/BFI to keep them up to date. But since it updates local copies (besides cases when it updates profile metadata) it just waste of time.
Current patch is a rework of D124439. D124439 makes one step and replaces local copies with global ones retrieved through AnalysisPassManager. Here we do one more step and don't create BPI/BFI if the only reason of creation is to keep BPI/BFI up to date. Overall logic is the following. If there is cached BPI/BFI then update it along the transformations. If there is no existing BPI/BFI, then create it only if it is required to update profile metadata.
Please note if BPI/BFI exists on exit from JT (either cached or created) it is always up to date and no reason to invalidate it.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D136827
This patch causes debug value intrinsics outside of cloned blocks in the
Jump Threading pass to correctly point towards any derived values. If it cannot,
it kills them.
Reviewed By: probinson, StephenTozer
Differential Revision: https://reviews.llvm.org/D140404
Currently, JT creates and updates local instances of BPI\BFI. As a result global ones have to be invalidated if JT made any changes.
In fact, JT doesn't use any information from BPI/BFI for the sake of the transformation itself. It only creates BPI/BFI to keep them up to date. But since it updates local copies (besides cases when it updates profile metadata) it just waste of time.
Current patch is a rework of D124439. D124439 makes one step and replaces local copies with global ones retrieved through AnalysisPassManager. Here we do one more step and don't create BPI/BFI if the only reason of creation is to keep BPI/BFI up to date. Overall logic is the following. If there is cached BPI/BFI then update it along the transformations. If there is no existing BPI/BFI, then create it only if it is required to update profile metadata.
Please note if BPI/BFI exists on exit from JT (either cached or created) it is always up to date and no reason to invalidate it.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D136827
Jump threading can replace select and unconditional branch with
conditional branch, but when doing so loses profile information.
This destructive transform can eventually lead to a performance
degradation due to folding of branches in
shouldFoldCondBranchesToCommonDestination as branch probabilities
are no longer known.
The first version was reverted due to assert caused by i32 overflow,
fixed in this version.
Patch by Roman Paukner!
Differential Revision: https://reviews.llvm.org/D138132
Reviewed By: mkazantsev
This reverts commit 957952dbf2f34ed552e8e1f8c35eed17eee2ea38.
Addition in the newly added code can overflow. As a result, the
constructor of `BranchProbability()` can trigger an assertion. See
the discussion on https://reviews.llvm.org/D138132 for more details.
This is a patch to fix duplicated dbg.values in the JumpThreading pass not
pointing towards their local value, and instead towards the variable in the
original block.
JumpThreadingPass::cloneInstructions is the changed function to target metadata
as well as normal cloned values.
Reviewed By: jmorse, StephenTozer
Differential Revision: https://reviews.llvm.org/D140006
Jump threading can replace select and unconditional branch with
conditional branch, but when doing so loses profile information.
This destructive transform can eventually lead to a performance
degradation due to folding of branches in
shouldFoldCondBranchesToCommonDestination as branch probabilities
are no longer known.
Patch by Roman Paukner!
Differential Revision: https://reviews.llvm.org/D138132
Reviewed By: mkazantsev
Currently, JT creates and updates local instances of BPI\BFI. As a result global ones have to be invalidated if JT made any changes.
In fact, JT doesn't use any information from BPI/BFI for the sake of the transformation itself. It only creates BPI/BFI to keep them up to date. But since it updates local copies (besides cases when it updates profile metadata) it just waste of time.
Current patch is a rework of D124439. D124439 makes one step and replaces local copies with global ones retrieved through AnalysisPassManager. Here we do one more step and don't create BPI/BFI if the only reason of creation is to keep BPI/BFI up to date. Overall logic is the following. If there is cached BPI/BFI then update it along the transformations. If there is no existing BPI/BFI, then create it only if it is required to update profile metadata.
Please note if BPI/BFI exists on exit from JT (either cached or created) it is always up to date and no reason to invalidate it.
Differential Revision: https://reviews.llvm.org/D136827
Second patch in the series to remove legacy PM and
associated -enable-new-pm=0 flag targets pass that
has not been ported to new PM - PruneEH.
Discussion about this can be found in D44415.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D134686
Use getPredicateOnEdge method if value is a non-local
compare-with-a-constant instruction, that can give more precise
results than getConstantOnEdge.
Differential Revision: https://reviews.llvm.org/D131956
Since D129288, callbr is allowed to have duplicate successors. This patch removes a limitation which prevents optimizations from actually producing such callbrs.
This is probably the riskiest of all the recent callbr changes, because code with incorrect assumptions might be lurking somewhere. I fixed the one case I encountered ahead of time in 8201e3ef5c.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D129997
Originally landed as
commit 08860f525a23 ("[Local] Allow creating callbr with duplicate successors")
Reverted in
commit 1cf6b93df168 ("Revert "[Local] Allow creating callbr with duplicate successors"")
This reverts commit 08860f525a2363ccd697ebb3ff59769e37b1be21.
Crashes during PPC64LE linux kernel builds as reported by @nathanchance.
https://reviews.llvm.org/D129997#3663632
Since D129288, callbr is allowed to have duplicate successors. This
patch removes a limitation which prevents optimizations from actually
producing such callbrs.
Differential Revision: https://reviews.llvm.org/D129997
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:
; Before:
%res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
to label %asm.fallthrough [label %foo]
; After:
%res = callbr i8* asm "", "=r,r,!i"(i8* %x)
to label %asm.fallthrough [label %foo]
The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:
* Allow unrolling/peeling/rotation of callbr, or any other
clone-based optimizations
(https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
(https://github.com/llvm/llvm-project/issues/45248)
This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.
Differential Revision: https://reviews.llvm.org/D129288
Since we can't change the destination of indirectbr, so when
encounter indirectbr as PredPredBB terminator, we should pass it.
Differential Revision: https://reviews.llvm.org/D129193
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.
The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.
I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.
Differential Revision: https://reviews.llvm.org/D129205
All callers pass true.
select-unfold-freeze.ll is now a subset of select.ll so delete it.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D126501
JumpThreading may convert selects into branch instructions,
in which case the condition needs to be frozen (as branch on
poison is immediate undefined behavior, unlike select on poison).
The necessary code for this is already in place, this just enables
the option.
Differential Revision: https://reviews.llvm.org/D125869
This code is valid for any icmp, so we can safely look through a
freeze when trying to find one.
A caveat here is that replaceFoldableUses() may not end up replacing
any uses in this case. It might make sense to use the freeze as the
context instruction (rather than the terminator) if there is a
freeze, to ensure that it always gets folded. This would require
some changes to how replaceFoldedUses() works though, as it
currently assumes that the value is valid at the end of the block.
This patch makes JumpThreading's ProcessImpliedCondition deal with frozen
conditions.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84941
The test was a part of the revision D81499 and should have been
added with commit 707836ed4ed.
Reviewed By: yamauchi, wenlei
Differential Revision: https://reviews.llvm.org/D81499
After e734e8286b4b521d829aaddb6d1cbbd264953625, it is possible to end up in
a situation where an `indirectbr` is fed by a cast, which is in turn fed by
an operation which only produces integers.
`indirectbr` expects a block address, however these operations can't produce
that.
There were several asserts in `computeValueKnownInPredecessorsImpl` which check
that we're not looking for a block address if we're walking through something
which can never produce one.
Since it's now possible to hit these asserts, this changes them into actual
checks which return false if `Preference` is not `WantInteger`.
This adds a testcase which verifies that we don't crash anymore in these
situations.
Differential Revision: https://reviews.llvm.org/D99814
In D115311, we're looking to modify clang to emit i constraints rather
than X constraints for callbr's indirect destinations. Prior to doing
so, update all of the existing tests in llvm/ to match.
Reviewed By: void, jyknight
Differential Revision: https://reviews.llvm.org/D115410
This solves a problem with non-deterministic output from opt due
to not performing dominator tree updates in a deterministic order.
The problem that was analysed indicated that JumpThreading was using
the DomTreeUpdater via llvm::MergeBasicBlockIntoOnlyPred. When
preparing the list of updates to send to DomTreeUpdater::applyUpdates
we iterated over a SmallPtrSet, which didn't give a well-defined
order of updates to perform.
The added domtree-updates.ll test case is an example that would
result in non-deterministic printouts of the domtree. Semantically
those domtree:s are equivalent, but it show the fact that when we
use the domtree iterator the order in which nodes are visited depend
on the order in which dominator tree updates are performed.
Since some passes (at least EarlyCSE) are iterating over nodes in the
dominator tree in a similar fashion as the domtree printer, then the
order in which transforms are applied by such passes, transitively,
also depend on the order in which dominator tree updates are
performed. And taking EarlyCSE as an example the end result could be
different depending on in which order the transforms are applied.
Reviewed By: nikic, kuhar
Differential Revision: https://reviews.llvm.org/D110292
It seems the crashes we saw wasn't caused by this (see comments on the review).
> This is basically D108837 but for jump threading. Free instructions
> should be ignored for the threading decision. JumpThreading already
> skips some free instructions (like pointer bitcasts), but does not
> skip various free intrinsics -- in fact, it currently gives them a
> fairly large cost of 2.
>
> Differential Revision: https://reviews.llvm.org/D110290
This reverts commit 4604695d7c20e72b551a1a5224f3de877cb41bd3.
It caused compiler crashes, see comment on the code review for repro.
> This is basically D108837 but for jump threading. Free instructions
> should be ignored for the threading decision. JumpThreading already
> skips some free instructions (like pointer bitcasts), but does not
> skip various free intrinsics -- in fact, it currently gives them a
> fairly large cost of 2.
>
> Differential Revision: https://reviews.llvm.org/D110290
This reverts commit 1e3c6fc7cb9d2ee6a5328881f95d6643afeadbff.
This is basically D108837 but for jump threading. Free instructions
should be ignored for the threading decision. JumpThreading already
skips some free instructions (like pointer bitcasts), but does not
skip various free intrinsics -- in fact, it currently gives them a
fairly large cost of 2.
Differential Revision: https://reviews.llvm.org/D110290
There's a potential change in dereferenceability attribute semantics in the nearish future. See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.
This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics. Note that for many of these cases, O3 would infer exactly these attributes on the test IR.
This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory. There's a couple other tests which need more one-off attention, they'll be handled in another change.
Fix a bug introduced by f6f6f6375d1a4bced8a6e79a78726ab32b8dd879.
Now for empty PHIs, instead of crashing on assert(hasVal()) in
Optional's internals, we'll return NoAlias, as we did before that patch.
Differential Revision: https://reviews.llvm.org/D103831
Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, no attributes is quivalent to
setting attribute to false.
This is a preliminary commit for https://reviews.llvm.org/D99080