In most cases, SETCC lowering will be able to simplify/commute the comparison by adjusting the constant.
TODO: We still need to adjust ExtraCost based on CostKind
Fixes#80122
Follow up to #79476 - that patch added a call to hoistLockstepIdenticalDPValues
which hoists identical DPValues in lockstep, matching dbg intrinsic hoisting
behaviour. The code deleted in this patch, which unconditionally hoists
DPValues, should have been deleted in that patch.
Update test with --try-experimental-debuginfo-iterators to check the behaviour.
Follow up to #79476 - that change introduces a call to
hoistLockstepIdenticalDPValues.
Hoist DPValues attached to each instruction being considered for hoisting if
they are identical in lock-step. This includes the final instructions which
are considered but not hoisted, because the corresponding dbg.values would
appear before those instruction and thus hoisted if identical.
Identical debug records hoisted:
llvm/test/Transforms/SimplifyCFG/hoist-dbgvalue.ll
Non-identical debug records not hoisted:
llvm/test/Transforms/SimplifyCFG/X86/pr39187-g.ll
Debug records attached to first not-hoisted instructions are hoisted:
llvm/test/Transforms/SimplifyCFG/hoist-dbgvalue-inlined.ll
Debugify is extremely useful as a testing and debugging tool, and a good
number of LLVM-IR transform tests use it. We need it to support "new"
non-instruction debug-info to get test coverage, but it's not important
enough to completely convert right now (and it'd be a large
undertaking). Thus: convert to/from dbg.value/DPValue mode on entry and
exit of the pass, which gives us the functionality without any further
work. The cost is compile-time, but again this is only happening during
tests.
Tested by: the large set of debugify tests enabled here. Note the
InstCombine test (cast-mul-select.ll) that hasn't been fully enabled:
this is because there's a debug-info sinking piece of code there that
hasn't been instrumented.
Remove the following warning.
```
WARNING: Change IR value name 'tmp4' or use --prefix-filecheck-ir-name to prevent possible conflict with scripted FileCheck name.
```
This reverts commit 20f0c68fd83a0147a8ec1722bd2e848180610288.
https://reviews.llvm.org/D153966#4464594 reports an optimization
regression in Rust.
Additionally this change has caused an unexpected 0.3% compile-time
regression.
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: barannikov88, kwk
Differential Revision: https://reviews.llvm.org/D150762
Don't merge invokes if this replaces constant operands with phis
in a place where this is not legal.
This also disallows converting operand bundles from constant to
non-constant, in line with the restriction we use in other
transforms.
Fixes https://github.com/llvm/llvm-project/issues/61265.
Differential Revision: https://reviews.llvm.org/D146723
SimplifyCFG currently drops !nontemporal metadata when sinking
common instructions. With this change, SimplifyCFG and similar
transforms will preserve !nontemporal metadata as long as it is
set on both original instructions.
Differential Revision: https://reviews.llvm.org/D144298
The bool is in the wrong place and might get implicitly converted from
the previous second argument - a pointer. Thinking about it more,
it's not really the best place for that functionality anyways,
only a single caller needs that.
This reverts commit 3c5b1f2d94d021005ce3769a4402d4a4ae843989.
We really can't recover that knowledge, and `nounwind` knowledge,
(and not just a lack of the unwind edge, aka `call` instead of `invoke`),
is e.g. part of the reasoning in e.g. `mayHaveSideEffects()`.
Note that this is call-site-specific knowledge,
just because some callsite had an `unreachable`
unwind edge, does not mean that all will.
When larger integer types are natively supported simplifycfg will use an
inline constant instead of a global variable for this transform. I noticed
this while trying to automatically infer the datalayout from the target
triple in opt if it is not explicitly specified. Since the x86_64
datalayout includes "n8:16:32:64", this test started failing.
While touching this file also change i128 to i64 in the first test since
this was intended behaviour in the original commit.
Reviewed By: spatel, fhahn
Differential Revision: https://reviews.llvm.org/D141055
One of these two changes is exposing (or causing) some more miscompiles.
A reproducer is in progress, so reverting until resolved.
This reverts commit 428f36401b1b695fd501ebfdc8773bed8ced8d4e.
This reverts commit 37b8f09a4b61bf9bf9d0b9017d790c8b82be2e17,
and returns commit 1bd0b82e508d049efdb07f4f8a342f35818df341.
The miscompile was in InstCombine, and it has been addressed.
This tries to approach the problem noted by @arsenm:
terrible codegen for `__builtin_fpclassify()`:
https://godbolt.org/z/388zqdE37
Just because the PHI in the common successor happens to have different
incoming values for these two blocks, doesn't mean we have to give up.
It's quite easy to deal with this, we just need to produce a select:
https://alive2.llvm.org/ce/z/000srb
Now, the cost model for this transform is rather overly strict,
so this will basically never fire. We tally all (over all preds)
the selects needed to the NumBonusInsts
Differential Revision: https://reviews.llvm.org/D139275
This tries to approach the problem noted by @arsenm:
terrible codegen for `__builtin_fpclassify()`:
https://godbolt.org/z/388zqdE37
Just because the PHI in the common successor happens to have different
incoming values for these two blocks, doesn't mean we have to give up.
It's quite easy to deal with this, we just need to produce a select:
https://alive2.llvm.org/ce/z/000srb
Now, the cost model for this transform is rather overly strict,
so this will basically never fire. We tally all (over all preds)
the selects needed to the NumBonusInsts
Differential Revision: https://reviews.llvm.org/D139275
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.
The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.
High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.
Differential Revision: https://reviews.llvm.org/D135780
This reverts commit 4e545bdb355a470d601e9bb7f7b2693c99e61a3e.
The newly added test is the third infinite combine loop caused by
this change. In this case, it's a combination of the branch to
common dest and jump threading folds that keeps peeling off loop
iterations.
The core problem here is that we ideally would not thread over
loop backedges, both because it is potentially non-profitable
(it may break canonical loop structure) and because it may result
in these kinds of loops. Unfortunately, due to the lack of a
dominator tree in SimplifyCFG, there is no good way to prevent
this. While we have LoopHeaders, this is an optional structure and
we don't do a good job of keeping it up to date. It would be fine
for a profitability check, but is not suitable for a correctness
check.
So for now I'm just giving up here, as I don't see a good way to
robustly prevent infinite combine loops.
Fixes https://github.com/llvm/llvm-project/issues/56203.
This enabled opaque pointers by default in LLVM. The effect of this
is twofold:
* If IR that contains *neither* explicit ptr nor %T* types is passed
to tools, we will now use opaque pointer mode, unless
-opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
It is possible to opt-out by calling setOpaquePointers(false) on
LLVMContext.
A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.
Differential Revision: https://reviews.llvm.org/D126689
TI->getBitWidth can be > 64 and in those cases the shift will be UB due
to the exponent being too large.
To fix this, cap the shift at 63. I think this should work out fine,
because TableSize is itself a 64 bit type and the maximum table size
must fit in the type. Also, if we would underestimate the size here, at
most we get an extra ZExt.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D124608
SimplifyCFG implements basic jump threading, if a branch is
performed on a phi node with constant operands. However,
InstCombine canonicalizes such phis to the condition value of a
previous branch, if possible. SimplifyCFG does support this as
well, but only in the very limited case where the same condition
is used in a direct predecessor -- notably, this does not include
the common diamond pattern (i.e. two consecutive if/elses on the
same condition).
This patch extends the code to look back a limited number of
blocks to find a branch on the same value, rather than only
looking at the direct predecessor.
Fixes https://github.com/llvm/llvm-project/issues/54980.
Differential Revision: https://reviews.llvm.org/D124159
According to LangRef, an access scope must have zero operands and
be distinct. The access group may either be a single access scope
or a list of access scopes.
LoopInfo may assert if this is not the case.
As long as *all* the invokes in the set are indirect,
we can merge them, but don't merge direct invokes into the set,
even though it would be legal to do.
If the original invokes had uses, the uses must have been in PHI's,
but that immediately results in the incoming values being incompatible.
But we'll replace uses of the original invokes with the use of the
merged invoke, so as long as the incoming values become compatible
after that, we can merge.
Even if the invokes have normal destination, iff it's the same block,
we can merge them. For now, require that there are no PHI nodes,
and the returned values of invokes aren't used.
While nowadays SimplifyCFG knows how to hoist code from then-else blocks,
sink code from unconditional predecessors, and even promote the latter
by tail-merging `ret`/`resume` function terminators, that isn't everything.
While i (& others) have been trying to deal with merging/sinking `unreachable`,
apparently perhaps the more impactful remaining problem is merging the `throw`
calls.
If we start at the `landingpad`, all the predecessors are unwind edges of `invoke`s,
and in some cases some of the `invoke`s are mergeable.
```
/// This is a weird mix of hoisting and sinking. Visually, it goes from:
/// [...] [...]
/// | |
/// [invoke0] [invoke1]
/// / \ / \
/// [cont0] [landingpad] [cont1]
/// to:
/// [...] [...]
/// \ /
/// [invoke]
/// / \
/// [cont] [landingpad]
```
This simplifies the IR/CFG, at the cost of debug info and extra PHI nodes.
Note that we don't require for *all* the `invokes` of the `landingpad`
to be mergeable, they can form more than a single set, we gracefully handle that.
For now, i completely disallowed normal destination, PHI nodes and indirect invokes
but that can be supported.
Out of all the CTMark projects, only 7zip is C++, so there isn't much impact:
https://llvm-compile-time-tracker.com/compare.php?from=ba8eb31bd9542828f6424e15a3014f80f14522c8&to=722fc871c84f14157d45c2159bc9c8c7e2825785&stat=size-total
... but there it currently causes size-total decrease.
Differential Revision: https://reviews.llvm.org/D117805