This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
When speculating an instruction in `InstCombinerImpl::FoldOpIntoSelect`,
the call may result in undefined behavior. This patch drops all
UB-implying attrs/metadata to fix this.
Fixes#85536.
This patch adds the support for and/or in `getFreelyInvertedImpl` using
DeMorgan's Law:
```
(~(A | B)) -> (~A & ~B)
(~(A & B)) -> (~A | ~B)
```
Alive2: https://alive2.llvm.org/ce/z/Uig8-j
This patch changes DPValue::filter to be a non-member method
filterDbgVars. There are two reasons for this: firstly, the name of
DPValue is about to change to DbgVariableRecord, which will result in
every `for` loop that uses DPValue::filter to require a line break. This
is a small thing, but it makes the rename patch more difficult to
review, and is just generally more awkward for what is a fairly common
loop. Secondly, the intent is to later break up the DPValue class into
subclasses, at which point it would be better to have a non-member
function that allows template arguments for the cases we want to filter
with greater specificity.
This patch canonicalizes `extractvalue (select Cond, TV, FV)` into
`select Cond, (extractvalue TV), (extractvalue FV)`. The latter form may
enable more optimizations.
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate on DbgRecords but refer
to DbgValues or DPValues in their names to refer to DbgRecords instead;
all such functions are defined in one of `BasicBlock.h`,
`Instruction.h`, and `DebugProgramInstruction.h`.
This patch explicitly does not change the names of any comments or
variables, except for where they use the exact name of one of the
renamed functions. The reason for this is reviewability; this patch can
be trivially examined to determine that the only changes are direct
string substitutions and any results from clang-format responding to the
changed line lengths. Future patches will cover renaming variables and
comments, and then renaming the classes themselves.
Instead of taking the sign of the cast operation as the required since
for the transform, only force a sign if an operation is maybe
negative.
This gives us more flexability when checking if the floats are safely
converable to integers.
Closes#84389
The full fold is one of the following:
1) `(fp_binop ({s|u}itofp x), ({s|u}itofp y))`
-> `({s|u}itofp (int_binop x, y))`
2) `(fp_binop ({s|u}itofp x), FpC)`
-> `({s|u}itofp (int_binop x, (fpto{s|u}i FpC)))`
And support the following binops:
`fmul` -> `mul`
`fadd` -> `add`
`fsub` -> `sub`
Proofs: https://alive2.llvm.org/ce/z/zuacA8
The proofs timeout, so they must be reproduced locally.
Closes#82555
Instructions in unreachable basic blocks are removed, but terminators
are not. In this case, even instructions that are only referenced by
a terminator, such as a return instruction, cannot be processed
properly.
This patch changes the operand of a return instruction in an
unreachable basic block to poison if it refers to the instruction,
allowing the instruction to be properly processed.
Fixes#65107.
Patch 1 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds a new base class
-> 1. Add DbgRecord base class for DPValue and the not-yet-added
DPLabel class.
2. Add the DPLabel class.
3. Enable dbg.label conversion and add support to passes.
Patches 1 and 2 are NFC.
In the near future we also will rename DPValue to DbgVariableRecord and
DPLabel to DbgLabelRecord, at which point we'll overhaul the function
names too. The name DPLabel keeps things consistent for now.
This reverts commit ef388334ee5a3584255b9ef5b3fefdb244fa3fd7.
The referenced issue violates the spec for finite-only math only by
using a return value for a constant infinity. If the interpretation
is results and arguments cannot violate nofpclass, then any
std::numeric_limits<T>::infinity() result is invalid under
-ffinite-math-only. Without this interpretation the utility of
nofpclass is slashed.
This patch does the following folds if only the overflow result is used:
```
extractvalue (umul.with.overflow iN X, X), 1 -> icmp ugt X, 2^(N/2)-1
```
Alive2: https://alive2.llvm.org/ce/z/a8yPC6
In instcombine, when we sink an instruction into a successor block, we try
to clone and salvage all the variable assignments that use that Value. This
is a behaviour that's (IMO) flawed, but there are important use cases where
we want to avoid regressions, thus we're implementing this for the
non-instruction debug-info representation.
This patch refactors the dbg.value sinking code into it's own function, and
installs a parallel implementation for DPValues, the non-instruction
debug-info container. This is mostly identical to the dbg.value
implementation, except that we don't have an easy-to-access ordering
between DPValues, and have to jump through extra hoops to establish one in
the (rare) cases where that ordering is required.
The test added represents a common use-case in LLVM where these behaviours
are important: a loop has been completely optimised away, leaving several
dbg.values in a row referring to an instruction that's going to sink. The
dbg.values should sink in both dbg.value and RemoveDIs mode, and
additionally only the last assignment should sink.
This patch canonicalizes getelementptr instructions with constant
indices to use the `i8` source element type. This makes it easier for
optimizations to recognize that two GEPs are identical, because they
don't need to see past many different ways to express the same offset.
This is a first step towards
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699.
This is limited to constant GEPs only for now, as they have a clear
canonical form, while we're not yet sure how exactly to deal with
variable indices.
The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives
two representative examples of the kind of optimization improvement we
expect from this change. In the first test SimplifyCFG can now realize
that all switch branches are actually the same. In the second test it
can convert it into simple arithmetic. These are representative of
common optimization failures we see in Rust.
Fixes https://github.com/llvm/llvm-project/issues/69841.
With the soon-to-land new-debug-info storage model, it's going to be
important to use iterators for instruction insertion rather than
instruction pointers. This (single line in instcombine) is the last place
that trips up our internal testing for debug-info, where we insert a PHI
and it should be using an iterator.
`(ctpop (not x))` <-> `(sub nuw nsw BitWidth(x), (ctpop x))`. The
`sub` expression can sometimes be constant folded depending on the use
case of `(ctpop (not x))`.
This patch adds fold for the following cases:
`(add/sub/disjoint_or C, (ctpop (not x))`
-> `(add/sub/disjoint_or C', (ctpop x))`
`(cmp pred C, (ctpop (not x))`
-> `(cmp swapped_pred C', (ctpop x))`
Where `C'` depends on how we constant fold `C` with `BitWidth(x)` for
the given opcode.
Proofs: https://alive2.llvm.org/ce/z/qUgfF3Closes#77859
This patch folds `switch(zext/sext(X))` into `switch(X)`.
The original motivation of this patch is to optimize a pattern found in
cvc5. For example:
```
%bf.load.i = load i16, ptr %d_kind.i, align 8
%bf.clear.i = and i16 %bf.load.i, 1023
%bf.cast.i = zext nneg i16 %bf.clear.i to i32
switch i32 %bf.cast.i, label %if.else [
i32 335, label %if.then
i32 303, label %if.then
]
if.then: ; preds = %entry, %entry
%d_children.i.i = getelementptr inbounds %"class.cvc5::internal::expr::NodeValue", ptr %0, i64 0, i32 3
%cmp.i.i.i.i.i = icmp eq i16 %bf.clear.i, 1023
%cond.i.i.i.i.i = select i1 %cmp.i.i.i.i.i, i32 -1, i32 %bf.cast.i
```
`%cmp.i.i.i.i.i` always evaluates to false because `%bf.clear.i` can
only be 335 or 303.
Folding `switch i32 %bf.cast.i` to `switch i16 %bf.clear.i` will help
`CVP` to handle this case.
See also
https://github.com/llvm/llvm-project/pull/76928#issuecomment-1877055722.
Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=7954c57124b495fbdc73674d71f2e366e4afe522&to=502b13ed34e561d995ae1f724cf06d20008bd86f&stat=instructions:u
|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|+0.03%|+0.06%|+0.07%|+0.00%|-0.02%|-0.03%|+0.02%|
This patch cleans up the duplicate code for folding commutative binops
over `select/phi/minmax`.
Related commits:
+ select support:
88cc35b27e
+ phi support:
8674a023bc
+ minmax support:
624973806c
7d7001b2cb
canonicalizes `(gep i8, X, (ptrtoint Y) - (ptrtoint X))` into `bitcast
Y` iff `X` and `Y` have the same underlying object.
I find that the result of this pattern is usually used as an operand of
an icmp in some real-world applications. I think we can do the
canonicalization if the result is only used by icmps/ptrtoints.
Alive2: https://alive2.llvm.org/ce/z/j4-HJZ
This patch canonicalize `gep T* X, V / sizeof(T)` to `gep i8* X, V`.
Alive2: https://alive2.llvm.org/ce/z/7XGjiB
As this pattern has been handled by the backends, the motivation of this
patch is to reduce the ref count of sdiv, which will enable more
optimizations.
In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
This adds support for using dominating conditions in computeKnownBits()
when called from InstCombine. The implementation uses a
DomConditionCache, which stores which branches may provide information
that is relevant for a given value.
DomConditionCache is similar to AssumptionCache, but does not try to do
any kind of automatic tracking. Relevant branches have to be explicitly
registered and invalidated values explicitly removed. The necessary
tracking is done inside InstCombine.
The reason why this doesn't just do exactly the same thing as
AssumptionCache is that a lot more transforms touch branches and branch
conditions than assumptions. AssumptionCache is an immutable analysis
and mostly gets away with this because only a handful of places have to
register additional assumptions (mostly as a result of cloning). This is
very much not the case for branches.
This change regresses compile-time by about ~0.2%. It also improves
stage2-O0-g builds by about ~0.2%, which indicates that this change results
in additional optimizations inside clang itself.
Fixes https://github.com/llvm/llvm-project/issues/74242.
My initial patch contained a typo, resulting in the wrong value
being checked for non-negativeness.
-----
If the lshr operand is non-negative, we can treat it the same
way as an ashr. Ideally we would represent this as "lshr nneg",
but for now just perform the necessary ValueTracking query.
Proof: https://alive2.llvm.org/ce/z/Ahg4ri
If the lshr operand is non-negative, we can treat it the same
way as an ashr. Ideally we would represent this as "lshr nneg",
but for now just perform the necessary ValueTracking query.
Proof: https://alive2.llvm.org/ce/z/Ahg4ri
This patch updates the last few places in LLVM using findDbgValues that
don't also collect and handle DPValue objects. This largely involves
instcombine and mem2reg changes, and are largely mechanical, calling
existing utilities on collections of DPValues instead of just
DbgValuesInsts.
A variety of tests have had RemoveDIs RUN lines added to them to cover
these behaviours. We have some technical debt of the instcombine sinking
code for DPValues not being implemented yet, so I've left FIXME stubs
indicating that we intend to cover tests with RemoveDIs but haven't yet.
Part of the "RemoveDIs" project to remove debug intrinsics requires
passing block-positions around in iterators rather than as instruction
pointers, allowing some debug-info to reside in BasicBlock::iterator.
This means getInsertionPointAfterDef has to return an iterator, and as
it can return no-instruction that means returning an optional iterator.
This patch changes the signature for getInsertionPtAfterDef and then
patches up the various places that use it to handle the different type.
This would overall be an NFC patch, however in
InstCombinerImpl::freezeOtherUses I've started skipping any debug
intrinsics at the returned insert-position. This should not have any
_meaningful_ effect on the compiler output: at worst it means variable
assignments that are skipped will now cover the freeze instruction and
anything inserted before it, which should be inconsequential.
Sadly: this makes the function signature ugly. This is probably the
ugliest piece of fallout for the "RemoveDIs" work, but it serves the
overall purpose of improving compile times and not allowing `-g` to
affect compiler output, so should be worthwhile in the end.
With the current logic of `if(isFreeToInvert(Op)) return Not(Op)` its
fairly easy to either 1) cause regressions or 2) infinite loops
if the folds we have for `Not(Op)` ever de-sync with the cases we
know are freely invertible.
This patch adds `getFreeInverted` which is able to build the free
inverted op along with check for free inversion to alleviate this
problem.
It might seem obvious, but it's not a good idea to convert a
debug-intrinsic instruction into an UnreachableInst, as this means
things operate differently with and without the -g option. However this
can happen due to the "mutate the next instruction" API calls we make.
With RemoveDIs eliminating debug intrinsics, this behaviour is at risk
of changing, hence this patch ensures we only ever mutate the next _non_
debuginfo instruction into an Unreachable.
The tests instrumented with the --try... flag all exercise this, I've
added some metadata to a SCCP test to ensure it's exercised.
This patch re-implements a variety of debug-info maintenence functions
to use DPValues instead of DbgValueInst's: supporting the "new"
non-intrinsic representation of debug-info. As per [0], we need to have
parallel implementations of various utilities for a time, and these are
the most fundamental utilities used throughout the compiler.
I've added --try-experimental-debuginfo-iterators to a variety of RUN
lines: this is a flag that turns on "new debug-info" if it's built into
LLVM, and not otherwise. This should ensure that we have the same
behaviour for the same IR inputs, but using a different internal
representation. For the most part these changes affect SROA/Mem2Reg
promotion of dbg.declares into dbg.value intrinsics (now DPValues),
we're leaving dbg.declares as instructions until later in the day.
There's also some salvaging changes made.
I believe the tests that I've added cover almost all the code being
updated here. The only thing I'm not confident about is SimplifyCFG,
which calls rewriteDebugUsers down a variety of code paths. Those
changes can't immediately get full coverage as an additional patch is
needed that updates handling of Unreachable instructions, will upload
that shortly.
[0]
https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939/9